airavata-issues mailing list archives

From "Eroma (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRAVATA-2945) When output file transfers fail, manual files transfer process to transfer files to the gateway
Date Tue, 13 Nov 2018 21:53:00 GMT
Eroma created AIRAVATA-2945:
-------------------------------

             Summary: When output file transfers fail, manual files transfer process to transfer files to the gateway
                 Key: AIRAVATA-2945
                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2945
             Project: Airavata
          Issue Type: Task
          Components: helix implementation
    Affects Versions: 0.18
         Environment: https://staging.ultrascan.scigap.org
            Reporter: Eroma
            Assignee: Dimuthu Upeksha
             Fix For: 0.18


File transfers can fail when the storage server or HPC cluster is unresponsive or
unavailable [1], or when the destination is not writable [2].

When such failures occur during output staging, the experiment fails even though the job
may have completed successfully and its output files are still available on the remote
cluster for transfer. So that the SUs (service units) already consumed are not wasted,
Airavata should have a manual process to transfer these files to the gateway. An automated
process is possible, but deciding when to run it, and for which experiments, could lead to
unnecessary issues. Hence a process with human intervention seems to be the more practical
and less error-prone solution here.
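
As a rough illustration only, the manual process could start as a small helper that prints
the copy commands for an operator to review and run by hand. This is a sketch under
assumptions, not the Airavata implementation: the function name, the login strings, the
storage host and the local staging directory are all hypothetical, and it assumes plain scp
access to both the cluster and the storage server. It follows the two hops visible in the
traces below (remote cluster to a local temporary path, then local path to gateway storage).

```python
import posixpath
import shlex


def build_manual_transfer_commands(cluster_login, remote_file,
                                   storage_login, storage_dir,
                                   local_dir="/tmp/manual-staging"):
    """Return the two scp commands (pull, then push) as argument lists.

    All parameter names and the default local_dir are hypothetical; nothing
    here is executed, the commands are only built for an operator to review.
    """
    file_name = posixpath.basename(remote_file)
    local_file = posixpath.join(local_dir, file_name)
    storage_file = posixpath.join(storage_dir, file_name)

    # Hop 1: copy the job output from the HPC cluster to a local staging path.
    pull = ["scp", f"{cluster_login}:{remote_file}", local_file]
    # Hop 2: copy the staged file into the experiment directory on gateway storage.
    push = ["scp", local_file, f"{storage_login}:{storage_file}"]
    return [pull, push]


if __name__ == "__main__":
    # Placeholder values; a real run would take them from the failed experiment.
    commands = build_manual_transfer_commands(
        cluster_login="gateway-user@comet.sdsc.edu",
        remote_file="/path/on/cluster/output/analysis-results.tar",
        storage_login="gateway-user@storage.example.org",
        storage_dir="/path/on/storage/experiment-dir",
    )
    for command in commands:
        # shlex.join quotes each argument so the line can be pasted into a shell.
        print(shlex.join(command))
```

Printing the commands instead of running them keeps the decision of when to transfer, and
for which experiments, with the person doing the recovery.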

 

[1]

org.apache.airavata.helix.impl.task.TaskOnFailException: Error Code : 62b6a5d4-3567-43b4-9d49-7bff20e3414d, Task TASK_63e9db6f-def0-4cd6-b973-e90badd4fb01 failed due to Error while checking the file /oasis/scratch/comet/us3/temp_project/airavata-workingdirs/PROCESS_0ef47841-d2f9-4120-aab4-7c711d216f99/output/analysis-results.tar existence, java.net.UnknownHostException: comet.sdsc.edu
    at org.apache.airavata.helix.impl.task.AiravataTask.onFail(AiravataTask.java:130)
    at org.apache.airavata.helix.impl.task.staging.OutputDataStagingTask.onRun(OutputDataStagingTask.java:187)
    at org.apache.airavata.helix.impl.task.AiravataTask.onRun(AiravataTask.java:349)
    at org.apache.airavata.helix.core.AbstractTask.run(AbstractTask.java:92)
    at org.apache.helix.task.TaskRunner.run(TaskRunner.java:71)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.airavata.agents.api.AgentException: java.net.UnknownHostException: comet.sdsc.edu
    at org.apache.airavata.helix.adaptor.SSHJAgentAdaptor.doesFileExist(SSHJAgentAdaptor.java:201)
    at org.apache.airavata.helix.impl.task.staging.DataStagingTask.transferFileToStorage(DataStagingTask.java:141)
    at org.apache.airavata.helix.impl.task.staging.OutputDataStagingTask.onRun(OutputDataStagingTask.java:172)
    ... 10 more
Caused by: java.net.UnknownHostException: comet.sdsc.edu
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at net.schmizz.sshj.SocketClient.connect(SocketClient.java:126)
    at net.schmizz.sshj.SocketClient.connect(SocketClient.java:117)
    at org.apache.airavata.helix.adaptor.PoolingSSHJClient.createNewSSHClient(PoolingSSHJClient.java:248)
    at org.apache.airavata.helix.adaptor.PoolingSSHJClient.leaseSSHClient(PoolingSSHJClient.java:104)
    at org.apache.airavata.helix.adaptor.PoolingSSHJClient.newSFTPClientWrapper(PoolingSSHJClient.java:291)
    at org.apache.airavata.helix.adaptor.SSHJAgentAdaptor.doesFileExist(SSHJAgentAdaptor.java:198)
    ... 12 more

 

[2]

org.apache.airavata.helix.impl.task.TaskOnFailException: Error Code : 5e402d08-30e2-428b-afce-1ca84ef61036, Task TASK_0dbeea65-22be-4ccf-a242-8f0fc10c2b1b failed due to Failed uploading the output file to /srv/www/htdocs/uslims3/uslims3_data/9f851eae-c20a-fde4-8d47-c1322a6b910c/analysis-results.tar from local path /tmp/PROCESS_6c944df6-f1d1-485b-9343-37373cd24f4a/temp_inputs/analysis-results.tar, net.schmizz.sshj.xfer.scp.SCPRemoteException: Remote SCP command had error: scp: /srv/www/htdocs/uslims3/uslims3_data/9f851eae-c20a-fde4-8d47-c1322a6b910c/analysis-results.tar: Read-only file system
    at org.apache.airavata.helix.impl.task.AiravataTask.onFail(AiravataTask.java:130)
    at org.apache.airavata.helix.impl.task.staging.OutputDataStagingTask.onRun(OutputDataStagingTask.java:187)
    at org.apache.airavata.helix.impl.task.AiravataTask.onRun(AiravataTask.java:349)
    at org.apache.airavata.helix.core.AbstractTask.run(AbstractTask.java:92)
    at org.apache.helix.task.TaskRunner.run(TaskRunner.java:71)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.airavata.agents.api.AgentException: net.schmizz.sshj.xfer.scp.SCPRemoteException: Remote SCP command had error: scp: /srv/www/htdocs/uslims3/uslims3_data/9f851eae-c20a-fde4-8d47-c1322a6b910c/analysis-results.tar: Read-only file system
    at org.apache.airavata.helix.adaptor.SSHJAgentAdaptor.copyFileTo(SSHJAgentAdaptor.java:173)
    at org.apache.airavata.helix.adaptor.SSHJStorageAdaptor.uploadFile(SSHJStorageAdaptor.java:61)
    at org.apache.airavata.helix.impl.task.staging.DataStagingTask.transferFileToStorage(DataStagingTask.java:175)
    at org.apache.airavata.helix.impl.task.staging.OutputDataStagingTask.onRun(OutputDataStagingTask.java:172)
    ... 10 more
Caused by: net.schmizz.sshj.xfer.scp.SCPRemoteException: Remote SCP command had error: scp: /srv/www/htdocs/uslims3/uslims3_data/9f851eae-c20a-fde4-8d47-c1322a6b910c/analysis-results.tar: Read-only file system
    at net.schmizz.sshj.xfer.scp.SCPEngine.check(SCPEngine.java:73)
    at net.schmizz.sshj.xfer.scp.SCPEngine.sendMessage(SCPEngine.java:133)
    at net.schmizz.sshj.xfer.scp.SCPUploadClient.sendFile(SCPUploadClient.java:97)
    at net.schmizz.sshj.xfer.scp.SCPUploadClient.process(SCPUploadClient.java:78)
    at net.schmizz.sshj.xfer.scp.SCPUploadClient.startCopy(SCPUploadClient.java:70)
    at net.schmizz.sshj.xfer.scp.SCPUploadClient.copy(SCPUploadClient.java:50)
    at net.schmizz.sshj.xfer.scp.SCPUploadClient.copy(SCPUploadClient.java:43)
    at net.schmizz.sshj.xfer.scp.SCPFileTransfer.upload(SCPFileTransfer.java:55)
    at org.apache.airavata.helix.adaptor.wrapper.SCPFileTransferWrapper.upload(SCPFileTransferWrapper.java:44)
    at org.apache.airavata.helix.adaptor.SSHJAgentAdaptor.copyFileTo(SSHJAgentAdaptor.java:171)
    ... 13 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
