[Iplant-api-dev] Problems submitting fAPI jobs

Ghiban, Cornel ghiban at cshl.edu
Wed Aug 20 08:13:11 MST 2014


Yes, I can confirm this. All jobs submitted since about 3 PM yesterday
are in the PENDING state.

Thanks,
Cornel

On Wed, 2014-08-20 at 09:40 -0500, Jennewein, Douglas M wrote:
> Thanks, Rion.  The job staging failures seem to have been intermittent
> and stopped on August 13, but we’ve started noticing a different
> problem.  
> 
> Jobs submitted by both djennewe and bioextract (including scheduled
> retries) have been stuck in the PENDING state since about 2:00 PM
> yesterday.
> 
> From: Rion Dooley [mailto:dooley at tacc.utexas.edu] 
> Sent: Tuesday, August 12, 2014 3:31 PM
> To: Jennewein, Douglas M
> Cc: iPlant API Developers Mailing List
> Subject: Re: [Iplant-api-dev] Problems submitting fAPI jobs
> 
> 
>  
> 
> Hi Doug, 
> 
>  
> 
> 
> It’s really important that, when building your apps, you have a
> mechanism for retrying failed jobs. Often, the underlying
> systems and networks will do weird things. That’s nothing Foundation
> can predict or avoid. While it does retry several times on its own,
> if a head node goes down, gets overloaded, or the network just does
> weird things while it’s retrying, the job will fail. It’s up to you,
> as a developer, to retry jobs that inexplicably fail as part of your
> normal submission workflow.
> 
> 
>  
> 
> 
> On another note, I’ve noticed that a lot of applications are
> submitting “test” jobs several times an hour. Not only is this a
> horrible use of public resources, it’s also ineffective because they
> don’t actually test anything other than whether the API is responsive,
> which over the past few months it has been well over 99% of the time.
> It would be better to ping the system directly (not recommended), or
> use Agave’s monitoring service to make this check for you. If you are
> worried about the actual API status, then you can subscribe to
> notifications from the Agave Status page, hosted by status.io at
> http://status.agaveapi.co. The status page is guaranteed 100% uptime
> and accurately reflects current system statuses.
> 
> 
> --
> Rion
> 
> 
> 
> 
> 
>  
> 
> On Aug 12, 2014, at 3:15 PM, Jennewein, Douglas M
> <Doug.Jennewein at usd.edu> wrote:
> 
> 
> 
> 
>         I'm noticing problems submitting jobs to fAPI as djennewe (job
>         id 57821).  Job submission seems to fail with a Java I/O
>         exception message like "Failed to submit job 57821:
>         java.io.IOException: Error completing remote execution after:
>         "  Jobs had been working normally earlier today.
>         
>         
>          
>         
>         
>         Doug
>         
>         
>          
>         
>         
>         _______________________________________________
>         Iplant-api-dev Mailing
>         List: Iplant-api-dev at iplantcollaborative.org
>         List Info and
>         Archives: http://mail.iplantcollaborative.org/mailman/listinfo/iplant-api-dev  
>         One-click
>         Unsubscribe: http://mail.iplantcollaborative.org/mailman/options/iplant-api-dev/dooley%40tacc.utexas.edu?unsub=1&unsubconfirm=1
>         
>         
>  
> 
> 
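The retry advice in Rion's quoted message could be sketched roughly as
follows. This is only an illustration, not Foundation or Agave client
code: submit_with_retry, its parameters, and the choice of IOError as
the "transient failure" signal are all assumptions made for the
example. The submit argument stands in for whatever function your app
already uses to post a job.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def submit_with_retry(
    submit: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 30.0,
    max_delay: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call submit() until it succeeds or max_attempts is exhausted.

    Hypothetical helper: `submit` is assumed to raise IOError (OSError)
    when the submission fails for a transient reason.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except IOError:
            # Out of attempts: let the caller see the last error.
            if attempt == max_attempts:
                raise
            # Exponential backoff, capped so waits stay bounded.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            sleep(delay)

    raise AssertionError("unreachable")
```

With the defaults above, the waits between attempts would be 30, 60,
120, and 240 seconds. Spacing retries out like this gives an
overloaded head node time to recover and avoids adding load with
rapid resubmissions.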



