[Iplant-api-dev] job failure
Rion Dooley
dooley at tacc.utexas.edu
Thu Jun 18 16:07:24 MST 2015
The Data Store was offline for about 10 of the last 24 hours. No data could be moved, and any transfers in flight died. That does not speak to the compute systems or whether you saw data loss there, but Agave was up and humming the entire time, managing jobs, data, apps, metadata, etc. across the other available systems.
If you have some job IDs, or at least a small timeframe to check for the jobs in question, I’m happy to double check the physical job directories for assets that may not have had the opportunity to archive. If they’re present, I’m happy to tell Agave to rearchive them on your (or another user's) behalf.
System outages are a pain for everyone involved. Unfortunately, they rarely happen intentionally. The thing you have on your side is that Agave remembers everything that happened with your job along the way, so you have a mechanism for picking up the pieces and recovering if you so choose. That being said, sometimes it’s easier to resubmit than recover. Whatever I can do to help, just let me know.
—
Rion
On Jun 18, 2015, at 5:01 PM, Barthelson, Roger A - (rogerab) <rogerab at email.arizona.edu> wrote:
A user has been trying to use Newbler for a while now, since last week, and has had multiple failures that indicate that the problem is in the system. He tried today and got nothing returned, which is the same thing that happened to me with SOAPdenovo, which was submitted last night and I think ran early this morning. Are there still problems with the Data Store affecting Agave jobs?
Roger
--
Roger Barthelson Ph.D.
Scientific Analyst
iPlant Collaborative
BIO5 Institute, University of Arizona
Phone: 520-977-5249
Email: rogerab at email.arizona.edu
Web: http://www.iplantcollaborative.org/