Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab


[Condor-users] dagman capabilities

Date: Mon, 5 Oct 2009 14:57:47 +0100
From: michael bane <michael.bane@xxxxxxxxxxxxxxxx>
Subject: [Condor-users] dagman capabilities

I was looking at 'job recovery: the rescue DAG' in the online Condor manual (2.10.6) but couldn't decide if DAGMan was capable of handling the situation of submitting N jobs (embarrassingly parallel, say) to the Vanilla universe (since we cannot link to the checkpointing for Standard) and then resubmitting those which are killed (e.g. due to pre-emption by work on the given nodes). It talks about handling jobs that do not finish due to a node failure, but before investing time/effort I wished to enquire whether this included such pre-emption as outlined above.
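
To make the scenario concrete, here is a minimal sketch of a DAG input file for N independent nodes, generated from Python. The names job.sub, sweep.dag and the index macro are hypothetical placeholders, not taken from any real setup. JOB, VARS and RETRY are DAGMan keywords; whether RETRY (or the rescue DAG) is what actually covers pre-emption of Vanilla jobs is exactly the open question, so the retry count is only illustrative.

```python
def build_dag(n_jobs, submit_file="job.sub", retries=3):
    """Return DAG input text for n_jobs independent nodes.

    submit_file is a hypothetical Vanilla-universe submit description
    shared by every node; no PARENT/CHILD lines are written because
    the jobs are embarrassingly parallel.
    """
    lines = []
    for i in range(n_jobs):
        node = f"job{i}"
        # One DAG node per job, all pointing at the same submit file.
        lines.append(f"JOB {node} {submit_file}")
        # Pass a per-node macro (assumed name: index) to the submit file.
        lines.append(f'VARS {node} index="{i}"')
        # Ask DAGMan to re-run the node up to `retries` times if it fails.
        lines.append(f"RETRY {node} {retries}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write a small example DAG; it would be submitted with condor_submit_dag.
    with open("sweep.dag", "w") as handle:
        handle.write(build_dag(4))
```

The resulting file would be handed to condor_submit_dag; the submit file itself (universe = vanilla, executable, arguments using $(index)) is assumed to exist separately.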


Thanks, M

Follow-Ups:
- Re: [Condor-users] dagman capabilities
  - From: Todd Tannenbaum
