[Condor-users] Vanilla - jobs disappear without completing.
- Date: Mon, 15 Jun 2009 16:05:40 +0100
- From: "Rob Stevenson" <r.stevenson@xxxxxxxxxxxxxxxxxxx>
- Subject: [Condor-users] Vanilla - jobs disappear without completing.
Dear all,
Do you know if there is a file size limit for Condor runs? If so, is there a line that can be added to submit files to increase it (something to do with "ImageSize")? Or perhaps I've missed the mark completely?
Our pool is set up not to allow preemption, as we run in the vanilla universe and cannot compile with Condor-specific libraries.
Recently I've seen a few occurrences of a problem whereby some running jobs seem to be kicked off their current processor and then either disappear from the queue, stay permanently in the "H" state, or are still reported as running even though all but one or two files have been deleted from the /execute/dir[xxxx] directory.
The jobs haven't completed successfully, output isn't copied back to its original location, and there doesn't appear to be any log output to give me a clue.
The only thing the failures seem to have in common at the moment is that the jobs had all been running for more than 4 or 5 days and were all taking up close to, or in excess of, 2GB of space in the execute directory: ./execute/dir[xxxx].
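The 2GB observation above could be checked on an execute node with a short script. What follows is only a minimal sketch under stated assumptions: Python 3 is available on the node, job sandboxes appear as subdirectories whose names start with "dir" under the execute directory (as in the path above), and the function names dir_size and large_job_dirs are hypothetical. It measures disk usage only and does not query Condor itself.

```python
import os

# Threshold taken from the ~2GB observation in the message (an assumption,
# not a documented Condor limit).
TWO_GIB = 2 * 1024 ** 3


def dir_size(path):
    """Return the total size in bytes of all files under path (symlinks not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.lstat(full).st_size
            except OSError:
                # The file vanished while scanning (for example, job cleanup).
                continue
    return total


def large_job_dirs(execute_dir, threshold=TWO_GIB):
    """Return (name, bytes) for each dir* subdirectory at or above threshold."""
    results = []
    for entry in sorted(os.listdir(execute_dir)):
        full = os.path.join(execute_dir, entry)
        if entry.startswith("dir") and os.path.isdir(full):
            size = dir_size(full)
            if size >= threshold:
                results.append((entry, size))
    return results
```

Calling large_job_dirs("./execute") periodically (from cron, say) and recording the result would show whether a job's sandbox crossed the 2GB mark shortly before it vanished.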
Does anyone have any ideas? Or any advice on how to increase logging so that I can catch whatever is happening?
Many thanks to everyone for reading,
Rob Stevenson - Systems Administrator
IS Services
HR Wallingford Ltd
Howbery Park, Wallingford, Oxfordshire OX10 8BA, United Kingdom
e: r.stevenson@xxxxxxxxxxxxxxxxxxx
t: +44 (0) 1491 822472 (direct), +44 (0) 1491 835381 (switchboard)
f: +44 (0) 1491 825483 (direct), +44 (0) 1491 832233 (general)
www.hrwallingford.co.uk