On 09/25/2015 03:13 PM, Lidie Stephen wrote:
Condor was running fine until we moved user home folders from a traditional NFS server to a Ceph-based server. Now a condor_submit gives:

[lusol@condor condor]$ condor_submit submit.job
Submitting job(s)
ERROR: Can't open "/home/lusol/condor/out.0" with flags 01101 (Value too large for defined data type)

Has anyone a clue as to what might be happening? If I specify that output and error files go to /tmp rather than the Ceph volume, then condor_submit works normally.
I wonder if Ceph is doing something that surprises condor_submit's pre-job-run output file checks. If you run with

condor_submit -disable your_submit_file

does the job run to completion, or do we see file I/O problems later?

-greg
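
One way to narrow this down outside of Condor is to repeat the same open() call by hand. The sketch below is only an illustration, not anything from the Condor source: it decodes the octal flags from the error message using Linux x86 flag values, retries the open with those flags, and reports the errno name. The helper names (decode_flags, try_open) are made up for this example. The inode check at the end rests on an assumption the thread does not confirm: that the "Value too large for defined data type" (EOVERFLOW) error comes from inode numbers that do not fit in 32 bits, which can trip binaries built without large-file support. A 64-bit Python may well succeed where such a binary fails, so an "ok" result here does not rule that out.

```python
import errno
import os
import sys

# Linux open(2) flag values in octal, as on x86/x86-64 (other platforms differ).
ACCESS_MODES = {0o0: "O_RDONLY", 0o1: "O_WRONLY", 0o2: "O_RDWR"}
OTHER_FLAGS = {0o100: "O_CREAT", 0o200: "O_EXCL", 0o1000: "O_TRUNC", 0o2000: "O_APPEND"}


def decode_flags(flags):
    """Return the symbolic names for a numeric open(2) flags value."""
    names = [ACCESS_MODES.get(flags & 0o3, "invalid access mode")]
    names += [name for bit, name in sorted(OTHER_FLAGS.items()) if flags & bit]
    return names


def try_open(path):
    """Open path the way the error message describes; return 'ok' or the errno name.

    Warning: O_TRUNC empties the file if it already exists.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]  # e.g. the output file named in the submit description
    print("flags 01101 =", " | ".join(decode_flags(0o1101)))
    print("open result:", try_open(path))
    # Assumption: inode numbers above 2**32 - 1 are a possible EOVERFLOW trigger.
    ino = os.stat(os.path.dirname(os.path.abspath(path))).st_ino
    print("directory inode:", ino, "(exceeds 32 bits)" if ino > 0xFFFFFFFF else "(fits in 32 bits)")
```

Run it with the path of a scratch file on the Ceph volume as the only argument, then again with a path under /tmp, and compare the two results.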