Hi all,
I found the problem.
My cluster does not have a shared file system (NFS etc.).
Although I had put ‘should_transfer_file=yes’ in the job submit script,
I had omitted ‘TransferFiles=ALWAYS’.
Including this line solved the problem.
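For reference, a minimal sketch of a submit description file for a pool without a shared file system. It assumes a vanilla universe job, and the executable and file names are hypothetical. It uses the should_transfer_files / when_to_transfer_output pair as documented in the Condor manual rather than the exact lines quoted above, so check the spelling against the manual for your version.

```
# Hypothetical vanilla universe job; all file names are placeholders.
universe                = vanilla
executable              = myjob
input                   = myjob.in
output                  = myjob.out
error                   = myjob.err
log                     = myjob.log

# No shared file system: ask Condor to copy files to and from the execute machine.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

queue
```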
Thanks to those who provided suggestions.
Cheers,
Sandy
Computer Officer, RA Certification Manager Department of Computer Science - UWA Llandinam Building Penglais Campus Aberystwyth Ceredigion Wales - UK SY23 3DB Tel: (01970)-622433 Fax: (01970)-628536
-----Original Message-----
Hi, I am pretty new to Condor too, so I am not sure I have this right. Anyway, if the job is not running, shouldn’t we be looking at the Schedd and Shadow logs on the submitting machine and the Start and Starter logs on the remote host?
Unless your job is having a problem getting a match, I do not think the collector has much part to play if your remote host is already rejecting the job.
Raymond Wong System Engineer DID: 7358 Pager: 98028590
-----Original Message-----
Hi Again,
Sorry, I should have said: I am running Condor version 6.6.1.
I have also looked in the Collector log and there is a message:
condor_write(): Socket closed when trying to write buffer
Buf::write(): condor_write() failed
Cheers,
Sandy
-----Original Message-----
Hi,
I have set up a test cluster of two nodes, one master and one slave. I can submit a test job locally on each machine and both run to completion. However, if I set the START macro to False on the master and resubmit the job, it is rejected by the slave with the message ‘rejected by your job’s requirements’.
Does anyone have any suggestions on where I might begin to track down this problem?
Cheers,
Sandy