[Condor-users] condor_qedit problem
- Date: Sat, 26 May 2012 11:12:13 -0500
- From: Nathan Panike <nwp@xxxxxxxxxxx>
- Subject: [Condor-users] condor_qedit problem
I keep getting tons of messages like these in my Userlog:
022 (129.000.000) 05/25 17:16:20 Job disconnected, attempting to reconnect
Socket between submit and execute hosts closed unexpectedly
Trying to reconnect to slot4@xxxxxxxxxxxxxxxxxxx <128.105.109.35:43267>
...
024 (129.000.000) 05/25 17:16:20 Job reconnection failed
Job not found at execution machine
Can not reconnect to slot4@xxxxxxxxxxxxxxxxxxx, rescheduling job
...
So I believe there is a problem with durga.stat.wisc.edu. I held the job and edited its Requirements to exclude that machine:
$ condor_hold 129
$ condor_qedit 129 Requirements "( TARGET.Arch == \"X86_64\" ) && ( TARGET.OpSys == \"LINUX\" ) && ( TARGET.Disk >= DiskUsage ) && ( ( TARGET.Memory * 1024 ) >= ImageSize ) && ( ( RequestMemory * 1024 ) >= ImageSize ) && ( TARGET.HasFileTransfer ) && ( Machine != \"durga.stat.wisc.edu\" )"
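Retyping the whole expression by hand, as above, is easy to get subtly wrong. As a minimal sketch, the new value could instead be derived from the job's current ad so that only the extra clause is typed. The helper name `exclude_machine` is hypothetical, and the sketch assumes the `Requirements = ...` attribute appears on a single line of `condor_q -l` output, as it does in this message. It has not been tested against a live pool.

```shell
#!/bin/sh
# Hypothetical helper (not part of Condor): reads ClassAd text on stdin,
# takes the first "Requirements = ..." line, and prints that expression
# with an extra clause that excludes one machine.
exclude_machine() {
    current=$(sed -n 's/^Requirements = //p' | head -n 1)
    # Fail if the input had no Requirements line.
    [ -n "$current" ] || return 1
    printf '( %s ) && ( Machine != "%s" )\n' "$current" "$1"
}

# Intended use against a live queue (commented out, needs a Condor pool):
# new_req=$(condor_q -l 129 | exclude_machine durga.stat.wisc.edu) &&
#     condor_qedit 129 Requirements "$new_req"
```

Wrapping the existing expression in one more pair of parentheses leaves its meaning unchanged and avoids any precedence surprises when the new clause is appended.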
The Userlog dutifully reports that the Requirements have changed:
033 (129.000.000) 05/25 21:50:50 Changing job attribute Requirements from ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= DiskUsage ) && ( ( TARGET.Memory * 1024 ) >= ImageSize ) && ( ( RequestMemory * 1024 ) >= ImageSize ) && ( TARGET.HasFileTransfer ) to Requirements = ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= DiskUsage ) && ( ( TARGET.Memory * 1024 ) >= ImageSize ) && ( ( RequestMemory * 1024 ) >= ImageSize ) && ( TARGET.HasFileTransfer ) && ( Machine != "durga.stat.wisc.edu" )
But when I do:
$ condor_q -l 129 | grep Requirements
Requirements = ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= DiskUsage ) && ( ( TARGET.Memory * 1024 ) >= ImageSize ) && ( ( RequestMemory * 1024 ) >= ImageSize ) && ( TARGET.HasFileTransfer )
The requirements do not seem to have changed.
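The check above relies on reading a long expression by eye. A small sketch of the same check as a yes/no test follows. The function name `has_clause` is made up for illustration, and it only does a literal substring match on the `Requirements` line, so it says nothing about why an edit might be missing.

```shell
#!/bin/sh
# Hypothetical helper (not part of Condor): reads ClassAd text on stdin and
# succeeds only if the Requirements line contains the given literal clause.
has_clause() {
    sed -n 's/^Requirements = //p' | grep -F -q -- "$1"
}

# Intended use against a live queue (commented out, needs a Condor pool):
# if condor_q -l 129 | has_clause 'Machine != "durga.stat.wisc.edu"'; then
#     echo "edit is visible in the queue"
# else
#     echo "edit is NOT visible in the queue"
# fi
```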
This is on the UWCS pool:
CondorVersion = "$CondorVersion: 7.6.7 Apr 24 2012 BuildID: 421363 $"
ScheddIpAddr = "<128.105.14.28:37081>"
What am I doing wrong?
Nathan Panike