Mailing List Archives
Re: [Condor-users] Job don't run
- Date: Fri, 13 Jan 2006 16:49:26 +0200
- From: "Anton Kucherov" <anton-k@xxxxxxxxxx>
- Subject: Re: [Condor-users] Job don't run
Hi,
Strange... It all seems normal to me.
What are your job's needs (disk, memory, ...)?
For example, you currently have 256 MB of memory. Maybe your job needs more? (Check its ImageSize and DiskUsage in condor_q.)
Otherwise all the job's requirements are met.
It could also be that you have a busted negotiator. If all of the job's requirements are met in the machines' ClassAds, try restarting the negotiator.
Regards,
Anton Kucherov
Ok,
[aryjr]# condor_status -l
MyType = "Machine"
TargetType = "Job"
Name = "vm1@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
Machine = "titan.solidos.quimica.ufjf.br"
Rank = 0.000000
CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
COLLECTOR_HOST_STRING = "titan.solidos.quimica.ufjf.br"
CondorVersion = "$CondorVersion: 6.6.10 Jun 13 2005 $"
CondorPlatform = "$CondorPlatform: I386-LINUX_RH9 $"
VirtualMachineID = 1
VirtualMemory = 0
Disk = 12679838
CondorLoadAvg = 0.000000
LoadAvg = 0.000000
KeyboardIdle = 25439870
ConsoleIdle = 25439870
Memory = 256
Cpus = 1
StartdIpAddr = "<192.168.1.135:34283>"
Arch = "INTEL"
OpSys = "LINUX"
UidDomain = "solidos.quimica.ufjf.br"
FileSystemDomain = "solidos.quimica.ufjf.br"
Subnet = "192.168.1"
HasIOProxy = TRUE
TotalVirtualMemory = 0
TotalDisk = 25359676
KFlops = 923118
Mips = 2535
LastBenchmark = 1137153746
TotalLoadAvg = 0.000000
TotalCondorLoadAvg = 0.000000
ClockMin = 752
ClockDay = 5
TotalVirtualMachines = 2
HasFileTransfer = TRUE
HasMPI = TRUE
HasJICLocalConfig = TRUE
HasJICLocalStdin = TRUE
JavaVendor = "Sun Microsystems Inc."
JavaVersion = "1.5.0"
JavaMFlops = 503.164429
HasJava = TRUE
HasPVM = TRUE
HasRemoteSyscalls = TRUE
HasCheckpointing = TRUE
StarterAbilityList = "HasFileTransfer,HasMPI,HasJICLocalConfig,HasJICLocalStdin,HasJava,HasPVM,HasRemoteSyscalls,HasCheckpointing"
CpuBusyTime = 0
CpuIsBusy = FALSE
State = "Unclaimed"
EnteredCurrentState = 1137162061
Activity = "Idle"
EnteredCurrentActivity = 1137162061
Start = ((KeyboardIdle > 15 * 60) && (((LoadAvg - CondorLoadAvg) <= 0.300000) || (State != "Unclaimed" && State != "Owner")))
Requirements = START
CurrentRank = 0.000000
DaemonStartTime = 1134353912
UpdateSequenceNumber = 9444
MyAddress = "<192.168.1.135:34283>"
LastHeardFrom = 1137162750
UpdatesTotal = 9445
UpdatesSequenced = 9444
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"
MyType = "Machine"
TargetType = "Job"
Name = "vm2@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
Machine = "titan.solidos.quimica.ufjf.br"
Rank = 0.000000
CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
COLLECTOR_HOST_STRING = "titan.solidos.quimica.ufjf.br"
CondorVersion = "$CondorVersion: 6.6.10 Jun 13 2005 $"
CondorPlatform = "$CondorPlatform: I386-LINUX_RH9 $"
VirtualMachineID = 2
VirtualMemory = 0
Disk = 12679838
CondorLoadAvg = 0.000000
LoadAvg = 0.000000
KeyboardIdle = 628
ConsoleIdle = 25439870
Memory = 256
Cpus = 1
StartdIpAddr = "<192.168.1.135:34283>"
Arch = "INTEL"
OpSys = "LINUX"
UidDomain = "solidos.quimica.ufjf.br"
FileSystemDomain = "solidos.quimica.ufjf.br"
Subnet = "192.168.1"
HasIOProxy = TRUE
TotalVirtualMemory = 0
TotalDisk = 25359676
KFlops = 923118
Mips = 2535
LastBenchmark = 1137153746
TotalLoadAvg = 0.000000
TotalCondorLoadAvg = 0.000000
ClockMin = 752
ClockDay = 5
TotalVirtualMachines = 2
HasFileTransfer = TRUE
HasMPI = TRUE
HasJICLocalConfig = TRUE
HasJICLocalStdin = TRUE
JavaVendor = "Sun Microsystems Inc."
JavaVersion = "1.5.0"
JavaMFlops = 503.164429
HasJava = TRUE
HasPVM = TRUE
HasRemoteSyscalls = TRUE
HasCheckpointing = TRUE
StarterAbilityList = "HasFileTransfer,HasMPI,HasJICLocalConfig,HasJICLocalStdin,HasJava,HasPVM,HasRemoteSyscalls,HasCheckpointing"
CpuBusyTime = 0
CpuIsBusy = FALSE
State = "Owner"
EnteredCurrentState = 1137160503
Activity = "Idle"
EnteredCurrentActivity = 1137160503
Start = ((KeyboardIdle > 15 * 60) && (((LoadAvg - CondorLoadAvg) <= 0.300000) || (State != "Unclaimed" && State != "Owner")))
Requirements = START
CurrentRank = 0.000000
DaemonStartTime = 1134353912
UpdateSequenceNumber = 9410
MyAddress = "<192.168.1.135:34283>"
LastHeardFrom = 1137162751
UpdatesTotal = 9411
UpdatesSequenced = 9410
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"
[]s
Ary Junior
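Both machine ads above carry the same Start expression, and since Requirements = START, that expression decides whether each slot accepts a job. As a rough aid for reading it, here is a minimal Python sketch that re-evaluates it by hand using the attribute values from the two ads. The function name start_allows_job is hypothetical and only mirrors the expression; it is not part of Condor.

```python
# Sketch only: a hand translation of the Start expression shown in the ads above.
# start_allows_job is a hypothetical helper name, not a Condor API.

def start_allows_job(keyboard_idle, load_avg, condor_load_avg, state):
    """Mirror of:
    (KeyboardIdle > 15 * 60) &&
    (((LoadAvg - CondorLoadAvg) <= 0.3) ||
     (State != "Unclaimed" && State != "Owner"))
    """
    idle_long_enough = keyboard_idle > 15 * 60          # more than 900 seconds
    non_condor_load_low = (load_avg - condor_load_avg) <= 0.3
    already_claimed = state != "Unclaimed" and state != "Owner"
    return idle_long_enough and (non_condor_load_low or already_claimed)

# Attribute values copied from the vm1 and vm2 ads above.
vm1 = start_allows_job(25439870, 0.0, 0.0, "Unclaimed")
vm2 = start_allows_job(628, 0.0, 0.0, "Owner")

print("vm1 Start:", vm1)
print("vm2 Start:", vm2)
```

With these values vm1 evaluates to true and vm2 to false: vm2's KeyboardIdle of 628 seconds is below the 900-second threshold, which is consistent with its "Owner" state in this snapshot.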
On 1/13/06, Anton Kucherov <anton-k@xxxxxxxxxx> wrote:
Hi,
All the machines are rejected by your job's requirements, so the job will not start.
It would really help if you sent the "condor_status -l" output.
Regards,
Anton Kucherov
Please, why doesn't my job run? I can run other jobs, but this one stays
idle. Is it a memory problem? How can I check? See my condor_q output
for the job below.
[ti4o8-pbcn-seg-dir]# condor_q -analyze 279
-- Submitter: titan.solidos.quimica.ufjf.br : <192.168.1.135:34282> : titan.solidos.quimica.ufjf.br
 ID      OWNER      SUBMITTED     RUN_TIME ST PRI SIZE CMD
---
279.000:  Run analysis summary.  Of 2 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match, but are serving users with a better priority in the pool
      0 match, but reject the job for unknown reasons
      0 match, but will not currently preempt their existing job
      0 are available to run your job
        Last successful match: Thu Jan 12 09:46:52 2006
        Last failed match: Thu Jan 12 23:23:37 2006
        Reason for last match failure: no match found
WARNING:  Be advised:
   No resources matched request's constraints
   Check the Requirements expression below:
Requirements = (Arch == "INTEL") && (OpSys == "LINUX") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && (HasFileTransfer || (TARGET.FileSystemDomain == MY.FileSystemDomain))
Thanks very much,
Ary Junior
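The Requirements expression reported by condor_q -analyze can be checked clause by clause against a machine ad to see which part rejects the machine. The following Python sketch does that by hand. The machine values come from the condor_status -l output in this thread; the job's ImageSize and DiskUsage never appear in the thread, so the numbers used here are hypothetical placeholders, and failed_clauses is a made-up helper name rather than anything Condor provides. The "* 1024" in the expression suggests Memory is in MB while ImageSize is in KB.

```python
# Sketch only: evaluate each clause of the job's Requirements by hand.
# failed_clauses is a hypothetical helper, not a Condor tool or API.

def failed_clauses(machine, job):
    """Return the names of the Requirements clauses this machine does not satisfy."""
    checks = {
        'Arch == "INTEL"': machine["Arch"] == "INTEL",
        'OpSys == "LINUX"': machine["OpSys"] == "LINUX",
        "Disk >= DiskUsage": machine["Disk"] >= job["DiskUsage"],
        "(Memory * 1024) >= ImageSize": machine["Memory"] * 1024 >= job["ImageSize"],
        "HasFileTransfer || same FileSystemDomain": (
            machine["HasFileTransfer"]
            or machine["FileSystemDomain"] == job["FileSystemDomain"]
        ),
    }
    return [name for name, ok in checks.items() if not ok]

# Values copied from the machine ads posted in this thread.
machine = {
    "Arch": "INTEL",
    "OpSys": "LINUX",
    "Disk": 12679838,
    "Memory": 256,
    "HasFileTransfer": True,
    "FileSystemDomain": "solidos.quimica.ufjf.br",
}

# HYPOTHETICAL job values: the real ImageSize and DiskUsage are not shown in the thread.
job = {
    "ImageSize": 300000,
    "DiskUsage": 1000,
    "FileSystemDomain": "solidos.quimica.ufjf.br",
}

result = failed_clauses(machine, job)
print(result)
```

With the placeholder ImageSize of 300000, only the memory clause fails, because 256 * 1024 = 262144 is smaller than 300000. Substituting the job's real ImageSize and DiskUsage from condor_q would show whether memory or disk is the actual cause.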
_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users