Re: [condor-users] Job doesn't run
- Date: Tue, 18 May 2004 11:23:42 -0700
- From: "Tim Harsch" <harsch1@xxxxxxxx>
- Subject: Re: [condor-users] Job doesn't run
Alain,
I think you're right, the list will be too slow. Here's the output of
those commands though... I'm beginning to think a reinstall would be the
best option.
[122] dna:/home/condor/condor-6.7.0/sbin> condor_q -l
-- Submitter: dna.llnl.gov : <134.9.102.41:44798> : dna.llnl.gov
[123] dna:/home/condor/condor-6.7.0/sbin> condor_status -l
MyType = "Machine"
TargetType = "Job"
Name = "vm1@xxxxxxxxxxxx"
Machine = "dna.llnl.gov"
Rank = 0.000000
CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
COLLECTOR_HOST_STRING = "dna.llnl.gov"
CondorVersion = "$CondorVersion: 6.7.0 Apr 27 2004 $"
CondorPlatform = "$CondorPlatform: SUN4X-SOLARIS28 $"
VirtualMachineID = 1
VirtualMemory = 2034116
Disk = 487042400
CondorLoadAvg = 0.000000
LoadAvg = 0.010000
KeyboardIdle = 2373
ConsoleIdle = 3525191
Memory = 704
Cpus = 1
StartdIpAddr = "<134.9.102.41:44797>"
Arch = "SUN4u"
OpSys = "SOLARIS28"
UidDomain = "llnl.gov"
FileSystemDomain = "llnl.gov"
Subnet = "134.9.102"
HasIOProxy = TRUE
TotalVirtualMemory = 4068232
TotalDisk = 974084776
KFlops = 60234
Mips = 339
LastBenchmark = 1084898308
TotalLoadAvg = 0.010000
TotalCondorLoadAvg = 0.000000
ClockMin = 618
ClockDay = 2
TotalVirtualMachines = 2
HasFileTransfer = TRUE
HasReconnect = TRUE
HasMPI = TRUE
HasJICLocalConfig = TRUE
HasJICLocalStdin = TRUE
HasRemoteSyscalls = TRUE
HasCheckpointing = TRUE
StarterAbilityList = "HasFileTransfer,HasReconnect,HasMPI,HasJICLocalConfig,HasJICLocalStdin,HasRemoteSyscalls,HasCheckpointing"
CpuBusyTime = 0
CpuIsBusy = FALSE
State = "Unclaimed"
EnteredCurrentState = 1084899509
Activity = "Idle"
EnteredCurrentActivity = 1084899509
Start = ((KeyboardIdle > 15 * 60) && (((LoadAvg - CondorLoadAvg) <= 0.300000) || (State != "Unclaimed" && State != "Owner")))
Requirements = START
CurrentRank = 0.000000
DaemonStartTime = 1084898303
UpdateSequenceNumber = 8
MyAddress = "<134.9.102.41:44797>"
LastHeardFrom = 1084900713
UpdatesTotal = 9
UpdatesSequenced = 8
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"
MyType = "Machine"
TargetType = "Job"
Name = "vm2@xxxxxxxxxxxx"
Machine = "dna.llnl.gov"
Rank = 0.000000
CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
COLLECTOR_HOST_STRING = "dna.llnl.gov"
CondorVersion = "$CondorVersion: 6.7.0 Apr 27 2004 $"
CondorPlatform = "$CondorPlatform: SUN4X-SOLARIS28 $"
VirtualMachineID = 2
VirtualMemory = 2034116
Disk = 487042400
CondorLoadAvg = 0.000000
LoadAvg = 0.000000
KeyboardIdle = 3525191
ConsoleIdle = 3525191
Memory = 704
Cpus = 1
StartdIpAddr = "<134.9.102.41:44797>"
Arch = "SUN4u"
OpSys = "SOLARIS28"
UidDomain = "llnl.gov"
FileSystemDomain = "llnl.gov"
Subnet = "134.9.102"
HasIOProxy = TRUE
TotalVirtualMemory = 4068232
TotalDisk = 974084776
KFlops = 60234
Mips = 339
LastBenchmark = 1084898308
TotalLoadAvg = 0.010000
TotalCondorLoadAvg = 0.000000
ClockMin = 618
ClockDay = 2
TotalVirtualMachines = 2
HasFileTransfer = TRUE
HasReconnect = TRUE
HasMPI = TRUE
HasJICLocalConfig = TRUE
HasJICLocalStdin = TRUE
HasRemoteSyscalls = TRUE
HasCheckpointing = TRUE
StarterAbilityList = "HasFileTransfer,HasReconnect,HasMPI,HasJICLocalConfig,HasJICLocalStdin,HasRemoteSyscalls,HasCheckpointing"
CpuBusyTime = 0
CpuIsBusy = FALSE
State = "Unclaimed"
EnteredCurrentState = 1084898628
Activity = "Idle"
EnteredCurrentActivity = 1084898628
Start = ((KeyboardIdle > 15 * 60) && (((LoadAvg - CondorLoadAvg) <= 0.300000) || (State != "Unclaimed" && State != "Owner")))
Requirements = START
CurrentRank = 0.000000
DaemonStartTime = 1084898303
UpdateSequenceNumber = 9
MyAddress = "<134.9.102.41:44797>"
LastHeardFrom = 1084900714
UpdatesTotal = 10
UpdatesSequenced = 9
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"
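
As a rough check on the machine side, the Start expression above can be hand-translated into Python and fed the values reported for vm1 and vm2. This is only a sketch: it is not how Condor evaluates ClassAds (it ignores undefined attributes and ClassAd evaluation rules), and the function name start and the dictionary layout are made up for illustration.

```python
# Hand translation of the Start expression from the condor_status -l output.
# Not Condor code: plain Python booleans stand in for ClassAd evaluation.
def start(ad):
    idle_long_enough = ad["KeyboardIdle"] > 15 * 60
    low_non_condor_load = (ad["LoadAvg"] - ad["CondorLoadAvg"]) <= 0.3
    already_claimed = ad["State"] != "Unclaimed" and ad["State"] != "Owner"
    return idle_long_enough and (low_non_condor_load or already_claimed)


# Attribute values copied from the two machine ads above.
vm1 = {"KeyboardIdle": 2373, "LoadAvg": 0.01,
       "CondorLoadAvg": 0.0, "State": "Unclaimed"}
vm2 = {"KeyboardIdle": 3525191, "LoadAvg": 0.0,
       "CondorLoadAvg": 0.0, "State": "Unclaimed"}

for name, ad in (("vm1", vm1), ("vm2", vm2)):
    print(name, "Start evaluates to", start(ad))
```

With these values both virtual machines come out willing to start a job, which suggests (but does not prove) that the machine's Requirements = START is not what keeps the job idle; the job's own Requirements would still need checking.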
----- Original Message -----
From: "Alain Roy" <roy@xxxxxxxxxxx>
To: <condor-users@xxxxxxxxxxx>
Sent: Tuesday, May 18, 2004 10:01 AM
Subject: Re: [condor-users] Job doesn't run
>
> >Why wasn't COLLECTOR and NEGOTIATOR in the default config? .. no matter.
>
> It depends on the parameters you gave when you installed Condor. If you
> used "condor_configure", you need to set the "--type" parameter
> appropriately. If the documentation was confusing--our apologies! But if
> you selected different options, you would have had COLLECTOR & NEGOTIATOR
> defined.
>
> >Adding negotiator to the DAEMON_LIST got condor_q -analyze to work. I'm
> >sorry to seem dense, but I'm new to condor... What requirements could my
> >job have that is preventing it from running. Cluster 4.0, had no universe
> >specified so is standard, with 5.0 I specified vanilla. And, what does the
> >warning mean?
>
> You can see the requirements of the job with:
>
> condor_q -l
>
> You can see the requirements of the machines with:
>
> condor_status -l
>
> Look for the "Requirements =" in the output.
>
> In fact, share this with this list, and I can help look at it and find the
> problem.
>
> Tim, I think you're getting hung up on a series of fairly small problems,
> but it's slow to debug via this list. If you want to have a phone call,
> perhaps combined with a VNC session, I can probably get you up and working
> very quickly. If you want to do this, I'll be available on Thursday--let me
> know in personal email.
>
> -alain
>
>
> Condor Support Information:
> http://www.cs.wisc.edu/condor/condor-support/
> To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> unsubscribe condor-users <your_email_address>
>
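
The advice above to look for "Requirements =" in long output can be sketched as a small filter over text saved from condor_q -l or condor_status -l. This assumes the output has been captured to a string and that each attribute sits on one "Name = value" line; the helper name find_attribute is hypothetical and is not part of Condor.

```python
# Minimal sketch: pull every value of one attribute out of saved
# "Attribute = value" output. Lines are split at the first "=", so
# continuation lines of a wrapped expression simply fail to match.
def find_attribute(text, name):
    values = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            values.append(value.strip())
    return values


# A few lines copied from the machine ad earlier in this message.
sample = '''Name = "vm1@xxxxxxxxxxxx"
State = "Unclaimed"
Requirements = START
CurrentRank = 0.000000
'''

print(find_attribute(sample, "Requirements"))
```

Running the same filter over the job ads and the machine ads puts the two Requirements expressions side by side, which is the comparison being asked for.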