Re: [Condor-users] Condor Java problem


Date: Mon, 14 Feb 2005 13:24:45 -0600
From: "David A. Kotz" <dkotz@xxxxxxxxxxxxx>
Subject: Re: [Condor-users] Condor Java problem
I've had no luck with it, I'm afraid.  I'm hoping that when we upgrade
from Debian Woody to something less archaic, the problem will just go
away.  I can dream.

- dave


On Mon, 2005-02-14 at 18:39 +0000, M Swain wrote:
> Hi David
> 
> I am suffering from the same Java problem that was discussed about a month ago: Java can be seen with condor_starter -classad, but not with condor_status -java. If I run the wrapper script as suggested, libjvm.so can't be found.
>  
> The only difference is that I am not starting condor with init. If you made any progress with this problem, or if anyone else has any further suggestions, I would very much like to hear from you.
> 
> Cheers
> 
> Martin
> 
> Thanks for the suggestion.  It does seem to be an issue with the
> environment.  Now I just have to figure out exactly what the issue might
> be.  I'm starting Condor with init, but the environment dumped from the
> condor_starter wrapper looks the same as the usual root environment,
> from which condor_starter -classad works correctly.  I'll keep digging.
> Maybe I missed a difference.  For now, I'm getting the following from
> the wrapper's attempt to start Java:
> 
> 
>   Error occurred during initialization of VM
>   Unable to load native library: libjvm.so: cannot open shared object file: No such file or directory
> 
> 
> - dave
> 
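The error above means the dynamic loader could not find libjvm.so at the moment Java started. A minimal sketch for locating the library and checking whether its directory is named in LD_LIBRARY_PATH follows. The helper names `dir_in_list` and `check_libjvm` are hypothetical, and the install path in the example is an assumption, not something taken from this thread.

```shell
#!/bin/sh
# Return success if directory $1 appears in the colon-separated list $2.
dir_in_list() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# List every libjvm.so under a JVM install directory ($1) and report
# whether the directory holding it is named in LD_LIBRARY_PATH.
check_libjvm() {
    find "$1" -name libjvm.so 2>/dev/null | while read -r lib; do
        dir=$(dirname "$lib")
        if dir_in_list "$dir" "${LD_LIBRARY_PATH:-}"; then
            echo "on LD_LIBRARY_PATH: $lib"
        else
            echo "not on LD_LIBRARY_PATH: $lib"
        fi
    done
}

# Example (the install path is an assumption):
# check_libjvm /usr/local/j2sdk1.4.2
```

A "not on" result does not prove the cause, since the java launcher may locate the library by other means. It is one more thing to compare between the working shell and the environment the wrapper sees.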
> On Fri, 2005-01-07 at 12:15 -0600, Erik Paulson wrote:
> > On Fri, Jan 07, 2005 at 11:54:34AM -0600, David A. Kotz wrote:
> > > None of the Linux machines in our department will acknowledge having
> > > Java in their classads.  If I manually run condor_starter, it shows up.
> > > I've run the test as myself, root, and condor, and in all cases, Java is
> > > detected by condor_starter.  On the Suns, Java seems to behave
> > > correctly.  Any ideas would be appreciated.  (I believe I mentioned
> > > before that Condor hates me.)
> > > 
> > 
> > Almost certainly this means it's some sort of path problem - either in
> > the path to java or some sort of dynamic library path. The condor_starter
> > gets its environment from the condor_startd, which gets it from the
> > condor_master. If your master is being started up by init when the
> > machine boots, your path might be kind of sparse.
> > 
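One way to test the sparse-environment theory from an interactive shell is to run the same check with nearly everything stripped from the environment. This is only a rough sketch: the PATH value is an assumed stand-in for whatever init really provides on your machines, and `run_sparse` is a hypothetical helper name.

```shell
#!/bin/sh
# Run a command with an almost empty environment, roughly imitating a
# daemon started by init at boot. The PATH below is an assumption;
# substitute whatever your init scripts actually set.
run_sparse() {
    env -i PATH=/sbin:/bin:/usr/sbin:/usr/bin "$@"
}

# Example: does the starter still detect Java without the login environment?
# Give the full path to condor_starter if it is not on the PATH above.
# run_sparse condor_starter -classad | grep -i java
```

If Java disappears from the output when run this way, that would point toward a variable the login shell sets and the boot-time environment lacks.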
> > If you can't figure it out, the easiest way to fix it is to wrap 
> > the condor_starter in a shell script and see what environment it's being
> > invoked with. Rename condor_starter to be condor_starter.bin, and then
> > put this in a shell script and name it condor_starter:
> > 
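> > #!/bin/sh
> > 
> > # Append the starter's environment to a log, then run the real starter.
> > echo >>/tmp/StarterEnvDump.txt
> > echo "Starting the starter" >>/tmp/StarterEnvDump.txt
> > date >>/tmp/StarterEnvDump.txt
> > env >>/tmp/StarterEnvDump.txt
> > # java -version prints to stderr, so capture that stream too.
> > java -version >>/tmp/StarterEnvDump.txt 2>&1
> > exec condor_starter.bin "$@"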
> > 
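Once the wrapper has written a dump, the useful part is comparing it with an environment where Java is detected. Below is a minimal sketch of that comparison, assuming both environments have been saved as plain `env` output. The helper name `env_diff` and the second file name are hypothetical.

```shell
#!/bin/sh
# Print the lines that differ between two environment dumps.
# Lines found only in the first file are prefixed "<", lines found
# only in the second are prefixed ">".
env_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || return 1
    sort "$1" >"$a"
    sort "$2" >"$b"
    diff "$a" "$b" | grep '^[<>]'
    rm -f "$a" "$b"
}

# Example (the second file name is an assumption):
# env >/tmp/LoginEnv.txt
# env_diff /tmp/StarterEnvDump.txt /tmp/LoginEnv.txt
```

The wrapper's dump also contains the date, the marker line, and the java output, and it grows with every starter launch, so expect some noise in the "<" lines. Differences in PATH and LD_LIBRARY_PATH are the ones most likely to matter here.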
> > > Below is the output from my desktop.
> > > 
> > > - dave
> > > 
> > > _________________________________
> > > 
> > > keemun $ condor_starter -classad
> > > CondorVersion = "$CondorVersion: 6.6.6 Jul 26 2004 $"
> > > IsDaemonCore = True
> > > HasFileTransfer = True
> > > HasMPI = True
> > > HasJICLocalConfig = True
> > > HasJICLocalStdin = True
> > > JavaVendor = "Sun Microsystems Inc."
> > > JavaVersion = "1.4.2"
> > > JavaMFlops = 135.990906
> > > HasJava = True
> > > 
> > > _________________________________
> > > 
> > > keemun $ condor_status -l keemun
> > > MyType = "Machine"
> > > TargetType = "Job"
> > > Name = "vm1@xxxxxxxxxxxxxxxxxxxx"
> > > Machine = "keemun.cs.utexas.edu"
> > > Rank = ((TARGET.Group =?= "CARTEL") * 3) + ((TARGET.Group =?= "PROF") * 3) + ((TARGET.Group =?= "GRAD") * 3) + ((TARGET.Group =?= "UNDER") * 2)
> > > CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
> > > COLLECTOR_HOST_STRING = "ungoliant.cs.utexas.edu"
> > > CKPT_SERVER_HOST = ungoliant.cs.utexas.edu
> > > CondorVersion = "$CondorVersion: 6.6.6 Jul 26 2004 $"
> > > CondorPlatform = "$CondorPlatform: I386-LINUX_RH72 $"
> > > VirtualMachineID = 1
> > > VirtualMemory = 525622
> > > Disk = 412332
> > > CondorLoadAvg = 0.000000
> > > LoadAvg = 0.020000
> > > KeyboardIdle = 65
> > > ConsoleIdle = 65
> > > Memory = 250
> > > Cpus = 1
> > > StartdIpAddr = "<128.83.120.125:32774>"
> > > Arch = "INTEL"
> > > OpSys = "LINUX"
> > > UidDomain = "cs.utexas.edu"
> > > FileSystemDomain = "cs.utexas.edu"
> > > Subnet = "128.83.120"
> > > HasIOProxy = TRUE
> > > TotalVirtualMemory = 1051244
> > > TotalDisk = 824664
> > > KFlops = 618930
> > > Mips = 1368
> > > LastBenchmark = 1105102361
> > > TotalLoadAvg = 0.020000
> > > TotalCondorLoadAvg = 0.000000
> > > ClockMin = 692
> > > ClockDay = 5
> > > TotalVirtualMachines = 2
> > > HasFileTransfer = TRUE
> > > HasMPI = TRUE
> > > HasJICLocalConfig = TRUE
> > > HasJICLocalStdin = TRUE
> > > HasPVM = TRUE
> > > HasRemoteSyscalls = TRUE
> > > HasCheckpointing = TRUE
> > > StarterAbilityList = "HasFileTransfer,HasMPI,HasJICLocalConfig,HasJICLocalStdin,HasPVM,HasRemoteSyscalls,HasCheckpointing"
> > > CpuBusyTime = 0
> > > CpuIsBusy = FALSE
> > > State = "Owner"
> > > EnteredCurrentState = 1105111961
> > > Activity = "Idle"
> > > EnteredCurrentActivity = 1105111961
> > > Start = ((KeyboardIdle > 15 * 60) && (((LoadAvg - CondorLoadAvg) <= 0.300000) || (State != "Unclaimed" && State != "Owner")) && ((TARGET.Project =?= "ARCHITECTURE") || (TARGET.Project =?= "FORMAL_METHODS") || (TARGET.Project =?= "AI_ROBOTICS") || (TARGET.Project =?= "OPERATING_DISTRIBUTED_SYSTEMS") || (TARGET.Project =?= "NETWORKING_MULTIMEDIA") || (TARGET.Project =?= "PROGRAMMING_LANGUAGES") || (TARGET.Project =?= "THEORY") || (TARGET.Project =?= "GRAPHICS_VISUALIZATION") || (TARGET.Project =?= "COMPONENT_BASED_SOFTWARE") || (TARGET.Project =?= "SCIENTIFIC_COMPUTING") || (TARGET.Project =?= "COMPUTATIONAL_BIOLOGY") || (TARGET.Project =?= "INSTRUCTIONAL") || (TARGET.Project =?= "UTGRID") || (TARGET.Project =?= "OTHER")) && (TARGET.ProjectDescription =!= UNDEFINED))
> > > Requirements = START
> > > CurrentRank = 0.000000
> > > DaemonStartTime = 1105049249
> > > UpdateSequenceNumber = 266
> > > MyAddress = "<128.83.120.125:32774>"
> > > LastHeardFrom = 1105119166
> > > UpdatesTotal = 1542
> > > UpdatesSequenced = 1539
> > > UpdatesLost = 0
> > > UpdatesHistory = "0x00000000000000000000000000000000"
> > > 
> > > 
> > > 
> > > -- 
> > > David A. Kotz <dkotz@xxxxxxxxxxxxx>
> > > 
> > > _______________________________________________
> > > Condor-users mailing list
> > > Condor-users@xxxxxxxxxxx
> > > http://lists.cs.wisc.edu/mailman/listinfo/condor-users
-- 
David A. Kotz <dkotz@xxxxxxxxxxxxx>

