Came across this message in the archives, and didn't see a resolution in the thread. If I had to guess, I'd say that condor doesn't realize that you're giving it a full path to java.exe. It is still trying to find it inside the execute dir. Look at this line from the log: > 7/25 16:04:52 (fd:9) (pid:3940) About to exec > C:\condor/execute\dir_3940\"C:\\Program > Files\\Java\\jre1.5.0_06\\bin\\JAVA.EXE" -Xmx247m > -classpath C:\condor/lib;C:\condor/lib/scimark2lib.jar;. - > Dchirp.config=C:\condor\execute\dir_3940\chirp.config > CondorJavaWrapper C:\condor\execute\dir_3940\jvm.start > C:\condor\execute\dir_3940\jvm.end JavaTest While your Java location appears to be set correctly, you can see it's still trying to execute C:\condor/execute\dir_3940\"C:\\Program Files\\Java\\jre1.5.0_06\\bin\\JAVA.EXE" I was having the same problem (which led me to the thread in the archives) and the problem was that condor was looking for java.exe in the execute dir. I then specified the full path using forward slashes, and it works great. You might try that. From my condor_config: JAVA = C:/Windows/System32/Java.exe From my StarterLog, before: 1/16 10:41:47 About to exec C:\condor/execute\dir_4264\JAVA.EXE ... After: 1/16 10:50:47 About to exec C:/Windows/System32/Java.exe ... Michael > From: condor-users-bounces@xxxxxxxxxxx > [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of > Wojtek Goscinski > Sent: Sunday, July 29, 2007 8:19 PM > To: condor-users@xxxxxxxxxxx > Subject: [Condor-users] ACCESS_VIOLATION under Windows > > > > > Hi All, > > I'm experiencing a problem setting up a windows box as a condor execute > node - specifically to execute java jobs. > > I have a windows box running xp sp2. It is purely set up as an execute > node. The start deamon picks successfully picks up the job and attempts to > execute it. It spawns the condor_starter - but the condor_starter seems to > crash with an exception (an ACCESS_VIOLATION). > > As you can see in log below, the starter process seems to try to launch > java, but this ends in an exception? The starter crashes immediately after > that last log. I've confirmed that java exists at the location specified > etc. > > I assume this might be some sort of windows security issue, but I'm not > sure how to debug it. The condor vm user was given rights to execute the > java directory - though i'm not sure whether this is enough. > > Any help or tips for debugging are most welcome. > > -james > > > Start Log > ------------- > > 7/25 16:04:52 (fd:3) (pid:3636) DC_AUTHENTICATE: setting sock->decode() > 7/25 16:04:52 (fd:3) (pid:3636) DC_AUTHENTICATE: allowing an empty message > for sock. > 7/25 16:04:52 (fd:3) (pid:3636) DC_AUTHENTICATE: Success. > 7/25 16:04:52 (fd:3) (pid:3636) DaemonCore: Command received via UDP from > host < 172.19.189.3:9629> > 7/25 16:04:52 (fd:3) (pid:3636) DaemonCore: received command 60011 > (DC_NOP), calling handler (handle_nop()) > 7/25 16:04:52 (fd:3) (pid:3636) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_daemon_core.V6\daemon_core.C:2743 > 7/25 16:04:52 (fd:3) (pid:3636) Calling Handler > <HandleDC_SERVICEWAITPIDS()> for Signal 60009 <DC_SERVICEWAITPIDS> > 7/25 16:04:52 (fd:3) (pid:3636) KEYCACHEX: removing session > hp-test-02:3636:1185343491:6 for <172.19.189.3:9618 > > 7/25 16:04:52 (fd:3) (pid:3636) DaemonCore: pid 3940 exited with status > -1073741819, invoking reaper 1 <reaper> > 7/25 16:04:52 (fd:3) (pid:3636) Starter pid 3940 died on signal -1073741819 > (exception ACCESS_VIOLATION) > 7/25 16:04:52 (fd:3) (pid:3636) Entering ProcFamily::hardkill > 7/25 16:04:52 (fd:3) (pid:3636) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_c++_util\killfamily.C:274 > 7/25 16:04:52 (fd:3) (pid:3636) Destroying Daemon object: > 7/25 16:04:52 (fd:3) (pid:3636) Type: 1 (any), Name: (null), Addr: < > 172.19.189.3:9611> > 7/25 16:04:52 (fd:3) (pid:3636) FullHost: (null), Host: (null), Pool: > (null), Port: -1 > 7/25 16:04:52 (fd:3) (pid:3636) IsLocal: N, IdStr: (null), Error: (null) > 7/25 16:04:52 (fd:3) (pid:3636) --- End of Daemon object info --- > 7/25 16:04:52 (fd:3) (pid:3636) ProcAPI: pid # 3940 was not found > (OpenProcess err=1308) > 7/25 16:04:52 (fd:3) (pid:3636) ProcAPI: pid # 3940 was not found > (OpenProcess err=1308) > 7/25 16:04:52 (fd:3) (pid:3636) ProcFamily: parent: 3940 family: > 7/25 16:04:52 (fd:3) (pid:3636) ProcFamily: alive_cpu_user = 0, exited_cpu > = 0, max_image = 3624k > 7/25 16:04:52 (fd:3) (pid:3636) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_c++_util\killfamily.C:475 > 7/25 16:04:52 (fd:3) (pid:3636) Attempting to remove > C:\condor\execute\dir_3940 as SuperUser (system) > 7/25 16:04:52 (fd:3) (pid:3636) Deleted ProcFamily w/ pid 3940 as parent > 7/25 16:04:52 (fd:3) (pid:3636) State change: starter exited > 7/25 16:04:52 (fd:3) (pid:3636) Changing activity: Busy -> Idle > 7/25 16:04:52 (fd:3) (pid:3636) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_daemon_core.V6\daemon_core.C:2743 > 7/25 16:04:52 (fd:3) (pid:3636) In cancel_timer(), id=66 > 7/25 16:04:52 (fd:3) (pid:3636) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_daemon_core.V6\daemon_core.C:2743 > 7/25 16:04:52 (fd:3) (pid:3636) In DaemonCore Timeout() > 7/25 16:04:52 (fd:3) (pid:3636) > > Starter Log > ---------------- > 7/25 16:04:51 (fd:8) (pid:3940) DC_AUTHENTICATE: setting sock->decode() > 7/25 16:04:51 (fd:8) (pid:3940) DC_AUTHENTICATE: allowing an empty message > for sock. > 7/25 16:04:51 (fd:8) (pid:3940) DC_AUTHENTICATE: Success. > 7/25 16:04:51 (fd:8) (pid:3940) DaemonCore: Command received via UDP from > host < 172.19.189.3:9614> > 7/25 16:04:51 (fd:8) (pid:3940) DaemonCore: received command 60011 > (DC_NOP), calling handler (handle_nop()) > 7/25 16:04:51 (fd:8) (pid:3940) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_daemon_core.V6\daemon_core.C:2743 > 7/25 16:04:51 (fd:8) (pid:3940) Calling Handler > <HandleDC_SERVICEWAITPIDS()> for Signal 60009 <DC_SERVICEWAITPIDS> > 7/25 16:04:51 (fd:8) (pid:3940) DaemonCore: tid 3300 exited with status 1, > invoking reaper 2 <FileTransfer::Reaper()> > 7/25 16:04:51 (fd:8) (pid:3940) File transfer completed successfully. > 7/25 16:04:51 (fd:6) (pid:3940) Destroying Daemon object: > 7/25 16:04:51 (fd:6) (pid:3940) Type: 1 (any), Name: (null), Addr: > <172.19.189.3:9618> > 7/25 16:04:51 (fd:6) (pid:3940) FullHost: (null), Host: (null), Pool: > (null), Port: -1 > 7/25 16:04:51 (fd:6) (pid:3940) IsLocal: N, IdStr: (null), Error: (null) > 7/25 16:04:51 (fd:6) (pid:3940) --- End of Daemon object info --- > 7/25 16:04:52 (fd:6) (pid:3940) Calling client FileTransfer handler > function. > 7/25 16:04:52 (fd:6) (pid:3940) in DaemonCore NewTimer() > 7/25 16:04:52 (fd:6) (pid:3940) > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> Timers > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> ~~~~~~ > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 7, when = 1185343492, > period = 0, handler_descrip=<deferred job start> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 6, when = 1185343551, > period = 0, handler_descrip=<dc_touch_log_file> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 3, when = 1185343731, > period = 240, handler_descrip=<self_monitor> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 1, when = 1185343791, > period = 300, handler_descrip=<check_session_cache> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 5, when = 1185344661, > period = 1170, > handler_descrip=<DaemonCore::SendAliveToParent> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 2, when = 1185345292, > period = 1801, handler_descrip=<handle_cookie_refresh> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 4, when = 1185372290, > period = 0, handler_descrip=<DaemonCore::ReInit()> > 7/25 16:04:52 (fd:6) (pid:3940) > 7/25 16:04:52 (fd:6) (pid:3940) leaving DaemonCore NewTimer, id=7 > 7/25 16:04:52 (fd:6) (pid:3940) Job 71.0 set to execute immediately > 7/25 16:04:52 (fd:6) (pid:3940) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_daemon_core.V6\daemon_core.C:2743 > 7/25 16:04:52 (fd:6) (pid:3940) PRIV_CONDOR --> PRIV_CONDOR at > ..\src\condor_daemon_core.V6\daemon_core.C:2743 > 7/25 16:04:52 (fd:6) (pid:3940) In DaemonCore Timeout() > 7/25 16:04:52 (fd:6) (pid:3940) > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> Timers > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> ~~~~~~ > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 7, when = 1185343492, > period = 0, handler_descrip=<deferred job start> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 6, when = 1185343551, > period = 0, handler_descrip=<dc_touch_log_file> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 3, when = 1185343731, > period = 240, handler_descrip=<self_monitor> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 1, when = 1185343791, > period = 300, handler_descrip=<check_session_cache> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 5, when = 1185344661, > period = 1170, > handler_descrip=<DaemonCore::SendAliveToParent> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 2, when = 1185345292, > period = 1801, handler_descrip=<handle_cookie_refresh> > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore--> id = 4, when = 1185372290, > period = 0, handler_descrip=<DaemonCore::ReInit()> > 7/25 16:04:52 (fd:6) (pid:3940) > 7/25 16:04:52 (fd:6) (pid:3940) DaemonCore: Calling handler for Timer 7 > (deferred job start) > 7/25 16:04:52 (fd:6) (pid:3940) Starting a JAVA universe job with ID: 71.0 > 7/25 16:04:52 (fd:6) (pid:3940) In OsProc::OsProc() > 7/25 16:04:52 (fd:6) (pid:3940) Main job KillSignal: 15 (Unknown) > 7/25 16:04:52 (fd:6) (pid:3940) Main job RmKillSignal: 15 (Unknown) > 7/25 16:04:52 (fd:6) (pid:3940) Main job HoldKillSignal: 15 (Unknown) > 7/25 16:04:52 (fd:6) (pid:3940) SYSAPI_GET_LOADAVG is undefined, using > default value of True > 7/25 16:04:52 (fd:6) (pid:3940) JavaProc: Cmd="C:\\Program > Files\\Java\\jre1.5.0_06\\bin\\JAVA.EXE" > 7/25 16:04:52 (fd:6) (pid:3940) JavaProc: Args=-Xmx247m -classpath > C:\condor/lib;C:\condor/lib/scimark2lib.jar;. > -Dchirp.config=C:\condor\execute\dir_3940\chirp.config > CondorJavaWrapper C:\condor\execute\dir_3940\jvm.start > C:\condor\execute\dir_3940\jvm.end JavaTest > 7/25 16:04:52 (fd:6) (pid:3940) in VanillaProc::StartJob() > 7/25 16:04:52 (fd:6) (pid:3940) in OsProc::StartJob() > 7/25 16:04:52 (fd:6) (pid:3940) IWD: C:\condor/execute\dir_3940 > 7/25 16:04:52 (fd:6) (pid:3940) get_port_range - (LOWPORT,HIGHPORT) is > (9600,9700). > 7/25 16:04:52 (fd:6) (pid:3940) TokenCache contents: > condor-reuse-vm1@. > 7/25 16:04:52 (fd:6) (pid:3940) PRIV_CONDOR --> PRIV_USER at > ..\src\condor_starter.V6.1\os_proc.C:227 > 7/25 16:04:52 (fd:7) (pid:3940) Input file: NUL > 7/25 16:04:52 (fd:8) (pid:3940) Output file: > C:\condor/execute\dir_3940\JavaTest.output.0 > 7/25 16:04:52 (fd:9) (pid:3940) Error file: > C:\condor/execute\dir_3940\JavaTest.error.0 > 7/25 16:04:52 (fd:9) (pid:3940) Doing CONDOR_begin_execution > 7/25 16:04:52 (fd:9) (pid:3940) condor_read(): nfds=0 > 7/25 16:04:52 (fd:9) (pid:3940) condor_read(): nfound=1 > 7/25 16:04:52 (fd:9) (pid:3940) condor_read(): nfds=0 > 7/25 16:04:52 (fd:9) (pid:3940) condor_read(): nfound=1 > 7/25 16:04:52 (fd:9) (pid:3940) Renice expr "10" evaluated to 10 > 7/25 16:04:52 (fd:9) (pid:3940) About to exec > C:\condor/execute\dir_3940\"C:\\Program > Files\\Java\\jre1.5.0_06\\bin\\JAVA.EXE" -Xmx247m > -classpath C:\condor/lib;C:\condor/lib/scimark2lib.jar;. - > Dchirp.config=C:\condor\execute\dir_3940\chirp.config > CondorJavaWrapper C:\condor\execute\dir_3940\jvm.start > C:\condor\execute\dir_3940\jvm.end JavaTest > 7/25 16:04:52 (fd:9) (pid:3940) Env = > _CONDOR_SCRATCH_DIR=C:\condor\execute\dir_3940 > _CONDOR_HIGHPORT=9700 _CONDOR_LOWPORT=9600 > 7/25 16:04:52 (fd:9) (pid:3940) > JOB_INHERITS_STARTER_ENVIRONMENT is undefined, using > default value of False > 7/25 16:04:52 (fd:9) (pid:3940) PRIV_USER --> PRIV_CONDOR at > ..\src\condor_starter.V6.1\os_proc.C:343 > 7/25 16:04:52 (fd:9) (pid:3940) In > DaemonCore::Create_Process(C:\condor/execute\dir_3940\"C:\\Program > Files\\Java\\jre1.5.0_06\\bin\\JAVA.EXE",...) > > > > > > _______________________________________________ > Condor-users mailing list > To unsubscribe, send a message to > condor-users-request@xxxxxxxxxxx with a > subject: Unsubscribe > You can also unsubscribe by visiting > https://lists.cs.wisc.edu/mailman/listinfo/condor-users > > The archives can be found at: > https://lists.cs.wisc.edu/archive/condor-users/ > >
|