Mailing List Archives
Re: [Condor-users] Problem with Windows XP workers
- Date: Mon, 20 Dec 2004 12:43:28 +0200
- From: George Kakarontzas <gkakaron@xxxxxxxxx>
- Subject: Re: [Condor-users] Problem with Windows XP workers
Hi John,
I'm afraid this didn't solve the problem.
Any further suggestions are more than welcome.
Thanks
George
> Some people have sometimes had to grant condor-vm administrator access
> under XP. Perhaps that will clear up the problem.
> If not, remember to remove that access.
>
> JW
Hi all,
I have a Condor master running on Mandrake Linux 10 with a number of
worker machines running Windows XP. All (master and workers) have Condor
6.7.2 with the Java Universe.
The workers are behind a firewall, but I opened all the required ports
(the standard ones plus the ephemeral range 29000-40000, set with LOWPORT
and HIGHPORT respectively) and they can communicate. I can't see any
problems with the firewall.
When I submit a Java job from the Condor master targeted at any of these
XP workers, the job is assigned to the worker and condor_q shows the
job as running. condor_status shows the worker as Claimed and Busy.
The job never finishes, though, and after a while I get the
following:
On the submitter side the log says:
007 (090.000.000) 12/18 14:05:43 Shadow exception!
Can no longer talk to condor_starter <194.42.54.104:1034>
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
repeated many times.
On the worker side the Starter log says:
12/18 14:09:16 ******************************************************
12/18 14:09:16 ** condor_starter (CONDOR_STARTER) STARTING UP
12/18 14:09:16 ** C:\Condor\bin\condor_starter.exe
12/18 14:09:16 ** $CondorVersion: 6.7.2 Oct 5 2004 $
12/18 14:09:16 ** $CondorPlatform: INTEL-WINNT40 $
12/18 14:09:16 ** PID = 3504
12/18 14:09:16 ******************************************************
12/18 14:09:16 Using config file: C:\Condor\condor_config
12/18 14:09:16 Using local config files: C:\Condor/condor_config.local
12/18 14:09:16 DaemonCore: Command Socket at <194.42.54.104:3127>
12/18 14:09:16 Setting resource limits not implemented!
12/18 14:09:16 Communicating with shadow <195.251.124.82:38235>
12/18 14:09:16 Submitting machine is "gkakaron.teilar.gr"
12/18 14:09:16 Initialized IO Proxy.
12/18 14:09:19 getpeername failed so connect must have failed
12/18 14:09:48 Connect failed for 30 seconds; returning FALSE
12/18 14:09:48 FileTransfer: Unable to connect to server <195.251.124.82:38235>
12/18 14:09:48 ERROR "Could not initiate file transfer" at line 1404 in file ..\src\condor_starter.V6.1\jic_shadow.C
12/18 14:09:48 ShutdownFast all jobs.
-------------------------------------
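As a quick cross-check of the addresses in the two logs against the port range I opened, here is a minimal Python sketch. The helper names port_of and in_range are made up for illustration, and the range is the one I quoted above; whether a port outside the range actually matters depends on which side initiates the connection and on how each machine is configured, so this only tabulates the numbers.

```python
import re

# Port range opened on the firewall (LOWPORT and HIGHPORT as quoted above).
LOWPORT, HIGHPORT = 29000, 40000

# Addresses copied verbatim from the log excerpts above.
ADDRESSES = [
    "<194.42.54.104:1034>",    # starter, from the shadow exception
    "<194.42.54.104:3127>",    # starter command socket
    "<195.251.124.82:38235>",  # shadow on the submitting machine
]


def port_of(address):
    """Extract the port number from an '<ip:port>' string."""
    match = re.fullmatch(r"<(\d+\.\d+\.\d+\.\d+):(\d+)>", address)
    if match is None:
        raise ValueError("not an <ip:port> address: %r" % address)
    return int(match.group(2))


def in_range(address, low=LOWPORT, high=HIGHPORT):
    """Return True if the address's port lies within [low, high]."""
    return low <= port_of(address) <= high


for address in ADDRESSES:
    status = "inside" if in_range(address) else "outside"
    print(address, "is", status, "the range", LOWPORT, "-", HIGHPORT)
```

With the addresses above, the shadow's port falls inside the range while both ports on the worker fall outside it.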
The submitting machine hostname is gkakaron.teilar.gr with IP address
195.251.124.82.
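Since the Starter log reports that the connect to the shadow failed, one way to test the path independently of Condor is a plain TCP connect from the worker to the submitting machine. This is only a sketch: can_connect is a hypothetical helper, not part of Condor, and the shadow's port is likely to differ from run to run, so the port to test has to be read from a current Starter log while a job is being attempted.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example, run on the worker, using the address from the log above:
# print(can_connect("195.251.124.82", 38235))
```

A False result from the worker while the shadow is still running would point at the network path (firewall or routing) rather than at the job itself.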
Can anyone see what's wrong here?
There is also one more thing, and I don't know whether it's important or not.
When I set up Condor on the Windows XP machines I didn't use the
domain administrator account but the local administrator
account.
Thanks in advance
George