[Condor-users] Error running jobs on heterogeneous cluster
- Date: Wed, 24 Oct 2007 10:22:26 +0200
- From: "Atle Rudshaug" <atle.rudshaug@xxxxxxxxx>
- Subject: [Condor-users] Error running jobs on heterogeneous cluster
I have a test cluster with one Debian, one Kubuntu and one Fedora
node. I get different errors on all the nodes. I guess I need a local
executable on every node, compiled for that specific distro? Is there
some kind of requirement I can state in the submit file to specify
which distro the executable needs in order to run? Is there some way
to send along the libraries my executable needs, or do they have to be
at the same path on each node? Can I keep them on NFS? I guess I would
then need to compile with NFS paths to the library files in the
Makefile?
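One way the distro question could be approached is sketched below. It assumes each execute node advertises a custom machine ClassAd attribute, here hypothetically named DISTRO, with values such as "Debian", "Kubuntu" and "Fedora". Condor does not define such an attribute by itself in this setup; the pool administrator would have to add it to each node's local configuration and publish it through the startd (for example via STARTD_ATTRS, or STARTD_EXPRS on older versions). The per-distro executable names are also assumptions.

```
# Sketch only: DISTRO is a hypothetical, admin-defined machine attribute.

# Variant 1: run only on nodes that advertise DISTRO == "Debian".
universe     = vanilla
executable   = dagoc
requirements = (DISTRO == "Debian")
queue 5

# Variant 2 (alternative to variant 1): keep one build per distro next
# to the submit file (dagoc.Debian, dagoc.Kubuntu, dagoc.Fedora) and let
# Condor pick the one matching the machine the job lands on.
# $$(DISTRO) is replaced with the value from the matched machine's ad.
# universe   = vanilla
# executable = dagoc.$$(DISTRO)
# queue 5
```

Machines that do not advertise DISTRO would not match variant 1 at all, so jobs stay idle rather than fail if the attribute is missing.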
#Submit file
universe = vanilla
executable = dagoc
output = dagoc.out.$(CLUSTER).$(PROCESS)
error = dagoc.err.$(CLUSTER).$(PROCESS)
log = dagoc.log.$(CLUSTER)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = /mnt/dagocproject/dbases/TEST.db
arguments = -c -start=10 -stop=20 /mnt/dagocproject/setups/TEST_remote.sup
queue 5
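For the question about shipping libraries with the job, a minimal sketch is to add them to the file transfer list and point the dynamic loader at the job's working directory. The library name libmylib.so is hypothetical and stands in for whatever shared objects dagoc actually links against. This would not by itself address a mismatch in the system C library, which is normally not something to ship alongside a job.

```
# Sketch only: libmylib.so is a placeholder for the job's own libraries.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
# Transferred files land in the job's scratch directory on the node.
transfer_input_files    = /mnt/dagocproject/dbases/TEST.db, libmylib.so
# Let the loader search the scratch directory (the job's working dir).
environment             = "LD_LIBRARY_PATH=."
```

If the libraries instead live on an NFS mount that every node sees at the same path, the transfer line can stay as it was and LD_LIBRARY_PATH can point at that shared directory.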
What does the following error mean?
dagoc.err.102.0 and dagoc.err.102.4
------------------------------------------------------------------------------------
condor_exec.exe: symbol lookup error: condor_exec.exe: undefined symbol: _ZSt22__uninitialized_copy_aIN9__gnu_cxx17__normal_iteratorIPKSsSt6vectorISsSaISsEEEEPSsSsET0_T_SA_S9_SaIT1_E
For this one, I assume I need to compile the executable on the node that reported the error.
dagoc.err.102.1
------------------------------------------------------------------------------------
condor_exec.exe: /lib/tls/i686/cmov/libc.so.6: version `GLIBC_2.4' not found (required by condor_exec.exe)
dagoc.log.11:
------------------------------------------------------------------------------------
000 (102.000.000) 10/24 09:37:23 Job submitted from host: <xxx.247>
...
000 (102.001.000) 10/24 09:37:23 Job submitted from host: <xxx.247>
...
000 (102.002.000) 10/24 09:37:23 Job submitted from host: <xxx.247>
...
000 (102.003.000) 10/24 09:37:23 Job submitted from host: <xxx.247>
...
000 (102.004.000) 10/24 09:37:23 Job submitted from host: <xxx.247>
...
001 (102.000.000) 10/24 09:37:30 Job executing on host: <xxx.251>
...
005 (102.000.000) 10/24 09:37:32 Job terminated.
(1) Normal termination (return value 127)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
819 - Run Bytes Sent By Job
23796974 - Run Bytes Received By Job
819 - Total Bytes Sent By Job
23796974 - Total Bytes Received By Job
...
001 (102.001.000) 10/24 09:37:32 Job executing on host: <xxx.245>
...
005 (102.001.000) 10/24 09:37:32 Job terminated.
(1) Normal termination (return value 1)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
107 - Run Bytes Sent By Job
23796974 - Run Bytes Received By Job
107 - Total Bytes Sent By Job
23796974 - Total Bytes Received By Job
...
001 (102.002.000) 10/24 09:37:32 Job executing on host: <xxx.247>
...
001 (102.003.000) 10/24 09:37:34 Job executing on host: <xxx.247>
...
001 (102.004.000) 10/24 09:37:39 Job executing on host: <xxx.251>
...
005 (102.004.000) 10/24 09:37:39 Job terminated.
(1) Normal termination (return value 127)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
819 - Run Bytes Sent By Job
23796974 - Run Bytes Received By Job
819 - Total Bytes Sent By Job
23796974 - Total Bytes Received By Job
...
005 (102.002.000) 10/24 09:37:46 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:07, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:07, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
4536073 - Run Bytes Sent By Job
23796974 - Run Bytes Received By Job
4536073 - Total Bytes Sent By Job
23796974 - Total Bytes Received By Job
...
005 (102.003.000) 10/24 09:37:48 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:07, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:07, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
4536071 - Run Bytes Sent By Job
23796974 - Run Bytes Received By Job
4536071 - Total Bytes Sent By Job
23796974 - Total Bytes Received By Job
...