Hi,
we are currently experiencing a weird problem at GRIF on our
CREAM+HTCondor cluster.
The Negotiator service refuses to start. We see the messages below [1]
in the log file, and then the daemon crashes.
The farm was draining and is almost empty, so I do not see what could
be wrong...
But I'm really not a Condor expert.
Any hints?
Thanks in advance.
Cheers,
Andrea

[1]
12/19/15 14:24:34 Using config source: /etc/condor/condor_config
12/19/15 14:24:34 Using local config sources:
12/19/15 14:24:34 /etc/condor/config.d/quattor.0.global.conf
12/19/15 14:24:34 /etc/condor/config.d/quattor.1.security.conf
12/19/15 14:24:34 /etc/condor/config.d/quattor.2.params.conf
12/19/15 14:24:34 /etc/condor/config.d/quattor.3.head.conf
12/19/15 14:24:34 /etc/condor/config.d/quattor.4.groups.conf
12/19/15 14:24:34 /etc/condor/condor_config.local
12/19/15 14:24:34 config Macros = 251, Sorted = 251, StringBytes = 13200, TablesBytes = 9124
12/19/15 14:24:34 CLASSAD_CACHING is ENABLED
12/19/15 14:24:34 Daemon Log is logging: D_ALWAYS D_ERROR D_MATCH
12/19/15 14:24:34 DaemonCore: command socket at <134.158.132.147:51957>
12/19/15 14:24:34 DaemonCore: private command socket at <134.158.132.147:51957>
12/19/15 14:24:34 WARNING: Encountered corrupt log record 198 (byte offset 14645)
12/19/15 14:24:34 999
12/19/15 14:24:34 Lines following corrupt log record 198 (up to 3):
12/19/15 14:24:34 103 Customer.group_# # There is insufficient memory for the Java Runtime Environment to continue_ # Cannot create GC thread_ Out of system resources_ # An error report file with more information is saved as: # /var/tmp/hs_err_pid2363_log.default.heslo098@grid AccumulatedUsage 0.0
12/19/15 14:24:34 103 Customer.group_# # There is insufficient memory for the Java Runtime Environment to continue_ # Cannot create GC thread_ Out of system resources_ # An error report file with more information is saved as: # /var/tmp/hs_err_pid2363_log.default.heslo098@grid MyType "*"
12/19/15 14:24:34 103 Customer.group_# # There is insufficient memory for the Java Runtime Environment to continue_ # Cannot create GC thread_ Out of system resources_ # An error report file with more information is saved as: # /var/tmp/hs_err_pid2363_log.default.heslo098@grid WeightedUnchargedTime 0.0
12/19/15 14:24:34 ERROR "Error: corrupt log record 198 (byte offset 14645) occurred inside closed transaction, recovery failed" at line 1293 in file /slots/02/dir_42284/userdir/src/condor_utils/classad_log.cpp
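
To look at the record the Negotiator complains about, one option is to pull the record number and byte offset out of the message and dump the raw bytes around that offset in the file being replayed. The following is only a minimal sketch: the function names are hypothetical, and the file to inspect is an assumption (the "Customer." and "AccumulatedUsage" entries suggest the accountant log kept under the spool directory, but the log above does not name it). It only reads the file; running it against a copy is still safer.

```python
import re

# Matches the complaint as it appears in the log excerpt above.
CORRUPT_RE = re.compile(r"corrupt log record (\d+)\s+\(byte\s+offset\s+(\d+)\)")


def find_corrupt_records(log_text):
    """Return sorted unique (record_number, byte_offset) pairs found in log_text."""
    # Collapse whitespace so messages wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    return sorted({(int(rec), int(off)) for rec, off in CORRUPT_RE.findall(flat)})


def dump_around(path, offset, context=200):
    """Return up to 2*context raw bytes surrounding `offset` in the file at `path`."""
    start = max(0, offset - context)
    with open(path, "rb") as fh:
        fh.seek(start)
        return fh.read(2 * context)


if __name__ == "__main__":
    sample = (
        "12/19/15 14:24:34 WARNING: Encountered corrupt log record 198 (byte\n"
        "offset 14645)\n"
    )
    print(find_corrupt_records(sample))
    # To inspect the file itself, pass the path of a copy (hypothetical name):
    # print(dump_around("/tmp/accountant_log_copy", 14645))
```

The bytes returned by `dump_around` can then be compared with the "Lines following corrupt log record" entries to see where the unexpected text begins.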
_______________________________________________
HTCondor-users mailing list
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/