I should mention that the MasterLog on vserv03 contains a number of
these:
02/22 10:10:13 Started DaemonCore process
"/opt/condor/sbin/condor_dbmsd", pid and pgroup = 13940
02/22 10:10:13 The DBMSD (pid 13940) died due to signal 11
(Segmentation fault)
02/22 10:10:13 restarting /opt/condor/sbin/condor_dbmsd in 3600
seconds
and the DbmsdLog reports these:
02/22 10:10:13 main_init() called
02/22 10:10:13 Using Database Type = Postgres
02/22 10:10:13 Using Database IpAddress = vserv03:5432
02/22 10:10:13 Using Database Name = quill_vserv03
02/22 10:10:13 Using Database User = quillwriter
02/22 10:10:13 Connection to database 'quill_vserv03' failed.
02/22 10:10:13 FATAL: connection limit exceeded for non-superusers
02/22 10:10:13 Deallocating connection resources to database
'quill_vserv03'
02/22 10:10:13 config: unable to connect to DB--- ERROR02/22
10:10:13 ERROR "config: unable to connect to DB
" at line 133 in file ManagedDatabase.cpp
Stack dump for process 13940 at timestamp 1298369413 (14 frames)
condor_dbmsd(dprintf_dump_stack+0xb7)[0x5183f0]
condor_dbmsd(_Z18linux_sig_coredumpi+0x2c)[0x50afc8]
/lib64/libpthread.so.0[0x2b1dcd4abb10]
condor_dbmsd(_ZN11DBMSManagerD1Ev+0xbd)[0x4dce1d]
condor_dbmsd[0x4dbf16]
/lib64/libc.so.6(exit+0xe5)[0x2b1dce4cf3a5]
condor_dbmsd(__wrap_exit+0x28)[0x4f3330]
condor_dbmsd[0x516911]
condor_dbmsd(_ZN15ManagedDatabaseC1Ev+0x421)[0x4ddf35]
condor_dbmsd(_ZN11DBMSManager4initEv+0x63)[0x4dc925]
condor_dbmsd(_Z9main_initiPPc+0x2d)[0x4dbfe7]
condor_dbmsd(main+0x18df)[0x50d26b]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2b1dce4b9994]
condor_dbmsd(__gxx_personality_v0+0x411)[0x4dbde9]
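Reading the two logs together, the segfault looks like a side effect of the failed database connection rather than a separate problem. The Postgres FATAL line is followed by the error at line 133 of ManagedDatabase.cpp, and the stack dump appears to show exit() being called from the ManagedDatabase constructor and then crashing in the DBMSManager destructor. For anyone who wants to pull the same failures out of their own DbmsdLog, here is a minimal sketch in Python. It assumes every entry starts with an "MM/DD HH:MM:SS" timestamp as in the excerpt above; the helper name db_failures and the list of marker strings are my own and not part of Condor.

```python
import re

# Assumption: each log entry begins with an "MM/DD HH:MM:SS " timestamp,
# as in the DbmsdLog excerpt above.
LINE_RE = re.compile(r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")

# Substrings treated as database connection failures. These are copied from
# the log excerpt; they are not an official or complete list.
FAILURE_MARKERS = ("FATAL:", "unable to connect to DB", "Connection to database")


def db_failures(log_text):
    """Return (timestamp, message) pairs for lines that report a DB failure."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # wrapped continuation lines, stack frames, and so on
        timestamp, message = match.groups()
        if any(marker in message for marker in FAILURE_MARKERS):
            hits.append((timestamp, message))
    return hits


sample = """02/22 10:10:13 Using Database User = quillwriter
02/22 10:10:13 Connection to database 'quill_vserv03' failed.
02/22 10:10:13 FATAL: connection limit exceeded for non-superusers
"""

for timestamp, message in db_failures(sample):
    print(timestamp, message)
```

Lines that do not start with a timestamp (such as the stack frames) are skipped, so the same function can be run over the whole log file.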
Cheers,
Santanu
Santanu Das wrote:
Dear all,
Every time I try to use condor_history, I get this:
-- Quill: quill@xxxxxxxxxxxxxxxxxxxxxxxx : <vserv03:5432> :
quill_vserv03
-- Database at <vserv03:5432> not reachable
--Failing over to the history file at /home/condorr/spool/history
instead --
Or condor_q returns this:
-- Failed to fetch ads from db [quill_vserv03] at database server
<vserv03:5432>
-- Database not reachable or down.
- Failing over to the quill daemon --
On the box where the Quill database is running (vserv03), I see
these in the log:
02/22 09:48:36 *** Warning: Bad Log file; skipping malformed Attr
List
02/22 09:48:36 >>>>>>>> Fail: Polling Event Log <<<<<<<<
02/22 09:48:36 ******** Start of Polling XML Log ********
02/22 09:48:36 ********* End of Polling XML Log *********
02/22 09:48:36 ++++++++ Sending Quill ad to collector ++++++++
02/22 09:48:36 ++++++++ Sent Quill ad to collector ++++++++
02/22 09:48:36 ******** Start of Polling Job Queue Log ********
02/22 09:48:36 JOB QUEUE POLLING RESULT: NO CHANGE
02/22 09:48:36 ********* End of Polling Job Queue Log *********
02/22 09:48:36 ******** Start of Polling Event Log ********
02/22 09:48:55 failed to create classad; bad expr = username =
"group_camont.camoNEW Rejects
Any idea what's going wrong, or where I should start digging?
Cheers,
Santanu
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx
with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/