
Re: [Condor-users] Quill v6.8.6 is hung up with errors in its database



A little follow-up. I dropped the Quill data from the database and
restarted my Quill daemon. It's been churning away at 100% CPU for a few
hours now rebuilding things. The Quill log has one error in it:

3/26 04:19:33 >>>>>>>> Fail: Probing Job Queue Log File <<<<<<<<
3/26 04:19:33 ++++++++ Sending schedd ad to collector ++++++++
3/26 04:19:33 ++++++++ Sent schedd ad to collector ++++++++
3/26 04:19:43 ******** Start of Probing Job Queue Log File ********
3/26 04:19:43 === Current Probing Information ===
3/26 04:19:43 fsize: 88941197           mtime: 1206476383
3/26 04:19:43 first log entry: 162 CreationTimestamp 1179870092
3/26 04:19:43 POLLING RESULT: INIT
3/26 04:28:58 [Bulk Last Data Sending ERROR] ERROR:  invalid UTF-8 byte
sequence detected near byte 0x94
CONTEXT:  COPY clusterads_str, line 1561, column val:
"((TARGET.AlteraIsDesktop == FALSE) && ((Machine ==
"pg-swph48.altera.com\uffff || Machine==\uffffpg-swph49.alt..."

3/26 04:29:03 >>>>>>>> Fail: Probing Job Queue Log File <<<<<<<<
3/26 04:29:03 ++++++++ Sending schedd ad to collector ++++++++
3/26 04:29:04 ++++++++ Sent schedd ad to collector ++++++++
3/26 04:29:14 ******** Start of Probing Job Queue Log File ********
3/26 04:29:14 === Current Probing Information ===
3/26 04:29:14 fsize: 89194742           mtime: 1206476953
3/26 04:29:14 first log entry: 162 CreationTimestamp 1179870092
3/26 04:29:14 POLLING RESULT: INIT
3/26 04:38:42 [Bulk Last Data Sending ERROR] ERROR:  invalid UTF-8 byte
sequence detected near byte 0x94
CONTEXT:  COPY clusterads_str, line 1561, column val:
"((TARGET.AlteraIsDesktop == FALSE) && ((Machine ==
"pg-swph48.altera.com\uffff || Machine==\uffffpg-swph49.alt..."

3/26 04:38:49 >>>>>>>> Fail: Probing Job Queue Log File <<<<<<<<
3/26 04:38:49 ++++++++ Sending schedd ad to collector ++++++++
3/26 04:38:49 ++++++++ Sent schedd ad to collector ++++++++
3/26 04:38:59 ******** Start of Probing Job Queue Log File ********
3/26 04:38:59 === Current Probing Information ===
3/26 04:38:59 fsize: 89445299           mtime: 1206477538
3/26 04:38:59 first log entry: 162 CreationTimestamp 1179870092
3/26 04:38:59 POLLING RESULT: INIT

I'm really not keen to dump my job_queue.log file. Erik or anyone at the
Condor team: any suggestions? I'm considering editing the offending line
in the job_queue.log file. :)
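
For anyone weighing the same edit, here is a minimal sketch of how the
offending bytes could be located and patched. It rests on assumptions, not
on anything confirmed about Quill or the schedd: bytes 0x93 and 0x94 are the
curly double quotes in Windows-1252, so the sketch assumes the submit file
was written with "smart quotes" and that replacing them with plain ASCII
quotes is the intended fix. The function names and paths are hypothetical,
and it should only ever be pointed at a copy of job_queue.log, never the
live file.

```python
# Assumption: stray 0x93/0x94 bytes are Windows-1252 curly double quotes.
# Undecodable bytes become lone surrogates U+DC80..U+DCFF under the
# "surrogateescape" error handler, so valid UTF-8 sequences that merely
# contain 0x93/0x94 as continuation bytes are left untouched.
STRAY_QUOTES = {"\udc93": '"', "\udc94": '"'}


def find_invalid_utf8_lines(lines):
    """Yield (line_number, offset_in_line, byte_value) for the first
    undecodable byte on each line that is not valid UTF-8."""
    for lineno, line in enumerate(lines, start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            yield lineno, err.start, line[err.start]


def replace_stray_quotes(line):
    """Replace stray 0x93/0x94 bytes with an ASCII double quote."""
    text = line.decode("utf-8", errors="surrogateescape")
    for bad, good in STRAY_QUOTES.items():
        text = text.replace(bad, good)
    # Any other undecodable bytes are written back unchanged.
    return text.encode("utf-8", errors="surrogateescape")


def patch_file(src_path, dst_path):
    """Copy src_path to dst_path with stray quotes replaced.
    Returns the list of (line_number, offset, byte_value) findings."""
    findings = []
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for lineno, line in enumerate(src, start=1):
            for _, offset, value in find_invalid_utf8_lines([line]):
                findings.append((lineno, offset, value))
            dst.write(replace_stray_quotes(line))
    return findings
```

A call such as patch_file("job_queue.log.copy", "job_queue.log.fixed")
(both names hypothetical) would write a patched copy and return the
positions of the bad bytes, so the change can be reviewed before anything
touches the real file.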

- Ian

> -----Original Message-----
> From: Ian Chesal 
> Sent: Tuesday, March 25, 2008 2:43 PM
> To: 'Condor-Users Mail List'
> Subject: Quill v6.8.6 is hung up with errors in its database
> 
> Quill seems to have gotten itself tripped up on something. 
> The following error repeats ad nauseam in my Quill log files:
> 
> 3/26 02:34:50 >>>>>>>> Fail: Probing Job Queue Log File <<<<<<<<
> 3/26 02:34:50 ++++++++ Sending schedd ad to collector ++++++++
> 3/26 02:34:50 ++++++++ Sent schedd ad to collector ++++++++
> 3/26 02:35:00 ******** Start of Probing Job Queue Log File ********
> 3/26 02:35:00 === Current Probing Information ===
> 3/26 02:35:00 fsize: 86241080           mtime: 1206470097
> 3/26 02:35:00 first log entry: 162 CreationTimestamp 1179870092
> 3/26 02:35:00 POLLING RESULT: ADDED
> 3/26 02:35:00 [SQL EXECUTION ERROR2] ERROR:  duplicate key 
> violates unique constraint "history_vertical_pkey"
> 
> 3/26 02:35:00 [SQL: INSERT INTO 
> History_Vertical(cid,pid,attr,val) SELECT cid,pid,attr,val 
> FROM (SELECT cid,pid,attr,val FROM ProcAds WHERE cid= 15971 
> and pid = 6 UNION ALL SELECT cid,6,attr,val FROM ClusterAds 
> WHERE cid=15971 AND attr NOT IN (SELECT attr FROM ProcAds 
> WHERE cid =15971 AND pid =6)) AS T WHERE attr NOT IN 
> ('ClusterId','ProcId','Owner','QDate','RemoteWallClockTime','R
> emoteUserCpu','RemoteSysCpu','ImageSize','JobStatus','JobPrio'
> ,'Cmd','CompletionDate','LastRemoteHost');]
> 3/26 02:35:00 [SQL EXECUTION ERROR2] ERROR:  duplicate key 
> violates unique constraint "history_horizontal_pkey"
> 
> 3/26 02:35:00 [SQL: INSERT INTO 
> History_Horizontal(cid,pid,"EnteredHistoryTable","Owner","QDat
> e","RemoteWallClockTime","RemoteUserCpu","RemoteSysCpu","Image
> Size","JobStatus","JobPrio","Cmd","CompletionDate","LastRemote
> Host") SELECT 15971,6, 'now', max(CASE WHEN attr='Owner' THEN 
> val ELSE NULL END), max(CASE WHEN attr='QDate' THEN cast(val 
> as integer) ELSE NULL END), max(CASE WHEN 
> attr='RemoteWallClockTime' THEN cast(val as integer) ELSE 
> NULL END), max(CASE WHEN attr='RemoteUserCpu' THEN cast(val 
> as float) ELSE NULL END), max(CASE WHEN attr='RemoteSysCpu' 
> THEN cast(val as float) ELSE NULL END), max(CASE WHEN 
> attr='ImageSize' THEN cast(val as integer) ELSE NULL END), 
> max(CASE WHEN attr='JobStatus' THEN cast(val as integer) ELSE 
> NULL END), max(CASE WHEN attr='JobPrio' THEN cast(val as 
> integer) ELSE NULL END), max(CASE WHEN attr='Cmd' THEN val 
> ELSE NULL END), max(CASE WHEN attr='CompletionDate' THEN 
> cast(val as integer) ELSE NULL END), max(CASE WHEN 
> attr='LastRemoteHost' THEN val ELSE NULL END) FROM (SELECT 
> cid,pid,attr,val FROM ProcAds WHERE cid=15971 AND pid=6 UNION 
> ALL SELECT cid,6,attr,val FROM ClusterAds WHERE cid=15971 AND 
> attr NOT IN (SELECT attr FROM procads WHERE cid =15971 AND 
> pid =6)) as T GROUP BY cid,pid;]
> 3/26 02:35:00 [SQL EXECUTION ERROR2] ERROR:  invalid UTF-8 
> byte sequence detected near byte 0x94
> 
> 3/26 02:35:00 [SQL: DELETE FROM ClusterAds_Str WHERE cid = 
> 15983 AND attr = 'AlteraRequirements'; INSERT INTO 
> ClusterAds_Str (cid, attr, val) VALUES (15983, 
> 'AlteraRequirements', '((TARGET.AlteraIsDesktop == FALSE) && 
> ((Machine == "pg-swph48.altera.com\uffff || 
> Machine==\uffffpg-swph49.altera.com")) && (TARGET.OpSys == 
> "LINUX") && (TARGET.Arch =!= UNDEFINED))');]
> 3/26 02:35:00 Set Attribute --- Error [SQL] DELETE FROM 
> ClusterAds_Str WHERE cid = 15983 AND attr = 
> 'AlteraRequirements'; INSERT INTO ClusterAds_Str (cid, attr, 
> val) VALUES (15983, 'AlteraRequirements', 
> '((TARGET.AlteraIsDesktop == FALSE) && ((Machine == 
> "pg-swph48.altera.com\uffff || 
> Machine==\uffffpg-swph49.altera.com")) && (TARGET.OpSys == 
> "LINUX") && (TARGET.Arch =!= UNDEFINED))');
> 3/26 02:35:00 [QUILL] Set Attribute --- ERROR
> 3/26 02:35:00   ERROR:  invalid UTF-8 byte sequence detected 
> near byte 0x94
> 
> 3/26 02:35:00 >>>>>>>> Fail: Probing Job Queue Log File <<<<<<<<
> 3/26 02:35:00 ++++++++ Sending schedd ad to collector ++++++++
> 3/26 02:35:00 ++++++++ Sent schedd ad to collector ++++++++
> 
> This is Condor 6.8.6 running on RHEL4. I'm just reporting 
> this before I dump the database and let the Quill daemon 
> attempt to rebuild it.
> 
> - Ian

