[Condor-users] Torture test

Date: Wed, 23 Jun 2004 12:19:09 +0200
From: Ralf Reinhardt <ralf.reinhardt@xxxxxxxxxxx>
Subject: [Condor-users] Torture test

Hi,

I am writing a small frontend for bioinformatics tasks. Its users will be largely unaware of the cluster behind it. Since the cluster (128 CPUs) should run without continuous supervision, I ran some torture tests with many very small jobs. The result is zombie jobs: they have finished successfully but are still listed as running on their nodes, and they slowly block the whole cluster.

Questions:
- Can this be avoided?
- If not: is there a better way to get the system back in sync than removing all jobs with the forcex option?

Cheers,

Ralf
