Dear All,
I've recently moved to Condor 7.4.2 on our central manager/submit host
running Solaris 10 and have found that the schedd seems to be using a
worrying amount of memory. For instance, at present there are only
~150 jobs in the queue, yet the schedd is taking over 900 MB. The
documentation seems to suggest that it should only need around 10 kB
per job, i.e. roughly 1.5 MB for a queue of this size! Since the usage
has been rising monotonically, seemingly since I restarted the daemons
just a few days ago, I can only assume that this is down to a leak.
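In case anyone wants to check for similar growth on their own submit host, here is a minimal sketch for sampling the schedd's resident set size over time (it assumes the daemon shows up as condor_schedd in the process table, and the 10-minute interval and 24-hour run length are arbitrary):

```shell
#!/bin/sh
# Sample the schedd's RSS periodically so the growth rate can be
# attached to a bug report. Works with Solaris 10 and Linux ps.

rss_kb() {
    # Resident set size in kB for a given pid.
    ps -o rss= -p "$1" | tr -d ' '
}

# The [d] trick stops pgrep -f from matching this script's own
# command line; adjust the pattern if your daemon is named differently.
pid=$(pgrep -f 'condor_sched[d]' | head -1)

i=0
while [ -n "$pid" ] && [ "$i" -lt 144 ]; do   # 144 samples = 24 h
    printf '%s %s kB\n' "$(date)" "$(rss_kb "$pid")"
    sleep 600
    i=$((i + 1))
done
```

Plotting the output should make it obvious whether the usage is genuinely monotonic or just settling at a high plateau.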
The net result of this is that condor_q etc. can be very slow to
respond (more than five minutes on occasion), and it is difficult to
submit more than ~1000 jobs at once, whereas previously there was no
problem with 10 000 jobs. As far as I can see the auto-clustering is
working fine, although I sometimes see messages in the schedd log
about rebuilding tables.
Has anyone else seen this on other systems? Any suggestions for a fix
or workaround?
regards,
-ian.
--------------------------------------------
Dr Ian C. Smith,
Advanced Research Computing (e-Science) Team,
The University of Liverpool,
Computing Services Department.