Date: | Fri, 4 Feb 2005 15:46:37 -0500 |
---|---|
From: | "Robert E. Parrott" <parrott@xxxxxxxxxxxxxxxx> |
Subject: | [Condor-users] condor_view server running wild |
Hi,

We are seeing a problem with the view server in 6.6.7, where the condor_collector process runs at 100% CPU utilization all the time. It seems to be functioning correctly, except that it appears to be continually trying to resolve the localhost name. This is the RH9/RHEL3 version running on RHEL3 (actually Rocks 3.3.0), on a cluster with only 45 nodes.

- ltrace on the condor_collector process, after a time, reveals a large number of strcmp("", "*") = -1 lines one after the other, with no other library calls,

- while strace reveals the following cycle, run over and over for different user@somewhere or vm1@node combinations:

    open("/etc/hosts", O_RDONLY) = 6
    fcntl64(6, F_GETFD) = 0
    fcntl64(6, F_SETFD, FD_CLOEXEC) = 0
    fstat64(6, {st_mode=S_IFREG|0644, st_size=721, ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
    read(6, "# \n# Do NOT Edit (generated by d"..., 4096) = 721
    close(6) = 0
    munmap(0xb75e8000, 4096) = 0
    fcntl64(9, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
    fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
    fcntl64(9, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
    fcntl64(9, F_SETFL, O_RDWR) = 0
    open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6
    fstat64(6, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
    ioctl(6, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfffa6e8) = -1 ENOTTY (Inappropriate ioctl for device)
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
    write(6, "< bclee@xxxxxxxxxxxxxxxx , 10.10"..., 39) = 39
    close(6) = 0
    munmap(0xb75e8000, 4096) = 0
    time([1107548628]) = 1107548628
    open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6
    fstat64(6, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
    ioctl(6, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfffa688) = -1 ENOTTY (Inappropriate ioctl for device)
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
    write(6, "LastHeardFrom = 1107548628", 26) = 26
    close(6) = 0
    munmap(0xb75e8000, 4096) = 0
    getsockname(5, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
    getsockname(5, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
    sendto(5, "\0\0\0\0\0\0\352j\0\0\0\0\0\0\0\rAuthMethods = \"F"..., 1179, 0, {sa_family=AF_INET, sin_port=htons(9618), sin_addr=inet_addr("10.101.2.1")}, 16) = 1179
    getsockname(5, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
    read(3, 0xbfffdbe0, 8) = -1 EAGAIN (Resource temporarily unavailable)
    time([1107548628]) = 1107548628
    time(NULL) = 1107548628
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    select(1024, [3 8 9], [], [], {7, 0}) = 1 (in [9], left {7, 0})
    rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV RTMIN], NULL, 8) = 0
    recvfrom(9, "\0\0\0\0\0\0\352j\0\0\0\0\0\0\0\rAuthMethods = \"F"..., 60000, 0, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 1184
    getsockname(9, {sa_family=AF_INET, sin_port=htons(9618), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
    fcntl64(9, F_GETFL) = 0x2 (flags O_RDWR)
    fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
    socket(PF_FILE, SOCK_STREAM, 0) = 6
    connect(6, {sa_family=AF_FILE, path="/var/run/.nscd_socket"}, 110) = -1 ENOENT (No such file or directory)
    close(6)

Any ideas why condor_collector is running full throttle? We
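To see which calls dominate a long trace like the one above, it can help to tally system calls by name rather than read the raw output. The following is only a minimal sketch, not a Condor tool: the function name `count_syscalls` and the `sample` list are hypothetical, and it assumes the strace output has been saved one call per line.

```python
import re
from collections import Counter

# Matches the syscall name at the start of an strace line,
# optionally preceded by a "[pid N]" prefix (as printed by strace -f).
SYSCALL_RE = re.compile(r"^\s*(?:\[pid\s+\d+\]\s+)?(\w+)\(")


def count_syscalls(lines):
    """Return a Counter mapping syscall name -> number of occurrences."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines taken from the trace in this message (illustrative sample only).
sample = [
    'open("/etc/hosts", O_RDONLY) = 6',
    'fcntl64(6, F_GETFD) = 0',
    'fcntl64(6, F_SETFD, FD_CLOEXEC) = 0',
    'read(6, "# \\n# Do NOT Edit (generated by d"..., 4096) = 721',
    'close(6) = 0',
    'open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6',
    'close(6) = 0',
]

if __name__ == "__main__":
    # Print the most frequent calls first.
    for name, n in count_syscalls(sample).most_common():
        print(f"{n:6d}  {name}")
```

On a full capture, a high count of open() calls on /etc/hosts relative to recvfrom() would be consistent with the repeated hostname lookups described above, though the counts alone do not prove the cause.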