[Condor-users] condor_view server running wild


Date: Fri, 4 Feb 2005 15:46:37 -0500
From: "Robert E. Parrott" <parrott@xxxxxxxxxxxxxxxx>
Subject: [Condor-users] condor_view server running wild
Hi,

We are seeing a problem with the view server in 6.6.7, where the condor_collector process runs at 100% CPU utilization all the time. It seems to be functioning correctly, except that it appears to be continually trying to resolve the localhost name. This is the RH9/RHEL3 version running on RHEL3 (actually Rocks 3.3.0), on a cluster with only 45 nodes.

- ltrace on the condor_collector process, after a time, reveals a large number of

strcmp("", "*")  = -1

lines one after the other, with no other library calls,

- while strace reveals the following cycle, repeated over and over for different user@somewhere or vm1@node combinations:

open("/etc/hosts", O_RDONLY) = 6
fcntl64(6, F_GETFD) = 0
fcntl64(6, F_SETFD, FD_CLOEXEC) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=721, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
read(6, "# \n# Do NOT Edit (generated by d"..., 4096) = 721
close(6) = 0
munmap(0xb75e8000, 4096) = 0
fcntl64(9, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
fcntl64(9, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(9, F_SETFL, O_RDWR) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6
fstat64(6, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
ioctl(6, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfffa6e8) = -1 ENOTTY (Inappropriate ioctl for device)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
write(6, "< bclee@xxxxxxxxxxxxxxxx , 10.10"..., 39) = 39
close(6) = 0
munmap(0xb75e8000, 4096) = 0
time([1107548628]) = 1107548628
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 6
fstat64(6, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
ioctl(6, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfffa688) = -1 ENOTTY (Inappropriate ioctl for device)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75e8000
write(6, "LastHeardFrom = 1107548628", 26) = 26
close(6) = 0
munmap(0xb75e8000, 4096) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
sendto(5, "\0\0\0\0\0\0\352j\0\0\0\0\0\0\0\rAuthMethods = \"F"..., 1179, 0, {sa_family=AF_INET, sin_port=htons(9618), sin_addr=inet_addr("10.101.2.1")}, 16) = 1179
getsockname(5, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
read(3, 0xbfffdbe0, 8) = -1 EAGAIN (Resource temporarily unavailable)
time([1107548628]) = 1107548628
time(NULL) = 1107548628
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
select(1024, [3 8 9], [], [], {7, 0}) = 1 (in [9], left {7, 0})
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV RTMIN], NULL, 8) = 0
recvfrom(9, "\0\0\0\0\0\0\352j\0\0\0\0\0\0\0\rAuthMethods = \"F"..., 60000, 0, {sa_family=AF_INET, sin_port=htons(44031), sin_addr=inet_addr("10.101.2.1")}, [16]) = 1184
getsockname(9, {sa_family=AF_INET, sin_port=htons(9618), sin_addr=inet_addr("10.101.2.1")}, [16]) = 0
fcntl64(9, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 6
connect(6, {sa_family=AF_FILE, path="/var/run/.nscd_socket"}, 110) = -1 ENOENT (No such file or directory)
close(6)
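
One way to put numbers on the cycle above is to tally the syscalls in a saved trace. The following is only a minimal sketch, assuming the strace output was written to a file (for example with `strace -p <pid> -o collector.strace`; the file name and the `summarize` helper are hypothetical, not part of Condor). It counts how often each syscall appears and which paths are passed to open(), which should make a repeated /etc/hosts read stand out.

```python
import re
from collections import Counter

# Syscall name at the start of an strace line, e.g. open("/etc/hosts", ...).
# An optional leading PID (as printed by strace -f) is skipped.
SYSCALL_RE = re.compile(r'^(?:\d+\s+)?(\w+)\(')
# First (path) argument of an open() call.
OPEN_RE = re.compile(r'^(?:\d+\s+)?open\("([^"]+)"')


def summarize(lines):
    """Return (syscall counts, opened-path counts) for strace output lines."""
    syscalls = Counter()
    opened = Counter()
    for line in lines:
        line = line.strip()
        match = SYSCALL_RE.match(line)
        if not match:
            continue  # blank lines, signal notices, etc.
        syscalls[match.group(1)] += 1
        open_match = OPEN_RE.match(line)
        if open_match:
            opened[open_match.group(1)] += 1
    return syscalls, opened


# Example use (the file name is an assumption):
#   with open("collector.strace") as fh:
#       syscalls, opened = summarize(fh)
#   print(syscalls.most_common(10))
#   print(opened.most_common(5))
```

If /etc/hosts dominates the opened-path counts, that would be consistent with a name lookup being repeated for every ad processed, though it would not by itself explain the cause.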




Any ideas why condor_collector is running full throttle? We

