
Re: [Condor-users] Redhat Linux 7.3, kernel 2.4.18 - kernel dump on dual Xeon servers



On Tue, Oct 25, 2005 at 02:24:21PM -0400, Howard, Timothy G      S157 wrote:
> We are getting Linux kernel dumps on several of our cluster nodes (dual
> Xeon servers running Redhat Linux 7.3, kernel 2.4.18) with Condor 6.6.10.
> We are not sure whether our binary or Condor is causing the problem. It
> happens on only a couple of nodes each day, so it is hard to troubleshoot.
> Has anyone else seen similar Linux kernel dumps?
> 

That's a pretty old kernel, and it's known to have problems on SMP machines:

http://seclists.org/lists/linux-kernel/2002/Jun/0785.html

-Erik

> 
> CPU: 1
> EIP: 0010:[<f884a0b4>] Not tainted
> EFLAGS: 00010286
> EIP is at journal_commit_transaction [jbd] 0xb04 (2.4.18-3bigmem)
> eax: 00000016 ebx: 0000000a ecx: c02efac0 edx: 00003f88
> esi: f23be9c0 edi: f5c5e480 ebp: f59f2000 esp: f59f3e78
> Process kjournald (pid: 17, stackpage=f59f3000)
> Stack: f8850ebe 00000217 00000000 0000000n 00000fde f16f4024 00000000 f703b180
> f23b0ea0 0000lcc7 37363534 00000000 00000001 00000000 ecf5de80 f0a7c800
> f0a7e700 f0a7e800 f0a7e900 f0a7e00 f0a7eb00 f14cd400 f146d480 f146d500
> Call Trace: [<f8850ebe>] .rodata.str1.1 [jbd] 0x26c
> [<c0116079>] smp_apic_timer_interrupt [kernel] 0xaq
> [<c01191f87>] schedule [kernel] 0x348
> [<f884c776>] kjournald [jbd] 0x136
> [<f884c620>] commit_timeout [jbd] 0x0
> [<c0107286>] kernel_thread [kernel] 0x26
> [<f884c640>] kjournald [jbd] 0x0
>
> Code: 0f 0b 5a 59 6a 04 8b 44 24 18 50 56 e8 4b f1 ff 8d 47 48
> 
> 
> 
