
Re: [HTCondor-users] [EXTERN] Re: problem with Procd



Hey Greg, hey Benoit,

Thanks for the help!
I started an EP after setting USE_PROCD = False, and it's still running.

Cheers
Dirk

On 07/06/2024 16:05, Greg Thain via HTCondor-users wrote:
On 6/7/24 07:27, Benoit Roland wrote:
Hi Dirk,

it seems like you got a short read as described here [1].

There isn't much explanation of why that happens, but for the time being you could disable the condor_procd by setting:

USE_PROCD = False

in your condor configuration and restart condor afterwards:


Hi Dirk:


Benoit is correct in this assessment. We used to see this problem intermittently on el7 systems, and could never get to the bottom of why /proc was giving us incorrect information. We never saw the problem after we upgraded our EPs to el8. Note this is a function of the kernel, and as containers use the kernel of the host, it is the host kernel that matters.


-greg
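Since the host kernel is what matters here, a quick check from inside the container can confirm which kernel the EP actually sees. Below is a minimal Python sketch, assuming that el7 hosts report a 3.10 kernel or a release string containing ".el7" (as CentOS 7 does). The function names are hypothetical and not part of HTCondor.

```python
import platform
import re
from typing import Tuple


def kernel_major_minor(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a kernel release string such as '3.10.0-1160.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def looks_like_el7_kernel(release: str) -> bool:
    """Heuristic (an assumption, not an HTCondor check): el7 kernels are 3.10 or tagged '.el7'."""
    return ".el7" in release or kernel_major_minor(release) == (3, 10)


if __name__ == "__main__":
    # Inside a container this reports the host kernel, because containers share it.
    release = platform.release()
    print(release, "-> el7-style kernel:", looks_like_el7_kernel(release))
```

Run inside the EP container, this prints the release of the host kernel rather than anything specific to the container image.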


condor_restart.

With the condor_procd not running, each HTCondor daemon will have to track and manage process families on its own.

I guess it's not optimal for the scalability of the system, but at least, you will have a running EP for the time being.

Cheers,
Benoit

[1] https://htcondor.readthedocs.io/en/latest/man-pages/condor_procd.html#dealing-with-short-reads
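To double-check that the workaround is actually in a configuration file, something like the following Python sketch could be used. It assumes a simplified view of the configuration syntax (plain `NAME = value` lines, `#` comments, last assignment wins) and ignores includes, macros and conditionals; the helper names are hypothetical. On a real installation, `condor_config_val USE_PROCD` is the more reliable way to see the effective value.

```python
import re
from typing import Optional

# Simplified assumption: one "NAME = value" assignment per line.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_.]*)\s*=\s*(.*?)\s*$")


def last_value(config_text: str, name: str) -> Optional[str]:
    """Return the value of the last assignment to `name`, or None if it is never set."""
    value = None
    for raw in config_text.splitlines():
        if raw.lstrip().startswith("#"):
            continue  # skip comment lines
        match = ASSIGNMENT.match(raw)
        # Configuration variable names are compared case-insensitively.
        if match and match.group(1).lower() == name.lower():
            value = match.group(2)
    return value


def procd_disabled(config_text: str) -> bool:
    """True if the last USE_PROCD assignment in the text is 'false' (any letter case)."""
    value = last_value(config_text, "USE_PROCD")
    return value is not None and value.lower() == "false"
```

For example, `procd_disabled("USE_PROCD = False\n")` evaluates to `True`, while a commented-out line is ignored.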

On 07/06/2024 13:03, Dirk Sammel wrote:
Dear experts,

We're currently setting up an HTCondor environment, but the EPs crash after some time. The error in the MasterLog is:


04/25/24 16:53:35 Daemons::StartAllDaemons all daemons were started
04/25/24 16:53:37 Setting ready state 'Ready' for STARTD
04/25/24 17:53:35 Preen pid is 9390
04/25/24 17:53:36 Preen (pid 9390) exited with status 0
04/26/24 03:28:05 procd (pid = 30728) exited unexpectedly with status 256
04/26/24 03:28:07 attempting to restart the Procd
04/26/24 03:28:09 start_procd: error received from procd: error: getProcInfo failed on own PID

04/26/24 03:28:09 restarting the Procd failed
04/26/24 03:28:09 attempting to restart the Procd
04/26/24 03:28:09 start_procd: error received from procd: error: getProcInfo failed on own PID

04/26/24 03:28:09 restarting the Procd failed
04/26/24 03:28:09 attempting to restart the Procd
04/26/24 03:28:09 start_procd: error received from procd: error: getProcInfo failed on own PID

04/26/24 03:28:09 restarting the Procd failed
04/26/24 03:28:09 attempting to restart the Procd
04/26/24 03:28:09 start_procd: error received from procd: error: getProcInfo failed on own PID

04/26/24 03:28:09 restarting the Procd failed
04/26/24 03:28:09 attempting to restart the Procd
04/26/24 03:28:09 start_procd: error received from procd: error: getProcInfo failed on own PID

04/26/24 03:28:09 restarting the Procd failed
04/26/24 03:28:09 ERROR "unable to restart the ProcD after several tries" at line 713 in file /var/lib/condor/execute/slot1/dir_486586/userdir/build-rN7NZr/BUILD/condor-23.6.1/src/condor_utils/proc_family_proxy.cpp
Caught signal 11: si_code=1, si_pid=8, si_uid=0, si_addr=0x8
Stack dump for process 30726 at timestamp 1714094889 (20 frames)
/lib64/libcondor_utils_23_6_1.so(_Z18dprintf_dump_stackv+0x28)[0x7f74bc6e5fb8]
/lib64/libcondor_utils_23_6_1.so(_Z17unix_sig_coredumpiP9siginfo_tPv+0x6c)[0x7f74bc8de1bc]
/lib64/libpthread.so.0(+0x12cf0)[0x7f74ba945cf0]
/lib64/libcondor_utils_23_6_1.so(_ZN16ProcFamilyClient13signal_familyEi21proc_family_command_tRb+0x26)[0x7f74bc9051f6]
/lib64/libcondor_utils_23_6_1.so(_ZN15ProcFamilyProxy11kill_familyEi+0x4a)[0x7f74bc77c21a]
/lib64/libcondor_utils_23_6_1.so(_ZN10DaemonCore11Kill_FamilyEi+0x1a)[0x7f74bc8c40da]
condor_master(_ZNK6daemon10KillFamilyEv+0x23)[0x5559a3717e53]
condor_master(_ZN6daemon8HardKillEi+0x59)[0x5559a3717ed9]
condor_master(_ZN7Daemons18HardKillAllDaemonsEv+0x8b)[0x5559a3719e7b]
condor_master(DoCleanup+0x46)[0x5559a37124c6]
/lib64/libcondor_utils_23_6_1.so(_Z8_EXCEPT_PKcz+0x13a)[0x7f74bc5f28da]
/lib64/libcondor_utils_23_6_1.so(_ZN15ProcFamilyProxy24recover_from_procd_errorEv+0x18b)[0x7f74bc77c06b]
/lib64/libcondor_utils_23_6_1.so(_ZN15ProcFamilyProxy12procd_reaperEii+0x70)[0x7f74bc77c3c0]
/lib64/libcondor_utils_23_6_1.so(_ZN10DaemonCore10CallReaperEiPKcii+0x1c5)[0x7f74bc8c4415]
/lib64/libcondor_utils_23_6_1.so(_ZN10DaemonCore17HandleProcessExitEii+0x2d6)[0x7f74bc8db4d6]
/lib64/libcondor_utils_23_6_1.so(_ZN10DaemonCore24HandleDC_SERVICEWAITPIDSEi+0x5a)[0x7f74bc8db73a]
/lib64/libcondor_utils_23_6_1.so(_ZN10DaemonCore6DriverEv+0x8c1)[0x7f74bc8ca111]
/lib64/libcondor_utils_23_6_1.so(_Z7dc_mainiPPc+0x1787)[0x7f74bc8ec4b7]
/lib64/libc.so.6(__libc_start_main+0xe5)[0x7f74ba5a8d85]
condor_master(_start+0x2e)[0x5559a370f8ae]





The relevant part in ProcLog:


04/26/24 03:28:05 : taking a snapshot...
04/26/24 03:28:05 : ProcAPI: read 0 pid entries out of 0 total entries in /proc
04/26/24 03:28:05 : ProcAPI: detected invalid read of /proc.
04/26/24 03:28:05 : ProcAPI: previous PID list: 1 2 4 6 7 8 9 10 11 12 13 14 16 18 19 20 22 23 24 25 27 28 29 30 32 33 34 35 37 38 39 40 42 43 44 45 47 48 49 50 52 53 54 55 57 58 59 60 62 64 65 66 68 69 70 71 73 74 75 76 78 79 80 81 83 84 85 86 87 88 89 90 91 93 94 95 96 98 99 100 101 103 104 105 106 108 109 110 111 113 114 115 116 118 119 120 121 123 124 125 126 128 129 130 131\
 133 134 135 136 138 139 140 141 143 144 145 146 148 149 150 151 153 154 155 156 158 159 160 161 163 164 165 166 168 169 170 171 173 174 175 176 178 179 180 181 183 184 185 186 188 189 190 191 193 194 195 196 198 199 200 201 203 204 205 206 208 211 212 213 214 215 216 217 218 219 220 221 222 223 230 231 232 233 234 242 244 249 250 254 255 256 269 297 298 300 308 310 312 317 321 \
323 369 372 453 509 510 520 535 536 537 538 539 540 541 542 546 547 548 549 550 551 552 553 554 555 556 557 559 564 611 612 692 701 714 723 823 824 833 885 886 887 890 891 892 897 898 907 908 909 910 911 912 915 919 923 924 925 926 927 928 934 935 944 945 946 947 948 949 950 951 952 953 954 963 964 965 966 967 968 969 970 1010 1017 1018 1046 1120 1174 1180 1181 1281 1296 1297 12\
98 1299 1304 1306 1412 1414 1415 1416 1417 1418 1419 1420 1423 1433 1445 1450 1454 1472 1488 1489 1501 1516 1543 1545 1546 1548 1549 1556 1557 1576 1583 1585 1607 1612 1707 1714 1722 1727 1739 1740 1755 1756 1781 1782 1791 1792 1793 1794 1795 1796 1798 1826 1831 1866 1897 1918 1942 1988 2008 2082 2095 2101 2102 2153 2215 2222 2223 2238 2241 2278 2308 2330 2356 2359 2360 2361 236\
2 2363 2390 2423 2427 2464 2479 2480 2483 2494 2500 2506 2508 2518 2527 2551 2554 2562 2565 2574 2621 2623 2624 2625 2678 2681 2696 2697 2777 2779 2787 2788 2799 2800 2822 2823 2824 2825 2826 2828 2829 2830 2893 2920 2973 3025 3113 3130 3154 3155 3241 3243 3244 3384 3418 3731 3776 3886 4128 4291 4382 4497 4618 4771 4863 4950 5157 5400 5410 5459 5595 6014 6198 6214 6218 6598 6781\
 6888 7044 7056 7057 7058 7059 7060 7063 7091 7092 7199 7435 7620 7665 7691 7727 7834 8007 8261 8275 8290 8378 8436 8437 8518 8522 8553 8678 8679 8680 8839 8873 9033 9550 9632 9657 9787 9841 10509 10897 10898 10968 11183 11196 11197 11244 11305 11378 11439 11573 11574 11575 11954 11959 11965 12243 12528 12557 12572 12671 12859 12860 13067 13195 13230 13439 13696 13697 13723 1409\
7 14098 14099 14100 14210 14840 14983 14996 15000 15691 15850 15985 16669 16703 16878 16937 17020 17039 17185 17207 17358 17592 17621 17746 17758 18180 18194 18513 18625 18649 18663 18693 18705 18708 18728 18735 18739 18786 18801 18829 18833 18836 18853 19104 19105 19202 19220 19235 19250 19255 19456 19477 19478 19748 19815 19837 19854 19887 19901 19902 19939 20192 20550 20552 2\
0554 20581 20812 20851 20887 20888 21079 21121 21147 21152 21200 21201 21202 21248 21269 21326 21357 21411 21417 21437 21443 21490 21509 21517 21556 21565 21966 21972 22124 22145 22150 22470 22583 22774 22820 22852 23258 23395 23399 23736 23739 23742 24007 24046 24119 24138 24242 24273 24418 24453 24530 24531 24535 24579 25123 25486 25559 25589 25678 25760 25792 25900 25911 2591\
2 25973 25987 26011 26012 26274 26283 26336 26380 26902 27147 27148 27352 27485 27545 27573 27612 27648 27657 27676 27749 27872 27876 27896 27913 27952 27992 27997 28001 28006 28063 28067 28085 28086 28475 28791 28824 28969 28970 28987 28988 29592 30047 30208 30266 30268 30376 30510 30726 30728 30729 30738 30940 30999 31048 31104 31229 31365 31475 31721 31796 31918 32206 32253 3\
2254 32339 32344 32408 32872 32894 32895 32908 32917 32930 32966 32979 32996 33031 33079 33120 33171 33347 33371 33437 33720 33745 33746 33747 33911 33973 34374 34543 34721 34762 34978 35240 35500 36474 36673 36707 36722 36764 37007 37065 37110 37363 37368 37414 37415 37437 37466 37480 38316 38332 38339 38688 39184 39301 39493 39495 39525 39526 39530 39544 39574 39575 39577 3959\
9 39845 39977 40073 40076 40280 40319 40343 40376 40385 40656 40792 40805 40864 40902 40935
04/26/24 03:28:05 : ProcAPI: new PID list: 1 2 4 6 7 8 9 10 11 12 13 14 16 18 19 20 22 23 24 25 27 28 29 30 32 33 34 35 37 38 39 40 42 43 44 45 47 48 49 50 52 53 54 55 57 58 59 60 62 64 65 66 68 69 70 71 73 74 75 76 78 79 80 81 83 84 85 86 87 88 89 90 91 93 94 95 96 98 99 100 101 103 104 105 106 108 109 110 111 113 114 115 116 118 119 120 121 123 124 125 126 128 129 130 131 133 \
134 135 136 138 139 140 141 143 144 145 146 148 149 150 151 153 154 155 156 158 159 160 161 163 164 165 166 168 169 170 171 173 174 175 176 178 179 180 181 183 184 185 186 188 189 190 191 193 194 195 196 198 199 200 201 203 204 205 206 208 211 212 213 214 215 216 217 218 219 220 221 222 223 230 231 232 233 234 242 244 249 250 254 255 256 269 297 298 300 308 310 312 317 321 323 3\
69 372 453 509 510 520 535 536 537 538 539 540 541 542 546 547 548 549 550 551 552 553 554 555 556 557 559 564 611 612 692 701 714 723 823 824 833 885 886 887 890 891 892 897 898 907 908 909 910 911 912 915 919 923 924 925 926 927 928 934 935 944 945 946 947 948 949 950 951 952 953 954 963 964 965 966 967 968 969 970 1010 1017 1018 1046 1120 1174 1180 1181 1281 1296 1297 1298 12\
99 1304 1306 1412 1414 1415 1416 1417 1418 1419 1420 1423 1433 1445 1450 1454 1472 1488 1489 1501 1516 1543 1545 1546 1548 1549 1556 1557 1576 1583 1585 1607 1612 1707 1714 1722 1727 1739 1740 1755 1756 1781 1782 1791 1792 1793 1794 1795 1796 1798 1826 1831 1866 1897 1918 1942 1988 2008 2082 2095 2101 2102 2153 2215 2222 2223 2238 2241 2278 2308 2330 2356 2359 2360 2361 2362 236\
3 2390 2423 2427 2464 2479 2480 2483 2494 2500 2506 2508 2518 2527 2551 2554 2562 2565 2574 2621 2623 2624 2625 2678 2681 2696 2697 2777 2779 2787 2788 2799 2800 2822 2823 2824 2825 2826 2828 2829 2830 2893 2920 2973 3025 3113 3130 3154 3155 3241 3243 3244 3384 3418 3731 3776 3886 4128 4291 4382 4497 4618 4771 4863 4950 5157 5400 5410 5459 5595 6014 6198 6214 6218 6598 6781 6888\
 7044 7056 7057 7058 7059 7060 7063 7091 7092 7199 7435 7620 7665 7691 7727 7834 8007 8261 8275 8290 8378 8436 8437 8518 8522 8553 8678 8679 8680 8839 8873 9033 9550 9632 9657 9787 9841 10509 10897 10898 10968 11183 11196 11197 11244 11305 11378 11439 11573 11574 11575 11954 11959 11965 12243 12528 12557 12572 12671 12859 12860 13067 13195 13230 13439 13696 13697 13723 14097 140\
98 14099 14100 14210 14840 14983 14996 15000 15691 15850 15985 16669 16703 16878 16937 17020 17039 17185 17207 17358 17592 17621 17746 17758 18180 18194 18513 18625 18649 18663 18693 18705 18708 18728 18735 18739 18786 18801 18829 18833 18836 18853 19104 19105 19202 19220 19235 19250 19255 19456 19477 19478 19748 19815 19837 19854 19887 19901 19902 19939 20192 20550 20552 20554 \
20581 20812 20851 20887 20888 21079 21121 21147 21152 21200 21201 21202 21248 21269 21326 21357 21411 21417 21437 21443 21490 21509 21517 21556 21565 21966 21972 22124 22145 22150 22470 22583 22774 22820 22852 23258 23395 23399 23736 23739 23742 24007 24046 24119 24138 24242 24273 24418 24453 24530 24531 24535 24579 25123 25486 25559 25589 25678 25760 25792 25900 25911 25912 259\
73 25987 26011 26012 26274 26283 26336 26380 26902 27147 27148 27352 27485 27545 27573 27612 27648 27657 27676 27749 27872 27876 27896 27913 27952 27992 27997 28001 28006 28063 28067 28085 28086 28475 28791 28824 28969 28970 28987 28988 29592 30047 30208 30266 30268 30376 30510 30726 30728 30729 30738 30940 30999 31048 31104 31229 31365 31475 31721 31796 31918 32206 32253 32254 \
32339 32344 32408 32872 32894 32895 32908 32917 32930 32966 32979 32996 33031 33079 33120 33171 33347 33371 33437 33720 33745 33746 33747 33911 33973 34374 34543 34721 34762 34978 35240 35500 36474 36673 36707 36722 36764 37007 37065 37110 37363 37368 37414 37415 37437 37466 37480 38316 38332 38339 38688 39184 39301 39493 39495 39525 39526 39530 39544 39574 39575 39577 39599 398\
45 39977 40073 40076 40280 40319 40343 40376 40385 40656 40792 40805 40864 40902 40935
04/26/24 03:28:05 : ProcAPI: retrying.
04/26/24 03:28:05 : ProcAPI: read 0 pid entries out of 0 total entries in /proc
04/26/24 03:28:05 : ProcAPI: detected invalid read of /proc.
04/26/24 03:28:05 : ProcAPI: previous PID list: 1 2 4 6 7 8 9 10 11 12 13 14 16 18 19 20 22 23 24 25 27 28 29 30 32 33 34 35 37 38 39 40 42 43 44 45 47 48 49 50 52 53 54 55 57 58 59 60 62 64 65 66 68 69 70 71 73 74 75 76 78 79 80 81 83 84 85 86 87 88 89 90 91 93 94 95 96 98 99 100 101 103 104 105 106 108 109 110 111 113 114 115 116 118 119 120 121 123 124 125 126 128 129 130 131\
 133 134 135 136 138 139 140 141 143 144 145 146 148 149 150 151 153 154 155 156 158 159 160 161 163 164 165 166 168 169 170 171 173 174 175 176 178 179 180 181 183 184 185 186 188 189 190 191 193 194 195 196 198 199 200 201 203 204 205 206 208 211 212 213 214 215 216 217 218 219 220 221 222 223 230 231 232 233 234 242 244 249 250 254 255 256 269 297 298 300 308 310 312 317 321 \
323 369 372 453 509 510 520 535 536 537 538 539 540 541 542 546 547 548 549 550 551 552 553 554 555 556 557 559 564 611 612 692 701 714 723 823 824 833 885 886 887 890 891 892 897 898 907 908 909 910 911 912 915 919 923 924 925 926 927 928 934 935 944 945 946 947 948 949 950 951 952 953 954 963 964 965 966 967 968 969 970 1010 1017 1018 1046 1120 1174 1180 1181 1281 1296 1297 12\
98 1299 1304 1306 1412 1414 1415 1416 1417 1418 1419 1420 1423 1433 1445 1450 1454 1472 1488 1489 1501 1516 1543 1545 1546 1548 1549 1556 1557 1576 1583 1585 1607 1612 1707 1714 1722 1727 1739 1740 1755 1756 1781 1782 1791 1792 1793 1794 1795 1796 1798 1826 1831 1866 1897 1918 1942 1988 2008 2082 2095 2101 2102 2153 2215 2222 2223 2238 2241 2278 2308 2330 2356 2359 2360 2361 236\
2 2363 2390 2423 2427 2464 2479 2480 2483 2494 2500 2506 2508 2518 2527 2551 2554 2562 2565 2574 2621 2623 2624 2625 2678 2681 2696 2697 2777 2779 2787 2788 2799 2800 2822 2823 2824 2825 2826 2828 2829 2830 2893 2920 2973 3025 3113 3130 3154 3155 3241 3243 3244 3384 3418 3731 3776 3886 4128 4291 4382 4497 4618 4771 4863 4950 5157 5400 5410 5459 5595 6014 6198 6214 6218 6598 6781\
 6888 7044 7056 7057 7058 7059 7060 7063 7091 7092 7199 7435 7620 7665 7691 7727 7834 8007 8261 8275 8290 8378 8436 8437 8518 8522 8553 8678 8679 8680 8839 8873 9033 9550 9632 9657 9787 9841 10509 10897 10898 10968 11183 11196 11197 11244 11305 11378 11439 11573 11574 11575 11954 11959 11965 12243 12528 12557 12572 12671 12859 12860 13067 13195 13230 13439 13696 13697 13723 1409\
7 14098 14099 14100 14210 14840 14983 14996 15000 15691 15850 15985 16669 16703 16878 16937 17020 17039 17185 17207 17358 17592 17621 17746 17758 18180 18194 18513 18625 18649 18663 18693 18705 18708 18728 18735 18739 18786 18801 18829 18833 18836 18853 19104 19105 19202 19220 19235 19250 19255 19456 19477 19478 19748 19815 19837 19854 19887 19901 19902 19939 20192 20550 20552 2\
0554 20581 20812 20851 20887 20888 21079 21121 21147 21152 21200 21201 21202 21248 21269 21326 21357 21411 21417 21437 21443 21490 21509 21517 21556 21565 21966 21972 22124 22145 22150 22470 22583 22774 22820 22852 23258 23395 23399 23736 23739 23742 24007 24046 24119 24138 24242 24273 24418 24453 24530 24531 24535 24579 25123 25486 25559 25589 25678 25760 25792 25900 25911 2591\
2 25973 25987 26011 26012 26274 26283 26336 26380 26902 27147 27148 27352 27485 27545 27573 27612 27648 27657 27676 27749 27872 27876 27896 27913 27952 27992 27997 28001 28006 28063 28067 28085 28086 28475 28791 28824 28969 28970 28987 28988 29592 30047 30208 30266 30268 30376 30510 30726 30728 30729 30738 30940 30999 31048 31104 31229 31365 31475 31721 31796 31918 32206 32253 3\
2254 32339 32344 32408 32872 32894 32895 32908 32917 32930 32966 32979 32996 33031 33079 33120 33171 33347 33371 33437 33720 33745 33746 33747 33911 33973 34374 34543 34721 34762 34978 35240 35500 36474 36673 36707 36722 36764 37007 37065 37110 37363 37368 37414 37415 37437 37466 37480 38316 38332 38339 38688 39184 39301 39493 39495 39525 39526 39530 39544 39574 39575 39577 3959\
9 39845 39977 40073 40076 40280 40319 40343 40376 40385 40656 40792 40805 40864 40902 40935
04/26/24 03:28:05 : ProcAPI: new PID list: 1 2 4 6 7 8 9 10 11 12 13 14 16 18 19 20 22 23 24 25 27 28 29 30 32 33 34 35 37 38 39 40 42 43 44 45 47 48 49 50 52 53 54 55 57 58 59 60 62 64 65 66 68 69 70 71 73 74 75 76 78 79 80 81 83 84 85 86 87 88 89 90 91 93 94 95 96 98 99 100 101 103 104 105 106 108 109 110 111 113 114 115 116 118 119 120 121 123 124 125 126 128 129 130 131 133 \
134 135 136 138 139 140 141 143 144 145 146 148 149 150 151 153 154 155 156 158 159 160 161 163 164 165 166 168 169 170 171 173 174 175 176 178 179 180 181 183 184 185 186 188 189 190 191 193 194 195 196 198 199 200 201 203 204 205 206 208 211 212 213 214 215 216 217 218 219 220 221 222 223 230 231 232 233 234 242 244 249 250 254 255 256 269 297 298 300 308 310 312 317 321 323 3\
69 372 453 509 510 520 535 536 537 538 539 540 541 542 546 547 548 549 550 551 552 553 554 555 556 557 559 564 611 612 692 701 714 723 823 824 833 885 886 887 890 891 892 897 898 907 908 909 910 911 912 915 919 923 924 925 926 927 928 934 935 944 945 946 947 948 949 950 951 952 953 954 963 964 965 966 967 968 969 970 1010 1017 1018 1046 1120 1174 1180 1181 1281 1296 1297 1298 12\
99 1304 1306 1412 1414 1415 1416 1417 1418 1419 1420 1423 1433 1445 1450 1454 1472 1488 1489 1501 1516 1543 1545 1546 1548 1549 1556 1557 1576 1583 1585 1607 1612 1707 1714 1722 1727 1739 1740 1755 1756 1781 1782 1791 1792 1793 1794 1795 1796 1798 1826 1831 1866 1897 1918 1942 1988 2008 2082 2095 2101 2102 2153 2215 2222 2223 2238 2241 2278 2308 2330 2356 2359 2360 2361 2362 236\
3 2390 2423 2427 2464 2479 2480 2483 2494 2500 2506 2508 2518 2527 2551 2554 2562 2565 2574 2621 2623 2624 2625 2678 2681 2696 2697 2777 2779 2787 2788 2799 2800 2822 2823 2824 2825 2826 2828 2829 2830 2893 2920 2973 3025 3113 3130 3154 3155 3241 3243 3244 3384 3418 3731 3776 3886 4128 4291 4382 4497 4618 4771 4863 4950 5157 5400 5410 5459 5595 6014 6198 6214 6218 6598 6781 6888\
 7044 7056 7057 7058 7059 7060 7063 7091 7092 7199 7435 7620 7665 7691 7727 7834 8007 8261 8275 8290 8378 8436 8437 8518 8522 8553 8678 8679 8680 8839 8873 9033 9550 9632 9657 9787 9841 10509 10897 10898 10968 11183 11196 11197 11244 11305 11378 11439 11573 11574 11575 11954 11959 11965 12243 12528 12557 12572 12671 12859 12860 13067 13195 13230 13439 13696 13697 13723 14097 140\
98 14099 14100 14210 14840 14983 14996 15000 15691 15850 15985 16669 16703 16878 16937 17020 17039 17185 17207 17358 17592 17621 17746 17758 18180 18194 18513 18625 18649 18663 18693 18705 18708 18728 18735 18739 18786 18801 18829 18833 18836 18853 19104 19105 19202 19220 19235 19250 19255 19456 19477 19478 19748 19815 19837 19854 19887 19901 19902 19939 20192 20550 20552 20554 \
20581 20812 20851 20887 20888 21079 21121 21147 21152 21200 21201 21202 21248 21269 21326 21357 21411 21417 21437 21443 21490 21509 21517 21556 21565 21966 21972 22124 22145 22150 22470 22583 22774 22820 22852 23258 23395 23399 23736 23739 23742 24007 24046 24119 24138 24242 24273 24418 24453 24530 24531 24535 24579 25123 25486 25559 25589 25678 25760 25792 25900 25911 25912 259\
73 25987 26011 26012 26274 26283 26336 26380 26902 27147 27148 27352 27485 27545 27573 27612 27648 27657 27676 27749 27872 27876 27896 27913 27952 27992 27997 28001 28006 28063 28067 28085 28086 28475 28791 28824 28969 28970 28987 28988 29592 30047 30208 30266 30268 30376 30510 30726 30728 30729 30738 30940 30999 31048 31104 31229 31365 31475 31721 31796 31918 32206 32253 32254 \
32339 32344 32408 32872 32894 32895 32908 32917 32930 32966 32979 32996 33031 33079 33120 33171 33347 33371 33437 33720 33745 33746 33747 33911 33973 34374 34543 34721 34762 34978 35240 35500 36474 36673 36707 36722 36764 37007 37065 37110 37363 37368 37414 37415 37437 37466 37480 38316 38332 38339 38688 39184 39301 39493 39495 39525 39526 39530 39544 39574 39575 39577 39599 398\
45 39977 40073 40076 40280 40319 40343 40376 40385 40656 40792 40805 40864 40902 40935
04/26/24 03:28:05 : ProcAPI: giving up, retaining previous PID list.
04/26/24 03:28:05 : ProcAPI::getProcInfo() pid 1 does not exist.
04/26/24 03:28:05 : ProcAPI::getProcInfo() pid 2 does not exist.
04/26/24 03:28:05 : ProcAPI::getProcInfo() pid 4 does not exist.
04/26/24 03:28:05 : ProcAPI::getProcInfo() pid 6 does not exist.
04/26/24 03:28:05 : ProcAPI::getProcInfo() pid 7 does not exist.

and so on, and so on.
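The short read shows up in the ProcLog as a "detected invalid read of /proc" line. To see how often it happens on a given EP, a log could be scanned with a small Python sketch like the one below. It assumes the timestamp and message layout shown in the excerpt above; the function name is hypothetical, and the location of the ProcLog depends on the installation.

```python
import re
from typing import Iterable, List

# Matches lines such as:
# "04/26/24 03:28:05 : ProcAPI: detected invalid read of /proc."
SHORT_READ = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) : ProcAPI: detected invalid read of /proc\."
)


def find_short_reads(lines: Iterable[str]) -> List[str]:
    """Return the timestamp of every 'invalid read of /proc' line, in log order."""
    hits = []
    for line in lines:
        match = SHORT_READ.match(line)
        if match:
            hits.append(match.group(1))
    return hits


if __name__ == "__main__":
    sample = [
        "04/26/24 03:28:05 : taking a snapshot...",
        "04/26/24 03:28:05 : ProcAPI: read 0 pid entries out of 0 total entries in /proc",
        "04/26/24 03:28:05 : ProcAPI: detected invalid read of /proc.",
        "04/26/24 03:28:05 : ProcAPI: retrying.",
    ]
    print(find_short_reads(sample))
```

An open file handle can be passed in place of the sample list, since the function only iterates over lines.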

The EPs are running inside a Rocky Linux 8 container, and we're using version 23.6.1, but we also had the problem with version 23.4.0.
The underlying system runs on CentOS 7.9.2009. We use the same container on another system that runs on RHEL 8.8, and there we don't observe this problem.

Any idea what could be the problem here?

I can provide you with more information about the two systems, and with further logs if you need any.

Cheers
Dirk

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

