Mailing List Archives
Re: [HTCondor-users] Bug STARTER at <ip> failed to send file(s) to <<ip>:9618>; remaps resulted in a cycle:
- Date: Fri, 17 Apr 2020 18:37:52 +0000
- From: Zach Miller <zmiller@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Bug STARTER at <ip> failed to send file(s) to <<ip>:9618>; remaps resulted in a cycle:
Hello Vikrant,
Just wanted to let you know we increased the default from 20 to 128 for now. This change will appear in 8.8.9 and 8.9.7.
https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=7581
Thanks for reporting the issue!
Cheers,
-zach
On 4/6/20, 10:07 AM, "HTCondor-users on behalf of ervikrant06@xxxxxxxxx" <htcondor-users-bounces@xxxxxxxxxxx on behalf of ervikrant06@xxxxxxxxx> wrote:
We encountered this issue during normal operation of the cluster and found this limit while tracking it down.
Thanks for your help.
Thanks & Regards,
Vikrant Aggarwal
On Mon, Apr 6, 2020 at 7:08 PM Zach Miller <zmiller@xxxxxxxxxxx> wrote:
Hello,
No, this won't affect the scalability or performance of a pool.
Just curious, did you run into this because you actually had a 20-deep directory hierarchy? Or just testing the limits of HTCondor?
Anyhow, thanks for the report. Given that the performance impact is minimal, perhaps we should bump the default ourselves.
Cheers,
-zach
On 4/6/20, 2:31 AM, "HTCondor-users on behalf of ervikrant06@xxxxxxxxx" <htcondor-users-bounces@xxxxxxxxxxx on behalf of ervikrant06@xxxxxxxxx> wrote:
Hello Zach,
Thanks for the quick response.
Are there any known implications of bumping this value on a cluster consisting of 500 nodes?
Thanks & Regards,
Vikrant Aggarwal
On Mon, Apr 6, 2020 at 11:08 AM Zach Miller <zmiller@xxxxxxxxxxx> wrote:
Hello Vikrant,
This is a circuit breaker in the code to prevent infinite recursion when doing filename remaps during transfer.
The default (as you found) is that 20 directories deep means something might be going wrong. Luckily, there is a config setting you can change that determines how deep HTCondor will go before it considers it a problem. In your configuration, set:
# Or any number you like if you need to go deeper
MAX_REMAP_RECURSIONS = 100
Let me know if that works for you.
Cheers,
-zach
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of "ervikrant06@xxxxxxxxx" <ervikrant06@xxxxxxxxx>
Reply-To: HTCondor-Users List <htcondor-users@xxxxxxxxxxx>
Date: Monday, April 6, 2020 at 12:15 AM
To: HTCondor-Users List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Bug STARTER at <ip> failed to send file(s) to <<ip>:9618>; remaps resulted in a cycle:
Hello Experts,
Jobs went into held status with the message shown in the subject line. This happens when the path given for the output/log attribute has more than 19 nested directories.
Not working: "/tmp/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/out.txt"
Working: "/tmp/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/out.txt"
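The two paths above can be compared by counting their directory components. The following is only a rough sketch in Python, not HTCondor's actual remap logic: the helper names are hypothetical, the limit of 20 is the old default mentioned in this thread, and the `>=` comparison is an assumption that merely agrees with these two examples (the thread does not show what happens at exactly 20 directories).

```python
from pathlib import PurePosixPath

# Old default discussed in this thread; the exact comparison HTCondor
# applies internally is an assumption here, not taken from its source.
OLD_DEFAULT_LIMIT = 20


def directory_depth(path: str) -> int:
    """Count the directory components above the file name in a POSIX path."""
    parts = PurePosixPath(path).parts  # e.g. ('/', 'tmp', 'a', 'out.txt')
    # Drop the file name (last part) and the root marker '/'.
    return len([p for p in parts[:-1] if p != "/"])


def may_hit_remap_limit(path: str, limit: int = OLD_DEFAULT_LIMIT) -> bool:
    """Hypothetical check: flag paths whose depth reaches the given limit."""
    return directory_depth(path) >= limit


failing = "/tmp/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/out.txt"
working = "/tmp/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/out.txt"

for p in (failing, working):
    print(directory_depth(p), may_hit_remap_limit(p))
```

Under these assumptions the failing path has 21 directory components and the working one has 19. Passing a larger `limit` (for example 100 or 128) to `may_hit_remap_limit` mirrors raising the configured value.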
Thanks & Regards,
Vikrant Aggarwal
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to
htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/