Re: [HTCondor-users] EncryptExecuteDirectory issues on Windows execute nodes without run_as_owner
- Date: Wed, 15 Sep 2021 01:13:52 -0500 (CDT)
- From: Todd L Miller <tlmiller@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] EncryptExecuteDirectory issues on Windows execute nodes without run_as_owner
One kludge around is to use the "cipher" command to decrypt the file
before uploading it, e.g.
You could also potentially use HTCondor's file-transfer mechanism,
although it will end up being a little less efficient in this case: if the
submit node can mount \\fileserver, your jobs could terminate after
creating outputfile.dat but specify
transfer_output_files = outputfile.dat
transfer_output_remaps = outputfile.dat=\\fileserver\user\output
HTCondor will read outputfile.dat as the condor-slot user and transfer it
to a daemon running on the submit node as the owner of the job, which
(should) allow that daemon to write to \\fileserver\user\output.
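For context, here is a rough sketch of how those two lines might sit in a
complete submit description. Only the two transfer_output_* lines come from
the text above; the executable name and the other settings are assumptions
for illustration, and the exact quoting of the remap value should be checked
against the manual for your HTCondor version.

```
# Hypothetical submit description; myjob.exe is a placeholder name.
universe                  = vanilla
executable                = myjob.exe

# Assumed settings: use HTCondor file transfer and send output back on exit.
should_transfer_files     = YES
when_to_transfer_output   = ON_EXIT

# Assumed: the job requests an encrypted execute directory, as in this thread.
encrypt_execute_directory = True

# Copied from the message above.
transfer_output_files     = outputfile.dat
transfer_output_remaps    = outputfile.dat=\\fileserver\user\output

queue
```

With this arrangement the job itself never touches \\fileserver; only the
submit-side daemon does, which is why the submit node must be able to mount
the share.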
So that's the FYI bit, and once users can run_as_owner I don't think
this shouldn't be a problem?
Indeed.
These must be related to the encrypt_execute_directory stuff because we
can re-run the jobs with NO execute directory encryption enabled and do
not get these errors.
Do you re-run all 5,000 jobs and get no failures, or just the
failed 150?
So I guess the question is does anyone have any ideas as to why these
errors are occurring? And only when encryptexecutedirectory is set to
true?
I'm a little more worried by the failure to read from the standard
error log after the job has finished than by the two errors about failing
to create the log files. Failing to write to the log after creating it is
also very strange. It makes me wonder if there's a clean-up process going
astray somewhere, possibly because of a race condition made worse by
encrypting the execute directory.
- ToddM