HTCondor Project List Archives




[Condor-devel] FW: Bug Report: Esoteric Windows email problem. (AM-47)



This is a follow-up email from the submitter of the Windows email
problem:
-Mike

-----Original Message-----
From: Finley, Danny (Mission Systems) 
Sent: Thursday, October 27, 2005 5:31 AM
To: Michael Yoder
Subject: RE: Bug Report: Esoteric Windows email problem.

Mr. Yoder,

Thank you for your prompt reply and your help investigating the problem.


We did some testing and verified that the temporary file was
successfully created in C:\WINDOWS\TMP and it was named condoremailX
where X was some integer. We also set the TMP environment variable (at
the system level, not the user level), but that did not seem to solve
the problem. 

As you mention, the information in the condoremail temp file should be
the body of the email message, and it is piped into condor_mail.exe. We
verified by running condor_mail.exe outside of Condor that it will
silently fail if a file that does not exist (or cannot be read by the
running user) is piped to it. 

Hopefully this problem will be corrected in a future release.

Thanks again,

Danny Finley
 
-----Original Message-----
From: Michael Yoder [mailto:yoderm@xxxxxxxxxx] 
Sent: Wednesday, October 26, 2005 4:26 PM
To: condor-devel@xxxxxxxxxxx
Cc: Finley, Danny (Mission Systems)
Subject: Bug Report: Esoteric Windows email problem.


OS: Windows
Condor Version: all

Problem: 
The condor_shadow silently fails to send email when its current working
directory is on a shared file system.  The user submitted a job
remotely (SCHEDD_HOST was set) and specified a shared file system for
file locations, and they didn't get email.  When condor_submit -s was
used (telling Condor to copy everything to the spool directory), then
they _would_ get email...but they didn't want to use -s.  There were no
errors of any sort in the log files, even with D_ALL.  They saw the line

Sending email via system(C:\Condor/bin/condor_mail.exe -s "[Condor]
Condor Job 1757.0" -relay ...)

in the Shadow log file no matter what.
 
Bizarre, eh?

The problem is one of permissions.  In email_open_implementation() (in
email.c), Condor creates a temporary file using

email_tmp_file = tempnam("\\tmp","condoremail");

and does this outside of the set_priv() block.  Since this is the
shadow, both this call and the fopen() of email_tmp_file are done as the
submitting user.  The MSDN page for tempnam:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt__tempnam.2c_._wtempnam.2c_.tmpnam.2c_._wtmpnam.asp

says that the file could be created in one of these directories, tried
in order:

- The value of the TMP environment variable, if it's valid
- \\tmp if it exists
- P_tmpdir (from stdio.h)
- The current working directory

I'm guessing it fell through to the last one, the current working
directory.  SO...now there's a temporary file, created by the submitting
user, sitting on a shared file system.  We then write the email message
to it, and then come to email_close().
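
To make the fall-through concrete, here is a rough sketch of the search
order listed above.  It is only a model of that list, not the C
runtime's actual code: pick_temp_dir() and is_usable() are made-up
names, and the set of "usable" directories is passed in by the caller
so the sketch runs without touching a real file system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: a directory counts as usable if it appears in
   the caller-supplied list.  A real check would test the file system. */
static int is_usable(const char *dir, const char *const usable[], size_t n)
{
    size_t i;
    if (dir == NULL)
        return 0;
    for (i = 0; i < n; i++) {
        if (strcmp(dir, usable[i]) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical model of the order quoted from the MSDN page:
   TMP, then the directory argument, then P_tmpdir, then the cwd. */
const char *pick_temp_dir(const char *tmp_env, const char *dir_arg,
                          const char *p_tmpdir, const char *cwd,
                          const char *const usable[], size_t n)
{
    assert(cwd != NULL);
    if (is_usable(tmp_env, usable, n))
        return tmp_env;
    if (is_usable(dir_arg, usable, n))
        return dir_arg;
    if (is_usable(p_tmpdir, usable, n))
        return p_tmpdir;
    return cwd;   /* nothing else qualified */
}
```

With no usable candidates the function returns the current working
directory, which is the case I suspect the shadow hit.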

In email_close, we:

priv = set_condor_priv();

and then essentially:

system ( "condor_mail.exe ... < email_tmp_file" );

The problem is that at this point we're user condor (which is Local
System on Windows), and hence we don't have permission to read the
temporary file that was created via tempnam() and fopen() as the
submitting user.  The result is a silent failure.
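
One way to make this kind of failure visible would be to check that the
body file can be opened for reading before handing it to the mail
program.  The following is only a sketch of that idea, not Condor code:
can_read_file() and send_mail_checked() are hypothetical names, and
mail_cmd is assumed to be a complete command line that already contains
the input redirect.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if the current user can open path for reading, else 0. */
int can_read_file(const char *path)
{
    FILE *fp;
    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/* Hypothetical wrapper: reports an unreadable body file instead of
   letting the mail command fail silently.  Returns -1 if the file
   cannot be read, otherwise whatever system() returns. */
int send_mail_checked(const char *mail_cmd, const char *body_path)
{
    if (!can_read_file(body_path)) {
        fprintf(stderr, "email: cannot read body file %s\n", body_path);
        return -1;
    }
    return system(mail_cmd);
}
```

The check runs as whatever user is active at the time, so it would have
to be made after the switch to condor privs to catch this case.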

To solve this, I'd recommend having condor privs when the tempnam() and
fopen() are called, and I'd recommend using $(LOG) or $(SPOOL) instead
of \\tmp.  I believe (but haven't proved) that a workaround is to set
the TMP environment variable.
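
The ordering I have in mind looks roughly like the sketch below.  It is
untested against the real tree and makes several assumptions: the
priv_state type, set_priv() and set_condor_priv() here are stand-ins
that only record which identity is active (the real calls change the
security context), open_email_tmp() and its seq parameter are made-up,
and the path is built with snprintf() rather than tempnam() so the
sketch stays portable.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins so the sketch compiles alone; not Condor's definitions. */
typedef enum { PRIV_USER, PRIV_CONDOR } priv_state;
priv_state current_priv = PRIV_USER;

priv_state set_priv(priv_state p)
{
    priv_state old = current_priv;
    current_priv = p;
    return old;
}

priv_state set_condor_priv(void)
{
    return set_priv(PRIV_CONDOR);
}

/* Hypothetical helper: create the email temp file under log_dir while
   holding condor privs, then restore the caller's privs.  created_as
   records which identity was active when the file was opened. */
FILE *open_email_tmp(const char *log_dir, int seq, char *path_out,
                     size_t path_len, priv_state *created_as)
{
    priv_state saved;
    FILE *fp = NULL;
    int n;

    assert(log_dir != NULL && path_out != NULL && created_as != NULL);

    saved = set_condor_priv();              /* take condor privs first */
    n = snprintf(path_out, path_len, "%s/condoremail%d", log_dir, seq);
    if (n > 0 && (size_t)n < path_len)
        fp = fopen(path_out, "w");          /* file is created as condor */
    *created_as = current_priv;
    set_priv(saved);                        /* always restore */
    return fp;
}
```

The point is only that the file gets created by the same identity that
later reads it in email_close().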


Disclaimer: I haven't verified any of the above myself; I've just been
staring at code.  However, it's the only explanation given the symptoms.

Mike Yoder
Principal Member of Technical Staff
Ask Mike: http://docs.optena.com
Direct  : +1.408.321.9000
Fax     : +1.408.321.9030
Mobile  : +1.408.497.7597
yoderm@xxxxxxxxxx

Optena Corporation
2860 Zanker Road, Suite 201
San Jose, CA 95134
http://www.optena.com