When I said that the directory had permissions of 0644, I actually
meant 0755. Apologies if I misled anyone, but the point stands: the
jobs don't run until that spool directory is set to 1777.
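
For anyone wanting to check what their own spool directory is set to, here is a minimal Python sketch that reports the octal mode, whether "other" users have write permission, and whether the sticky bit is set. The function name describe_mode is made up for illustration, and the demo runs on a scratch directory rather than the real spool, so treat it as a sketch rather than anything Condor provides.

```python
import os
import stat
import tempfile


def describe_mode(path):
    """Return (octal mode string, world_writable, sticky) for path."""
    # S_IMODE keeps the permission bits plus setuid/setgid/sticky.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    world_writable = bool(mode & stat.S_IWOTH)  # "other" write bit
    sticky = bool(mode & stat.S_ISVTX)          # the leading 1 in 1777
    return format(mode, "04o"), world_writable, sticky


if __name__ == "__main__":
    # Demonstrate on a temporary directory, not the real spool.
    with tempfile.TemporaryDirectory() as scratch:
        os.chmod(scratch, 0o755)
        print(describe_mode(scratch))
        os.chmod(scratch, 0o1777)
        print(describe_mode(scratch))
```

With 0755 only the owner can create entries in the directory, whereas 1777 lets any user create them while the sticky bit stops users from removing each other's entries.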
m
On 07/10/2010 11:55, Mark Calleja wrote:
OK, I can get it to work as expected under v7.4.3 if I change the
permissions on Condor's spool directory on the submit host from
0644 to 1777. However, under v7.2 it worked fine with perms of
just 0644, so why do we now need these less secure settings?
m
On 06/10/2010 11:27, Mark Calleja wrote:
Hi,
Our users have come across a problem with MPI jobs running under
the parallel universe after upgrading from 7.2.5 to 7.4.3, and
though we have found a workaround (mentioned below), it would be
great if we could identify a proper fix.
The issue is that jobs using the "usual" MPI wrapper script
(e.g. mp1script) now fail with the following:
In stdout:
error 0 chirp putting identity keys back
In stderr:
Can't chirp_client_open /home/condor/spool/cluster55247.proc0.subproc0/0.key:-1
Looking in the ShadowLog, it seems that a new permissions
problem rears its head:
09/13 10:48:29 (55247.0) (30445): Request to run on slot1@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <172.24.89.94:9696> was ACCEPTED
09/13 10:48:29 (55247.0) (30445): FileTransfer::Init(): mkdir(/home/condor/spool/cluster55247.proc0.subproc0) failed, Permission denied (errno: 13)
We have found that we can get around the issue by spooling the
data on submission, i.e. via "condor_submit -spool", then
retrieving the data on completion via condor_transfer_data,
and finally removing the job from the queue manually with
condor_rm. This new behaviour is perplexing, as no
configuration changes were made to the hosts on upgrade.
Have we missed something necessary in the upgrade? From the
release notes I can't discern any such new requirement, and
having to remember to manually retrieve output and remove
completed jobs from the queue is a pain in the unmentionables.
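
The workaround described above can be sketched as a small Python helper that lists the three commands in order. The function name workaround_commands and the submit file name mpi_job.sub are hypothetical, and the cluster ID is simply the one from the log excerpt. The commands are only built here, not executed, since the second and third steps have to wait until the job has completed.

```python
def workaround_commands(submit_file, cluster_id):
    """Return the (stage, argv) pairs for the spooling workaround."""
    cluster = str(cluster_id)
    return [
        # 1. Spool the input data to the schedd at submission time.
        ("submit", ["condor_submit", "-spool", submit_file]),
        # 2. Once the job has completed, fetch the spooled output.
        ("retrieve", ["condor_transfer_data", cluster]),
        # 3. Remove the completed job from the queue by hand.
        ("remove", ["condor_rm", cluster]),
    ]


if __name__ == "__main__":
    # "mpi_job.sub" is a placeholder submit description file.
    for stage, argv in workaround_commands("mpi_job.sub", 55247):
        print(stage + ":", " ".join(argv))
```

Each argv list could be handed to subprocess.run at the appropriate point, with the cluster ID taken from the output of the submit step.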
Best regards,
Mark
--
The Cavendish Laboratory, University of Cambridge,
J J Thomson Avenue, Cambridge, CB3 0HE, UK
Tel. (+44/0) 1223 746627
http://www.escience.cam.ac.uk/~mcal00