Mr. Agarwal,
First of all, based on your situation, FILESYSTEM_DOMAIN should be set to $(FULL_HOSTNAME) (not "10.8.0.1, condor-mstr") since they don't share a filesystem. In your submit file, "should_transfer_files" should always be set to "YES" for the same reason. After you change the configuration file, restart both computers to make sure Condor has a fresh start with the new configuration settings.

That is odd. HasFileTransfer should be defined, even if it's false for some reason. What version of Condor are you running? 'condor -v' Also, recheck the StartLog for unusual errors or warnings. Do you get anything when you run 'condor_status -long | grep -i transfer'? If not, what is the complete output of 'condor_status -long'?

Best Regards,
- Garrett
condor.cs.wlu.edu

From: condor-users-bounces@xxxxxxxxxxx [condor-users-bounces@xxxxxxxxxxx] on behalf of Shiv Agarwal [shiv@xxxxxxxxxxx]
Sent: Friday, August 19, 2011 7:06 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] job stuck in idle mode - HasFileTransfer

Garrett,
Appreciate your quick reply. I tried the commands you mentioned.
condor_status -long | grep ^HasFileTransfer - did not show any results
condor_status -long | grep ^FileSystemDomain - showed "10.8.0.1, condor-mstr"
10.8.0.1 is the IP address of my master node and "condor-mstr" is its hostname.
On my execute node, FILESYSTEM_DOMAIN = 10.8.0.1, condor-mstr. I set it to both because when I run condor_config_val -v FILESYSTEM_DOMAIN on my master node it shows "condor-mstr", but on my execute node the same command shows the IP address, "10.8.0.1".
I do not have NFS set up, so I do need to transfer the files.
I don't see any errors anywhere, and what is driving me crazy is that the master does not even seem to try to transfer files. It just presumes that the execute node does not allow it, as if something was preset when the execute node first connected to the master node.
This is my submit file:
Universe = vanilla
Requirements = Arch == "INTEL" && Memory >= 32
should_transfer_files = IF_NEEDED
when_to_transfer_output = ON_EXIT
Executable = simple
Arguments = 4 10
Log = outsimple.log
Output = outsimple.$(Process).out
Error = outsimple.error
Queue
Shiv
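
For reference, here is a sketch of the submit file above with the change Garrett suggests applied: should_transfer_files is set to YES instead of IF_NEEDED, since the two machines share no filesystem. Every other line is copied unchanged from the original. The FILESYSTEM_DOMAIN change belongs in the Condor configuration file, not here, so it appears only as a comment. Whether this alone clears the idle job is not confirmed in the thread.

```
# Sketch only: the original submit file with file transfer forced on.
# Separately, in the Condor configuration file on each machine
# (as suggested in the reply above): FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
Universe = vanilla
Requirements = Arch == "INTEL" && Memory >= 32
# Changed from IF_NEEDED: always transfer, because there is no shared filesystem
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
Executable = simple
Arguments = 4 10
Log = outsimple.log
Output = outsimple.$(Process).out
Error = outsimple.error
Queue
```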
On Fri, Aug 19, 2011 at 3:54 PM, Koller, Garrett
<kollerg14@xxxxxxxxxxxx> wrote: