Hi all,

Sorry for cross-posting, but I have an urgent problem queuing a simple batch job from a Windows node to a Linux node. I posted this problem on the developer mailing list before but have not received an answer yet.

I would like to submit a batch job, such as a Linux shell script, from the Windows worker machine to a Linux machine. All job files are on a shared file system and do not need to be sent over the network. The submitted jobs must be placed in the Linux node's queue and execute there.

My setup is:
- HTCondor 8.2.2 is installed on both machines (Windows and Linux).
- The Windows machine is configured as a submit and execute node with access to a common NFS share.
- The Linux head node is configured as submit node, execute node and master, also with access to the same NFS share as the Windows machine.
- Both are in the same pool and have read and write access to the job and executable files on the NFS share.

My approach is to use the condor_submit command with the -name option from the Windows machine. I ran "condor_submit -name server condor.job -debug" on the Windows machine to queue the job on the Linux head node. The job is queued on the Linux machine but goes into the hold state there. The condor_submit debug output on the Windows machine shows the NFS path to the files and looks OK, but the SchedLog and "condor_q 109 -better -debug" on the Linux head node report that the path to the user log does not exist. It looks like this: "//server/test-job/\//server/test-job/test-jobxxx.log". I also set +PreserveRelativeExecutable = True, but it made no difference.

I can run the job directly from the Linux machine, but there must also be a way to submit and query from a Windows machine, or did I miss something? Is it possible to give HTCondor relative paths to the executable and log files on the remote machine in the job file and submit it from another machine? Or is it possible to change the IWD path in the submit file to a path on the remote machine?
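
To illustrate what the doubled path in the hold reason suggests, here is a minimal Python sketch. It is only a guess at the mechanism and is not taken from the HTCondor source: it assumes that the Linux side prefixes the Iwd to any log path it does not recognize as absolute. A path starting with backslashes is absolute under Windows rules but not under POSIX rules. The helper resolve_log is hypothetical and not an HTCondor function; the two path values are the Iwd and the log path as they appear in the hold reason further down.

```python
import ntpath
import posixpath

# Values as they appear in the hold reason of job 109.0.
iwd = r"\\SERVER\java-test-job"
user_log = r"\\SERVER\java-test-job\Test-job.109.0.log"


def resolve_log(iwd, log, pathmod):
    """Hypothetical helper (assumption, not HTCondor code):
    return log unchanged if it is absolute under the rules of pathmod,
    otherwise prefix it with iwd and a forward slash."""
    if pathmod.isabs(log):
        return log
    return iwd + "/" + log


# Windows rules: the UNC path is absolute, so it is left alone.
on_windows = resolve_log(iwd, user_log, ntpath)

# POSIX rules: the path does not start with "/", so iwd is prepended.
on_linux = resolve_log(iwd, user_log, posixpath)

print(on_windows)
print(on_linux)
```

Under this assumption, the second result has the same doubled form as the path in the hold reason, which is why I suspect the Windows-style Iwd and log path are being interpreted with Linux path rules on the head node.
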
My job file, condor.job:

Universe = Vanilla
Requirements = ( OpSys == "LINUX" ) && ( TARGET.Arch == "X86_64" ) && \
    ( TARGET.Disk >= 1 ) && \
    ( TARGET.Memory >= ifthenelse(MemoryUsage isnt undefined,MemoryUsage,1) ) && \
    ( ( TARGET.HasFileTransfer ) || ( TARGET.FileSystemDomain == "server.world.loc" ) )
Log = Test-job.$(Cluster).$(Process).log
Output = Test-job.$(Cluster).$(Process).out
Error = Test-job.$(Cluster).$(Process).error
Executable = batch_linux.sh
should_transfer_files = IF_NEEDED
when_to_transfer_output = ON_EXIT
+PreserveRelativeExecutable = True
Queue

condor_submit output with the -debug option on the Windows worker:

…
** Proc 109.0:
WindowsMajorVersion = 6
NTDomain = "WORLD"
ExitStatus = 0
NiceUser = false
LocalSysCpu = 0.0
CurrentTime = time()
CompletionDate = 0
BufferBlockSize = 32768
WindowsBuildNumber = 7601
NumRestarts = 0
MyType = "Job"
CumulativeSuspensionTime = 0
TargetType = "Machine"
RemoteSysCpu = 0.0
QDate = 1412758624
Owner = "condor"
RemoteUserCpu = 0.0
LastSuspensionTime = 0
WindowsMinorVersion = 1
LocalUserCpu = 0.0
WindowsServicePackMajorVersion = 1
WantCheckpoint = false
WindowsServicePackMinorVersion = 0
CondorPlatform = "$CondorPlatform: x86_64_Windows8 $"
WindowsProductType = 1
WhenToTransferOutput = "ON_EXIT"
NumSystemHolds = 0
RemoteWallClockTime = 0.0
NumCkpts = 0
NumJobStarts = 0
CommittedTime = 0
MaxHosts = 1
CommittedSlotTime = 0
CumulativeSlotTime = 0
CoreSize = 0
TotalSuspensions = 0
WantRemoteSyscalls = false
DiskUsage = 1
Iwd = "\\SERVER\java-test-job"
CommittedSuspensionTime = 0
ExitBySignal = false
CondorVersion = "$CondorVersion: 8.2.2 Aug 07 2014 BuildID: 265643 $"
CurrentHosts = 0
JobUniverse = 5
RequestCpus = 1
Cmd = "\\SERVER\java-test-job\batch_linux.sh"
BufferSize = 524288
MinHosts = 1
JobStatus = 1
ImageSize = 1
EnteredCurrentStatus = 1412758624
JobPrio = 0
Err = "Test-job.109.0.error"
UserLog = "\\SERVER\java-test-job\Test-job.109.0.log"
Environment = ""
JobNotification = 0
WantRemoteIO = true
Rank = 0.0
In = "/dev/null"
TransferIn = false
Out = "Test-job.109.0.out"
StreamOut = false
StreamErr = false
ShouldTransferFiles = "IF_NEEDED"
ExecutableSize = 1
TransferInputSizeMB = 0
RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,( ImageSize + 1023 ) / 1024)
RequestDisk = DiskUsage
Requirements = ( ( OpSys == "LINUX" ) && ( TARGET.Arch == "X86_64" ) && ( TARGET.Disk >= 1 ) && ( TARGET.Memory >= ifthenelse(MemoryUsage =!= undefined,MemoryUsage,1) ) && ( ( TARGET.HasFileTransfer ) || ( TARGET.FileSystemDomain == "SERVER.world.loc" ) ) ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )
FileSystemDomain = "SERVER.world.loc"
JobLeaseDuration = 1200
PeriodicHold = false
PeriodicRelease = false
PeriodicRemove = false
LeaveJobInQueue = false
Arguments = ""
PreserveRelativeExecutable = true
…

Output from condor_q -hold 109:

-- Submitter: server.world.loc : <10.149.51.58:51640> : server.world.loc
 ID      OWNER   HELD_SINCE   HOLD_REASON
 109.0   condor  10/8 10:57   Failed to initialize user log to \\SERVER\java-test-job/\\SERVER\java-test-job\Test-job.109.0.log

Thomas