On Mar 26, 2013, at 2:51 PM, Jordan Williamson <jordan.williamson@xxxxxxxxxxx> wrote:
> Ah, that makes more sense. I actually am using a custom file-transfer plugin to upload the output files to a different server than the submit machine, and thus don't need the files to be transferred to the submit machine after execution.
>
> How would I prevent Condor from trying to send the files back to the submit node?
>
> On Tue, Mar 26, 2013 at 3:39 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
> Hi Jordan,
>
> Iwd refers to a directory on the submit machine. If HTCondor is transferring your files between submit and execute nodes, what directory would you like it to use on the submit side?
>
> The file transfer is performed as the submitting user. So, if you submit as user "ubuntu", "/home/ubuntu/" is a fine place for HTCondor to return the output files to.
>
> Typically, if you run "condor_submit", Iwd is set to the directory from which you invoked condor_submit.
>
> Brian
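A quick way to test a candidate Iwd before submitting is to check, on the submit machine and as the submitting user, that it is an existing, writable directory. The sketch below uses only the Python standard library; `check_iwd` is a hypothetical helper name, not part of the HTCondor Python bindings, and passing the check does not guarantee that the transfer will succeed.

```python
import os


def check_iwd(path):
    """Return a list of problems that would stop output landing in `path`.

    Hypothetical helper, not an HTCondor API. Run it on the submit machine
    as the submitting user, since that is where, and as whom, the output
    files are written.
    """
    problems = []
    if not os.path.isabs(path):
        problems.append("not an absolute path: %r" % path)
    if not os.path.isdir(path):
        problems.append("not a directory on this machine: %r" % path)
    elif not os.access(path, os.W_OK | os.X_OK):
        # Creating a file in a directory needs write and search permission.
        problems.append("not writable by the current user: %r" % path)
    return problems


# Try a few candidates; an empty list means no problem was detected.
for candidate in ("/", os.path.expanduser("~"), os.getcwd()):
    print(candidate, check_iwd(candidate) or "looks usable")
```

Run as a non-root user such as "ubuntu", "/" would normally be reported as not writable, which is consistent with the errno 13 (Permission denied) hold reported in this thread.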
>
> On Mar 26, 2013, at 2:03 PM, Jordan Williamson <jordan.williamson@xxxxxxxxxxx> wrote:
>
>> Oh shoot, those are the classads for a job that ran fine (I temporarily set the Iwd to "/home/ubuntu", as I knew that existed).
>>
>> Classads for failing job:
>>
>> ImageSize = 1
>> LeaveJobInQueue = true
>> JobNotification = 2
>> TransferExecutable = false
>> StreamIn = false
>> AutoClusterId = 1
>> StreamErr = false
>> ShouldTransferFiles = "YES"
>> JobStatus = 1
>> LastJobStatus = 0
>> Owner = "ubuntu"
>> MyType = "Job"
>> Cmd = "/usr/bin/blender"
>> WhenToTransferOutput = "ON_EXIT"
>> GlobalJobId = "<machine-ip>#670.22#1364323301"
>> PeriodicRemove = false
>> ImageSize_RAW = 1
>> User = "ubuntu@<machine-ip>"
>> CurrentTime = time()
>> PeriodicHold = false
>> RootDir = "/"
>> Iwd = "/"
>> AutoClusterAttrs = "JobUniverse,LastCheckpointPlatform,NumCkpts,jordan,Requirements,NiceUser,ConcurrencyLimits"
>> QDate = 1364323304
>> ClusterId = 670
>> PeriodicRelease = false
>> Requirements = OpSys == "LINUX" && Arch == "INTEL"
>> StreamOut = false
>> Arguments = "-b dolphin.blend -o //render_# -F PNG -x 1 -f $(Process)"
>> TargetType = "Machine"
>> TransferInput = "<url>"
>> RemoteUserCpu = 0
>> JobPrio = 0
>> JobUniverse = 5
>> ProcId = 22
>> ServerTime = 1364324445
>>
>> Hold error:
>>
>> Error from <execute-node>: STARTER at <execute-node> failed to send file(s) to <execute-node>; SHADOW at <execute-node> failed to write to file //_condor_stdout: (errno 13) Permission denied
>>
>> Here, I tried using "/" as the Iwd. If I used something like "/etc", the error would say "failed to write to file /etc/_condor_stdout", etc.
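The path in the hold reason is the Iwd followed by "/_condor_stdout", so the failing directory and errno can be read straight out of the message. A minimal sketch of doing that in Python follows; the regular expression is an assumption inferred only from the one hold reason quoted above, and `parse_write_failure` is a hypothetical helper, not part of HTCondor. Other versions may word the message differently.

```python
import errno
import re

# Hold reason copied from the message above.
HOLD_REASON = (
    "Error from <execute-node>: STARTER at <execute-node> failed to send "
    "file(s) to <execute-node>; SHADOW at <execute-node> failed to write to "
    "file //_condor_stdout: (errno 13) Permission denied"
)

# Assumed message shape, inferred only from that single example.
WRITE_FAILURE = re.compile(
    r"failed to write to file (?P<path>\S+): "
    r"\(errno (?P<errno>\d+)\) (?P<text>.+)$"
)


def parse_write_failure(reason):
    """Return (path, errno, text), or None if the reason has another shape."""
    match = WRITE_FAILURE.search(reason)
    if match is None:
        return None
    return match.group("path"), int(match.group("errno")), match.group("text")


path, code, text = parse_write_failure(HOLD_REASON)
# errno.errorcode maps the number to its symbolic name on this platform.
print(path, code, errno.errorcode.get(code, "unknown"), text)
```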
>>
>> On Tue, Mar 26, 2013 at 2:34 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
>> Hi Jordan,
>>
>> Looks like things are running right now. What is the hold message you eventually receive?
>>
>> FWIW - it would also be interesting to see the ClassAd you give to the Schedd object for submission.
>>
>> Brian
>>
>> On Mar 26, 2013, at 1:29 PM, Jordan Williamson <jordan.williamson@xxxxxxxxxxx> wrote:
>>
>>> Classads:
>>>
>>> DiskUsage_RAW = 319
>>> Requirements = OpSys == "LINUX" && Arch == "INTEL"
>>> RemoteUserCpu = 0.0
>>> JobFinishedHookDone = 1364322130
>>> GlobalJobId = "<machine-ip>#669.23#1364321911"
>>> NumJobStarts = 1
>>> ExitCode = 0
>>> StreamIn = false
>>> ImageSize = 15000
>>> CurrentTime = time()
>>> JobStartDate = 1364322127
>>> CurrentHosts = 0
>>> JobCurrentStartDate = 1364322127
>>> TargetType = "Machine"
>>> ServerTime = 1364322453
>>> LastPublicClaimId = "<machine-ip>#1364246102#73#..."
>>> Cmd = "/usr/bin/blender"
>>> TransferExecutable = false
>>> JobUniverse = 5
>>> BytesRecvd = 74.000000
>>> RemoteWallClockTime = 3.000000
>>> JobNotification = 2
>>> Iwd = "/home/ubuntu"
>>> RemoteSysCpu = 0.0
>>> MachineAttrCpus0 = 1
>>> Owner = "ubuntu"
>>> LastJobStatus = 2
>>> MemoryUsage = ( ( ResidentSetSize + 1023 ) / 1024 )
>>> WhenToTransferOutput = "ON_EXIT"
>>> EnteredCurrentStatus = 1364322130
>>> LastJobLeaseRenewal = 1364322130
>>> PeriodicHold = false
>>> AutoClusterId = 1
>>> JobCurrentStartExecutingDate = 1364322129
>>> BytesSent = 24849.000000
>>> JobPrio = 0
>>> RootDir = "/"
>>> PeriodicRelease = false
>>> NumJobMatches = 1
>>> LastMatchTime = 1364322127
>>> PeriodicRemove = false
>>> LeaveJobInQueue = true
>>> StreamOut = false
>>> CommittedSlotTime = 3.000000
>>> DiskUsage = 325
>>> AutoClusterAttrs = "JobUniverse,LastCheckpointPlatform,NumCkpts,jordan,Requirements,NiceUser,ConcurrencyLimits"
>>> ClusterId = 669
>>> CommittedTime = 3
>>> CompletionDate = 1364322130
>>> SpooledOutputFiles = "render_0.png"
>>> StartdPrincipal = "unauthenticated@unmapped/10.194.169.234"
>>> JobCurrentStartTransferOutputDate = 1364322130
>>> TransferInput = "<url>"
>>> CumulativeSlotTime = 3.000000
>>> MyType = "Job"
>>> JobRunCount = 1
>>> LastRemoteHost = "<machine-ip>"
>>> StreamErr = false
>>> ResidentSetSize = 0
>>> ProcId = 23
>>> User = "ubuntu@<machine-ip>"
>>> ExitBySignal = false
>>> Arguments = "-b dolphin.blend -o //render_# -F PNG -x 1 -f $(Process)"
>>> ResidentSetSize_RAW = 0
>>> LastSuspensionTime = 0
>>> JobStatus = 4
>>> NumShadowStarts = 1
>>> OrigMaxHosts = 1
>>> MachineAttrSlotWeight0 = 1
>>> ImageSize_RAW = 14260
>>> ShouldTransferFiles = "YES"
>>> QDate = 1364321914
>>> TerminationPending = true
>>>
>>> On Tue, Mar 26, 2013 at 2:19 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
>>> Hi Jordan,
>>>
>>> What do the ClassAds you are submitting look like?
>>>
>>> Iwd should refer to a directory on the submit machine (or the spool directory, if you are using spooling). By default, Iwd is set to the $PWD of the submitting process.
>>>
>>> Brian
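Since, per this thread, leaving Iwd out of the ad produces errors when submitting through the Python bindings, one option is to mimic the condor_submit default in the submitting script. The sketch below is only an illustration: `with_default_iwd` is a hypothetical helper, and a plain dict stands in for the job ClassAd, so converting it to whatever object the bindings expect is left out.

```python
import os


def with_default_iwd(job_attrs, cwd=None):
    """Return a copy of job_attrs with Iwd filled in if it is missing.

    job_attrs is a plain dict standing in for the job ClassAd (an assumption
    of this sketch). An Iwd that is already set is left alone.
    """
    ad = dict(job_attrs)  # copy so the caller's dict is not modified
    ad.setdefault("Iwd", os.getcwd() if cwd is None else cwd)
    return ad


# Attribute values taken from the ClassAds quoted in this thread.
job = with_default_iwd({"Cmd": "/usr/bin/blender", "Owner": "ubuntu"})
print(job["Iwd"])
```

The script's working directory still has to be writable by the submitting user on the submit machine for output transfer to succeed.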
>>>
>>> On Mar 26, 2013, at 1:09 PM, Jordan Williamson <jordan.williamson@xxxxxxxxxxx> wrote:
>>>
>>>> I'm trying to run some jobs using the python bindings for Condor 7.9.4. They keep being held because the "Iwd" classad seems to be required, but I can't find a general "default" value for it that would work on any execute machine (that is, if I set it to some hard-coded directory, it would error out on a machine that didn't have that exact directory structure).
>>>>
>>>> Is it possible to leave this classad out and let the execute nodes take care of it? (If so, I can't find any classads that would enable this, and leaving it out altogether produces errors.) Is there a default value for Iwd that would enable this? I've tried "/", "." and the directory it's being submitted from on the submit machine, but none of those worked.
>>>> _______________________________________________
>>>> HTCondor-users mailing list
>>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>>>> subject: Unsubscribe
>>>> You can also unsubscribe by visiting
>>>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>>>
>>>> The archives can be found at:
>>>> https://lists.cs.wisc.edu/archive/htcondor-users/