Re: [HTCondor-users] dag job hung on bogus windows OpSys requirement
- Date: Fri, 20 Jun 2014 18:33:27 +0000
- From: "Rowe, Thomas" <rowet@xxxxxxxxxx>
- Subject: Re: [HTCondor-users] dag job hung on bogus windows OpSys requirement
Yes, the dag job itself is hung. It never starts running locally and so the .sub and .log files are empty.
Unfortunately this is an air-gapped network and I can't simply send you files/output.
"condor_q -l" confirms that the target operating system requirement is Winnt51, just as "condor_q -analyze" complains. The Condor platform variable reports Winnt51 x86, but it does so on all the other machines without a problem; I'm assuming that reflects where the executable was compiled. Could a string-handling bug be setting the target OpSys requirement incorrectly? If you can name anything specific to check, I will reply.
Is there any reason to think upgrading Condor might address this? Did you update the compiler used?
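Since the full "condor_q -l" output cannot leave the air-gapped network, a small filter could narrow it down to the few lines worth retyping by hand. The following is only a sketch: it assumes the long-form output is plain "Attribute = expression" text saved to a file or string, and the function name opsys_lines and the sample ad are hypothetical, not taken from the real job.

```python
def opsys_lines(classad_text):
    """Return (attribute, expression) pairs from 'Attr = expr' lines
    that mention OpSys anywhere, compared case-insensitively."""
    hits = []
    for line in classad_text.splitlines():
        # Split on the first '=' only; the expression may contain '=='.
        name, sep, expr = line.partition("=")
        if not sep:
            continue  # not an 'Attr = expr' line
        if "opsys" in line.lower():
            hits.append((name.strip(), expr.strip()))
    return hits


# Hypothetical sample ad, made up for illustration only.
sample = '''\
ClusterId = 12
Requirements = (TARGET.OpSys == "WINNT51") && (TARGET.Arch == "INTEL")
Cmd = "condor_dagman.exe"
'''

for attr, expr in opsys_lines(sample):
    print(attr, "=", expr)
```

Run against the saved output for the stuck dagman job, this would list only the attributes that reference OpSys, which should be short enough to transcribe into a reply.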
________________________________________
From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of R. Kent Wenger [wenger@xxxxxxxxxxx]
Sent: Friday, June 20, 2014 11:36 AM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] dag job hung on bogus windows OpSys requirement
On Thu, 19 Jun 2014, Rowe, Thomas wrote:
> I have a DAG job that works fine on condor 7.6.3 on three Win2008
> machines. On the fourth identical machine, when I submit this dag job it
> hangs in state "I" forever. None of the sub jobs start. "condor_q
> -analyze" reports that no slots match this job due to Target.OpSys ==
> WinNt51. No such OpSys requirement has been specified anywhere. Condor
> on this particular machine is coming up with this strange idea on its
> own.  condor_q also explains that output for scheduler universe jobs is
> meaningless, so I have no idea what the problem really is.
Are you saying that the condor_dagman job itself gets stuck in the "I"
state?  Or is this happening to a node job within the DAG?
If it's the former, could you please send the following:
* The output of 'condor_q -l <id>' (where <id> is the Condor ID of the
dagman job).
* The .dagman.log file (<dag file>.dagman.log).
* The .condor.sub file (<dag file>.condor.sub).
Kent Wenger
CHTC Team