Re: [Condor-users] MPI jobs can not be run on Condor 6.8
- Date: Wed, 11 Mar 2009 16:50:25 +0800 (CST)
- From: tracy_luofengji <tracy_luofengji@xxxxxxx>
- Subject: Re: [Condor-users] MPI jobs can not be run on Condor 6.8
Dear Zhao Kun,
Thanks for your help. I had forgotten to uncomment some lines in condor_config.local, and now the MPI jobs can be executed. The problem now is that all the MPI jobs stay in the "running" state indefinitely and never finish or return. I have spent several hours on this, but I cannot find the reason.
Any help will be appreciated.
Regards,
Tracy
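
For reference, the submit file after applying both suggestions quoted below (adding the log, output, and error lines, and removing the three file-transfer lines) would look roughly like this. It is a sketch assembled from the snippets in this thread, not a file posted by either author, and it assumes that mp1script and helloworld already exist under /usr/local on the worker node.

```
# Sketch only: combines the original submit file with the changes
# suggested in the quoted replies. Paths and file names are taken
# from the thread as posted.
universe = parallel
executable = /usr/local/mp1script
arguments = /usr/local/helloworld
machine_count = 1
log = test.log
output = test.out
error = test.err
queue
```
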
On 2009-03-11 12:33:43, zhaokun <zhaokun@xxxxxxxxxxxxx> wrote:
>Dear tracy_luofengji,
>
> In the submit file, remove the following 3 lines, and make sure that mp1script and helloworld can be found on your worker machine.
>
>>>>should_transfer_files = yes
>>>>when_to_transfer_output = on_exit
>>>>transfer_input_files = /usr/local/helloworld
>
> Thanks.
> Zhaokun
> Beijing Hotsim Technology Co.,Ltd
> zhaokun@xxxxxxxxxxxxx
> 2009-03-11
>=======From 2009-03-11 11:45:17 =======
>
>>Dear Zhao Kun,
>>
>>Hello, thanks for your help. I tested it again following your suggestion. The log file contains only one line, something like "job has been submitted...", and the output and error files are empty.
>>
>>I had already used condor_status and condor_q -analyze before. The status of my worker node stayed "Unclaimed", and the result of "condor_q -analyze" just told me: "1 job is rejected by unknown reasons".
>>
>>Do you have any suggestions? Any help will be appreciated.
>>
>>Thanks!
>>Tracy
>>
>>On 2009-03-11 11:37:22, zhaokun <zhaokun@xxxxxxxxxxxxx> wrote:
>>>Dear tracy_luofengji,
>>>
>>> Please add the following lines to your submit file:
>>>
>>> Log = test.log
>>> Output = test.out
>>> Error = test.err
>>>
>>> After the job is submitted, the log file and output file should show what happened.
>>>
>>> "condor_status" and "condor_q -ana" will also help you get more information.
>>>
>>> Thanks.
>>> Zhaokun
>>> Beijing Hotsim Technology Co.,Ltd
>>> zhaokun@xxxxxxxxxxxxx
>>> 2009-03-11
>>>=======From 2009-03-11 11:29:21 =======
>>>
>>>>Dear all,
>>>>I am using Condor 6.8 and mpich-1.2.7. I tested them on 2 nodes: one acts as master and the other as worker. On the worker node, I copied the contents of $CONDOR_HOME/etc/examples/condor_config.local.dedicated.resource into /home/condor/condor_config.local, and changed the dedicated scheduler to:
>>>>
>>>>DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxx"
>>>>
>>>>Then, on the master node, I created the following submit file:
>>>>
>>>>universe = parallel
>>>>executable = /usr/local/mp1script
>>>>arguments = /usr/local/helloworld
>>>>machine_count = 1
>>>>should_transfer_files = yes
>>>>when_to_transfer_output = on_exit
>>>>transfer_input_files = /usr/local/helloworld
>>>>queue
>>>>
>>>>When I submitted the job to Condor, it stayed idle indefinitely, and I would like to know why.
>>>>
>>>>Thanks!
>>>>Regards,
>>>>Tracy
>>>>_______________________________________________
>>>>Condor-users mailing list
>>>>To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>>>>subject: Unsubscribe
>>>>You can also unsubscribe by visiting
>>>>https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>>>
>>>>The archives can be found at:
>>>>https://lists.cs.wisc.edu/archive/condor-users/