Re: [Condor-users] Xen
- Date: Tue, 27 Jan 2009 08:16:46 -0800
- From: Craig Holland <crhollan@xxxxxxxxx>
- Subject: Re: [Condor-users] Xen
When I was doing a 'condor_status -vm' I was only seeing one of my vm
servers. I found some config issues and this is now working. Thanks for
your attention while I work through this stuff.
Rgs,
craig
On 1/27/09 6:02 AM, "Matthew Farrellee" <matt@xxxxxxxxxx> wrote:
> What do you mean by "condor servers seeing each other as vm universe hosts"?
>
> Best,
>
>
> matt
>
> Craig Holland wrote:
>> Jaeyoung,
>>
>> Thanks, the problem was indeed permissions on my disk image. I have been
>> able to submit my xen job and have it deployed into the vm universe. Still
>> having problems with the condor servers seeing each other as vm universe
>> hosts...will look into that today.
>>
>>
>> Thanks,
>> craig
>>
>>
>> On 1/25/09 2:44 AM, "Jaeyoung Yoon" <jaeyoungyoon@xxxxxxxxx> wrote:
>>
>>> Hello Craig,
>>>
>>> I think you need to check whether /var/lib/xen/images/test2-disk0
>>> exists on an execute machine. Otherwise, you need to specify
>>> "xen_transfer_files = /var/lib/xen/images/test2-disk0" in your submit
>>> file to transfer the disk file.
>>>
>>> Please refer to section "2.11.1.2 Xen-Specific Submit Commands" in
>>> Condor manual.
>>> "If any files need to be transferred from the submit machine to the
>>> machine where the vm universe job will execute, Condor must be
>>> explicitly told to do so with the xen_transfer_files command: "
>>>
>>> You can also see what the problem is from the Condor vm-gahp log file in
>>> the Condor log directory.
>>>
>>> Regards,
>>> -jaeyoung
>>>
>>>
>>> On Fri, Jan 23, 2009 at 2:43 PM, Craig Holland <crhollan@xxxxxxxxx> wrote:
>>>> Nevermind on that....I see this in my logs:
>>>>
>>>> 012 (014.000.000) 08/13 18:34:20 Job was held.
>>>> Error from starter on slot1@xxxxxxxxxxxxxxxxxxxxx:
>>>> VMGAHP_ERR_JOBCLASSAD_XEN_INVALID_DISK_PARAM
>>>> Code 6 Subcode 0
>>>>
>>>> ....any ideas?
>>>>
>>>> Thanks,
>>>> craig
>>>>
>>>>
>>>> On 1/23/09 2:34 PM, "Craig Holland" <crhollan@xxxxxxxxx> wrote:
>>>>
>>>>> Thanks Matt.
>>>>>
>>>>> So, I've gotten a bit further down the road. I'm able to submit the job
>>>>> with the file below but it seems to get held. I'm thinking there needs to
>>>>> be something that points to the domu config file in /etc/xen....but I
>>>>> don't see any reference to that. Certainly executing condor_vm_xen.sh from
>>>>> the command line requires the domu control file to be passed in. I tried
>>>>> using the executable key but that didn't seem to help.
>>>>>
>>>>> universe = vm
>>>>> vm_type = xen
>>>>> vm_memory = 512
>>>>> vm_networking = true
>>>>> executable = test2
>>>>> xen_disk = /var/lib/xen/images/test2-disk0:xvda:w
>>>>> xen_kernel = included
>>>>> queue
>>>>>
>>>>> Thanks,
>>>>> craig
>>>>>
>>>>> On 1/23/09 1:34 PM, "Matthew Farrellee" <matt@xxxxxxxxxx> wrote:
>>>>>
>>>>>> When you've configured some machines in your pool to support the VM
>>>>>> Universe you should be able to see them by running: condor_status -vm
>>>>>>
>>>>>> When you submit a VM Universe job it will be matched with one of those
>>>>>> machines. condor_vm_xen.sh will then be run on the matched machine to
>>>>>> start the VM. condor_vm_xen.sh is just a utility Condor uses to start
>>>>>> the VM, it isn't intended to be used manually.
>>>>>>
>>>>>> * * *
>>>>>>
>>>>>> Ugh. condor_vm_xen.sh is in sbin. It shouldn't be. It belongs in libexec.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>>
>>>>>> matt
>>>>>>
>>>>>> Craig Holland wrote:
>>>>>>> Thanks.
>>>>>>>
>>>>>>> So I've been using condor_vm_xen.sh to create the domu. This just seems to
>>>>>>> run it on the local host. Is this the correct method? Also, for some
>>>>>>> reason, my condor hosts don't see each other in the vm universe, but do
>>>>>>> see each other when I do a condor_status.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> craig
>>>>>>>
>>>>>>>
>>>>>>> On 1/23/09 11:16 AM, "Matthew Farrellee" <matt@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> Craig,
>>>>>>>>
>>>>>>>> Your vision is pretty accurate.
>>>>>>>>
>>>>>>>> Essentially, a disk image becomes your job. You submit it, Condor finds
>>>>>>>> a place for it to run. It runs. When it is done, it shuts itself down.
>>>>>>>>
>>>>>>>> The life cycle for the VM Universe job is the life cycle for the VM. I
>>>>>>>> avoid talking about DomU, because this would apply to KVM VMs as well as
>>>>>>>> EC2 AMIs, if you're using the Grid Universe and EC2 resources.
>>>>>>>>
>>>>>>>> Some uses: 1) checkpoint & migration without Standard Universe; 2) job
>>>>>>>> portability - the disk contains everything needed for the job; 3)
>>>>>>>> ability to use Condor's policies and robustness to manage services; 4)
>>>>>>>> ability to use glide-in concept across VM clusters
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>>
>>>>>>>> matt
>>>>>>>>
>>>>>>>> Craig Holland wrote:
>>>>>>>>> I think I'm talking about the vm universe. I'm envisioning sending a xen
>>>>>>>>> domu into the grid as a job. I've been able to create the vm universe,
>>>>>>>>> but it seems like when a domu is created, it is tied to a specific dom0
>>>>>>>>> (which I guess makes sense). And, once it is created, it isn't really
>>>>>>>>> clear to me what the benefit of running it in the vm universe is. BTW:
>>>>>>>>> I'm new to condor ;)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> craig
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 1/22/09 6:52 PM, "Steven Timm" <timm@xxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>>> Your question "the domU actually lives on the grid" isn't very well
>>>>>>>>>> defined as to what you mean by "living on the grid". Are you talking
>>>>>>>>>> about virtual machine universe, or just using Xen VM's as compute
>>>>>>>>>> resources and running normal condor jobs? Both can be done. We are
>>>>>>>>>> doing the latter--using Xen VM's as regular machines in the condor
>>>>>>>>>> pool, including for collector/negotiator and the schedd's.
>>>>>>>>>>
>>>>>>>>>> Steve Timm
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, 22 Jan 2009, Craig Holland wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I recently started playing with Xen in Condor. It isn't clear from the
>>>>>>>>>>> documentation how this works - if the domu actually lives on the grid
>>>>>>>>>>> or if it can use the grid's resources. It would seem the latter. Can
>>>>>>>>>>> anyone point me to some useful reading on the subject or fill me in?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> craig
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Condor-users mailing list
>>>>>>>>>>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>>>>>>>>>>> subject: Unsubscribe
>>>>>>>>>>> You can also unsubscribe by visiting
>>>>>>>>>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>>>>>>>>>>
>>>>>>>>>>> The archives can be found at:
>>>>>>>>>>> https://lists.cs.wisc.edu/archive/condor-users/
>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>> Steven C. Timm, Ph.D (630) 840-8525
>>>>>>>>>> timm@xxxxxxxx http://home.fnal.gov/~timm/
>>>>>>>>>> Fermilab Computing Division, Scientific Computing Facilities,
>>>>>>>>>> Grid Facilities Department, FermiGrid Services Group, Assistant Group
>>>>>>>>>> Leader.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Craig Holland
>>>>>>>>> Mgr, Operations
>>>>>>>>> Cisco Media Solutions Group
>>>>>>>>> M: +1-650-787-7241
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>>
>>
>>
>>
>>
--
Craig Holland
Mgr, Operations
Cisco Media Solutions Group
M: +1-650-787-7241
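
Summary for the archive: the submit file from earlier in the thread, with
Jaeyoung's xen_transfer_files suggestion applied, might look like the sketch
below. It is untested and assumes the disk image is not already present on
the execute machine; if it is, the xen_transfer_files line can be dropped.
In this thread the hold turned out to be caused by permissions on the disk
image, so check the permissions on the image first.

```
# Sketch only: Craig's submit file plus the transfer line Jaeyoung suggested.
universe = vm
vm_type = xen
vm_memory = 512
vm_networking = true
executable = test2
xen_disk = /var/lib/xen/images/test2-disk0:xvda:w
xen_kernel = included
# Assumption: the image is not already on the execute machine, so ask
# Condor to transfer it from the submit machine.
xen_transfer_files = /var/lib/xen/images/test2-disk0
queue
```

Once the execute machines are configured for the VM universe they should
show up in 'condor_status -vm', and if a job is held, the vm-gahp log file
in the Condor log directory is the place to look for the reason.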