Hi John,

We at UAB are trying to set up a generic Condor environment as a pilot project. My experience with Condor so far has been in trying to:

1. migrate an SGE job to a Condor workflow
2. set up a personal Condor environment

Following U of Wisconsin's recommendation, I started by setting up a Personal Condor instance. This was very easy with the Condor yum repos on a CentOS 6 VM. I just did a 'yum install condor' and 'condor start', and did not even have to modify the config files. All my workflow migration and tests were performed on this Personal Condor.

As a next step, I tried to set up a Personal Condor pool (with 2 Linux VMs: Debian and CentOS). During this process, we came to understand that 3 things were critical to having a running Condor pool:

1. ALLOW_WRITE and ALLOW_READ settings in condor_config.local
   - need to explicitly mention the host names/IP addresses of the machines in the pool
2. /etc/hosts should have the IP address of the machine; just having 127.0.0.1 won't work
3. Firewall: we had success after turning off the firewall in the private environment, i.e., on the 2 VMs.
   - We are currently faced with firewall problems, or more precisely NAT problems, in our cross-campus test pool. We are exploring flocking as a potential solution. *

* The above firewall problem is not the firewall on the Condor host (collector) but arises in trying to schedule jobs to compute nodes that are behind a firewall.

On the nitty-gritty side, things I found extremely helpful in trying to migrate the SGE job:

* to query ClassAds and to troubleshoot job errors: condor_status -l, condor_q -analyze
* specifying executables for more than one platform with the $$(OpSys) ClassAd attribute (thanks to Brooklin Gore from UW)
* macro expansion at submit time, to send a platform-specific file at submission time

Thanks,
Poornima.

On Dec 13, 2011, at 9:18 AM, Todd Tannenbaum wrote: