All,

>> Block a few machines;
>> Use them for some jobs, say MPI jobs;

I think the above is possible just by using the Web-Service (WS) APIs. First do a "newCluster()", and then run the MPI jobs one by one using "newJob()" (see the rough sketch at the end of this mail). We can choose to wait manually for each MPI job to finish instead of trying to enforce dependencies via Condor. As long as the ClassAd of the subsequent MPI jobs (machine count) remains the same, the machines will be reserved only once.

I don't think we can achieve a similar effect with a job submit file, because we cannot manually wait between processes belonging to the same cluster. OR: is it possible to submit multiple jobs belonging to the same cluster through multiple submit files, simply by hardcoding the "ClusterId" parameter to a constant value?

>> Release a subset (or) acquire more machines
>> Use them for some more jobs
>> Iterate like this before finally relinquishing all resources…

Maybe this is not possible. But I guess it is just a question of calling newCluster() again and changing the machine_count/requirements of the JobAd.

Please advise whether my thought process is correct, or am I missing something here?

Thanks,
Best Regards,
KN
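
P.S. Here is a rough, untested sketch of the WS call sequence I have in mind, in Python with the suds SOAP client. The operation names beginTransaction/newCluster/newJob/submit/commitTransaction come from the Condor SOAP interface (the schedd must run with ENABLE_SOAP = TRUE); the WSDL URL, the argument shapes, and the ClassAd contents are my assumptions, so please check against the condorSchedd.wsdl shipped with your release. Polling uses plain condor_q so I don't have to guess at the SOAP query signature.

#!/usr/bin/env python
# Sketch only: submit MPI jobs one at a time through the Condor
# Web-Service (SOAP) interface, waiting for each to leave the queue
# before submitting the next. Signatures are simplified; some Condor
# WSDLs wrap return values in status structs.
import subprocess
import time

from suds.client import Client  # generic third-party SOAP client

SCHEDD_WSDL = "http://schedd.example.com:8080/condorSchedd.wsdl"  # hypothetical URL

def wait_for_job(cluster, proc, poll_seconds=30):
    """Block until cluster.proc is no longer in the queue."""
    job = "%d.%d" % (cluster, proc)
    while True:
        # condor_q prints nothing once the job has left the queue
        out = subprocess.check_output(
            ["condor_q", job, "-format", "%d", "JobStatus"])
        if not out.strip():
            return
        time.sleep(poll_seconds)

def submit_one(schedd, classad):
    """One transaction per MPI job: newCluster + newJob + submit + commit."""
    txn = schedd.beginTransaction(60)           # 60 s transaction lease (assumed arg)
    cluster = schedd.newCluster(txn)
    proc = schedd.newJob(txn, cluster)
    schedd.submit(txn, cluster, proc, classad)  # classad carries machine_count etc.
    schedd.commitTransaction(txn)
    return cluster, proc

if __name__ == "__main__":
    schedd = Client(SCHEDD_WSDL).service
    mpi_job_ads = []  # fill with ClassAd structs (attribute name/type/value arrays)
    for ad in mpi_job_ads:
        c, p = submit_one(schedd, ad)
        wait_for_job(c, p)  # manual wait between jobs, no DAGMan needed

Note that this sketch uses one cluster per job. Whether a later transaction can add procs to an earlier cluster, so that the machines stay claimed across jobs, is exactly the part I am unsure about.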