Hi All,
I am currently looking at migrating from our home-grown distributed
computing software to HTCondor. Over the years, users have created
complex "job managers" written in C++, which are equivalent to
application-specific DAGMan scripts. To reduce the burden on users
migrating to HTCondor, we would like to provide an adapter between a
"job manager" and HTCondor.
A simple example is a "job manager" that does the following, all within
the same cluster:
1. Requests 1000 simulation jobs to be executed
2. When all 1000 simulation jobs are completed, creates a database and
loads the results into it
3. Analyses the results in the database and, based on that analysis,
requests further simulation jobs to be executed. All of this happens
without any user involvement.
From what I have read, our options are:
1. Web Service: Write an adapter using the SOAP interface. I suspect
there is not enough feedback regarding when a job completes / fails.
2. DAGMan: Write an adapter that generates DAGMan scripts.
3. DRMAA: Write an adapter that submits and monitors jobs via the DRMAA API.
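As a rough illustration of option 2, an adapter could emit a DAG file for
the simple example above along the following lines. This is only a sketch
under assumptions: the function name buildDag, the node names, the "index"
macro and the submit file names are all hypothetical, and it covers only
the fixed part of the workflow (simulations, then database load, then
analysis). The follow-up simulation jobs from step 3 are decided at run
time and are not handled here.

```cpp
#include <sstream>
#include <string>

// Builds the text of a DAGMan input file for:
//   simCount simulation nodes -> one "load" node -> one "analyze" node.
// All names (buildDag, sim<i>, load, analyze, index) are hypothetical and
// the submit file names are placeholders supplied by the caller.
std::string buildDag(int simCount,
                     const std::string& simSubmit,
                     const std::string& loadSubmit,
                     const std::string& analyzeSubmit)
{
    std::ostringstream dag;
    std::ostringstream parents;

    for (int i = 0; i < simCount; ++i) {
        // One DAG node per simulation, all sharing the same submit file.
        dag << "JOB sim" << i << " " << simSubmit << "\n";
        // VARS passes a per-node macro that the submit file can use as $(index).
        dag << "VARS sim" << i << " index=\"" << i << "\"\n";
        parents << " sim" << i;
    }

    dag << "JOB load " << loadSubmit << "\n";
    dag << "JOB analyze " << analyzeSubmit << "\n";

    // The load node runs only after every simulation node has succeeded.
    if (simCount > 0) {
        dag << "PARENT" << parents.str() << " CHILD load\n";
    }
    // The analysis node runs only after the load node has succeeded.
    dag << "PARENT load CHILD analyze\n";

    return dag.str();
}
```

The resulting text would be written to a file and handed to
condor_submit_dag. Whether a static DAG like this is enough depends on how
the further jobs in step 3 are requested, which is part of the question.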
Can someone confirm whether I am on the right track?
Does anyone have any suggestions / words of wisdom for this kind of
requirement?
Further info:
- Windows based pool
- Job manager is a C++ DLL
- Looking at using the current stable release of HTCondor
- Jobs will run in the Vanilla Universe
- Jobs will need to be run under the submitter's Active Directory credentials
Thanks,
Nick