HTCondor Project List Archives




[Condor-devel] How to embed Condor in an application server?



Hi all,

I came up with the question "How to embed Condor in an application
server?" because I have a problem (described below) for which I think
embedding is a good solution.  I would like to know whether I am
thinking about this correctly.

I have a project which runs computational workloads on a grid.
However, my workload runs inside a container in the application
server, and it relies on the application server's infrastructure for
access to data, caching/sharing of the working data in RAM, and
access to some enterprise APIs, among other things.  Our application
server implements partitioning of the workload and a master-slave
model in which slaves grab work from a basic work queue.  So far, the
work queue has only implemented very naive priority sorting.  There is
no ability to handle user quotas, priorities, preemption, etc.  We
have made some extensions by making an out-call from the application
server to submit a job to a real job queue system such as PBS.  That
has served us well enough so far.  But we have always struggled with
the split personality: some workloads are consumed by the master-slave
model, as quickly as possible inside the application server; other
workloads are sent out to another cluster.  We have never been able to
plug a scheduler into our application server and impose some queuing
semantics, so that the slaves in the master-slave model get real
scheduling when they ask for the next job to work on.
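
As a rough illustration of the queuing semantics meant here, the
following is a minimal C++17 sketch of a work queue that combines
priority ordering with a per-user limit on running jobs.  It is not
Condor code and uses no Condor API; the names Job, QuotaQueue, next()
and finished() are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical job record; the field names are illustrative only.
struct Job {
    std::string owner;    // user who submitted the job
    int priority;         // larger value = more urgent
    std::string payload;  // opaque description of the work
};

// Orders jobs so that the highest priority is served first.
struct ByPriority {
    bool operator()(const Job& a, const Job& b) const {
        return a.priority < b.priority;
    }
};

// A work queue that hands out the highest-priority job whose owner is
// still under a per-user limit on concurrently running jobs.
class QuotaQueue {
public:
    explicit QuotaQueue(std::size_t max_running_per_user)
        : limit_(max_running_per_user) {}

    void submit(Job job) { pending_.push(std::move(job)); }

    // Called by a worker asking for its next job.  Returns std::nullopt
    // when nothing is runnable under the current quotas.
    std::optional<Job> next() {
        std::vector<Job> skipped;
        std::optional<Job> chosen;
        while (!pending_.empty()) {
            Job job = pending_.top();
            pending_.pop();
            if (running_[job.owner] < limit_) {
                ++running_[job.owner];
                chosen = std::move(job);
                break;
            }
            skipped.push_back(std::move(job));  // owner is at quota
        }
        // Put the jobs we passed over back into the queue.
        for (Job& j : skipped) pending_.push(std::move(j));
        return chosen;
    }

    // Called when a worker finishes a job, freeing one quota slot.
    void finished(const std::string& owner) {
        auto it = running_.find(owner);
        if (it != running_.end() && it->second > 0) --it->second;
    }

private:
    std::size_t limit_;
    std::priority_queue<Job, std::vector<Job>, ByPriority> pending_;
    std::map<std::string, std::size_t> running_;
};
```

A slave would call next() when it is idle and finished() when it is
done.  Preemption and fair-share accounting are deliberately left out
of the sketch; those are the parts a real scheduler would provide.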

I have considered for some time whether I might embed Condor into our
application server, which can host C++.  My initial goal is to find
out whether I can port the "server side", e.g. the central manager and
negotiator, to our C++ app server.  If that is sensible and feasible,
it would already do wonders for our setup.  I could have the nodes
currently under PBS run Condor instead.  The "server side" of Condor
would live inside my app server, enjoy high availability, and tightly
integrate with and control the Condor pools.  Eventually I may even
host some kind of "virtual Condor nodes" inside our application
server.  These would not run "processes" as in the vanilla universe,
but might be more like the Java universe instead.

I read about DaemonCore and found that it is the core library that
reads options and handles signals.  The signal paradigm will
obviously need to change if I move things into the application server
space.
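
One possible shape for that change, sketched under assumptions: inside
a hosting process, named events posted to an in-process dispatcher
could stand in for Unix signals, with the host's own loop deciding
when handlers run.  The C++ below does not use or mirror any
DaemonCore API; EventDispatcher, on(), post() and drain() are
hypothetical names, and the event names are placeholders.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for signal delivery inside a hosting process.
class EventDispatcher {
public:
    using Handler = std::function<void()>;

    // Register a handler for a named event (e.g. "reconfig").
    void on(const std::string& event, Handler h) {
        std::lock_guard<std::mutex> lock(mu_);
        handlers_[event].push_back(std::move(h));
    }

    // May be called from any thread; it only records the event.
    void post(const std::string& event) {
        std::lock_guard<std::mutex> lock(mu_);
        pending_.push(event);
    }

    // Run by the host's own loop.  Handlers execute on the calling
    // thread, outside the lock.  Returns how many handlers were invoked.
    std::size_t drain() {
        std::size_t invoked = 0;
        for (;;) {
            std::vector<Handler> to_run;
            {
                std::lock_guard<std::mutex> lock(mu_);
                if (pending_.empty()) break;
                auto it = handlers_.find(pending_.front());
                if (it != handlers_.end()) to_run = it->second;
                pending_.pop();
            }
            for (auto& h : to_run) {
                h();
                ++invoked;
            }
        }
        return invoked;
    }

private:
    std::mutex mu_;
    std::map<std::string, std::vector<Handler>> handlers_;
    std::queue<std::string> pending_;
};
```

Events with no registered handler are dropped silently in this sketch;
whether that is acceptable depends on what the embedded components
expect from their host.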

Can someone tell me whether I am heading in the right direction?
Have other people looked at a similar problem?

-Ken Young