Hi there,

This is my first post.
I've been experimenting with a POC of an HTCondor (v8.8.1) cluster on GCP (CentOS 7) to run a C++ application that delivers a Monte Carlo simulation framework for contemporary financial risk analytics and value adjustments.
I have set up the cluster and can run simple tests on it. It utilises a machine image as the base machine for the cluster, with the relevant compiled code baked in (as compiling takes 1-2 hours).
Currently the legacy application conducting this analysis can take up to 8 hours; this POC aims to dramatically reduce the runtime and also produce additional analyses.
The current data/job flow I have been using doesn't work:
* Submit job (and transfer credentials)
* Download analysis specification, product portfolio and market input files from Google Cloud Storage per counterparty (via gsutil)
* Run the C++ app with the initial input files
* Write outputs (e.g. Monte Carlo simulation outputs) to a local dir
* Upload results back to Google Cloud Storage per counterparty
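Concretely, the per-job wrapper I have in mind for that flow looks roughly like this. The bucket, file and binary names are placeholders, not taken from my actual setup, and the gsutil/app commands are overridable so the script can be dry-run:

```shell
#!/bin/bash
# Sketch of a per-counterparty job wrapper -- bucket name, file names and
# binary name below are placeholders.
set -euo pipefail

GSUTIL="${GSUTIL:-gsutil}"   # overridable so the script can be dry-run locally
APP="${APP:-./risk_app}"     # placeholder name for the compiled C++ app

stage_inputs() {
  # Pull the per-counterparty spec, portfolio and market files into ./inputs
  local cpty="$1" bucket="$2"
  mkdir -p inputs outputs
  "$GSUTIL" -m cp "${bucket}/inputs/${cpty}/*" inputs/
}

run_app() {
  # Run the simulation; outputs land in ./outputs in the job's scratch dir
  "$APP" --spec inputs/spec.json --out outputs/
}

upload_results() {
  # Push everything in ./outputs back to the per-counterparty results prefix
  local cpty="$1" bucket="$2"
  "$GSUTIL" -m cp -r outputs "${bucket}/results/${cpty}/"
}

main() {
  local cpty="$1" bucket="${2:-gs://example-risk-bucket}"
  stage_inputs "$cpty" "$bucket"
  run_app
  upload_results "$cpty" "$bucket"
}

# The submit file would name this script as the executable and pass the
# counterparty ID as the job argument, e.g.:  main "$1"
```

The idea is that HTCondor only transfers the script and credentials; all bulk data movement stays between the compute node and the bucket.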
I have made sure I can run the app on the condor-compute node if I SSH directly to it.
Some questions ... perhaps you can help me understand how HTCondor works?
* Can I specify which user the jobs run as? Currently they run as the user "nobody", and permissions are at least one of the problems. Can I run as another user with the correct permissions? I haven't been able to find information on this.
* Does HTCondor allow what I am trying to do: create sub-directories to pull inputs from, write to a directory, then upload to GCP? Most of the examples I've read require passing all of the files at job submission time.
* Finally, I tried to debug in interactive mode as per a tutorial but received the notice "this account is not available". I couldn't find information on this either.
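On the first question, my current (possibly wrong) understanding is that jobs fall back to "nobody" when the execute node does not trust the submit node's UID domain. Something like this in condor_config on all nodes might make jobs run as the submitting user; the domain name is a placeholder, and the user would also need to exist on the compute nodes:

```
# condor_config on all nodes -- "condor.internal" is a placeholder domain
UID_DOMAIN = condor.internal
TRUST_UID_DOMAIN = True

# Alternative: run every job in a slot as one dedicated local account
# SLOT1_USER = condorjob
```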
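On the second question, this is roughly the submit file I have in mind: only the credentials go through HTCondor's file transfer, while the job itself creates its sub-directories and moves the bulk data with gsutil. All names here are placeholders:

```
# Hypothetical submit file -- script, credential and list file names are
# placeholders; one job is queued per counterparty ID in counterparties.txt
universe                = vanilla
executable              = run_counterparty.sh
arguments               = $(cpty)
should_transfer_files   = YES
transfer_input_files    = gcs-credentials.json
when_to_transfer_output = ON_EXIT
output                  = logs/$(cpty).out
error                   = logs/$(cpty).err
log                     = jobs.log
queue cpty from counterparties.txt
```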
Overall, this is likely both a permission issue with the "nobody" user and perhaps an environment variable issue.
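On the interactive-mode notice, I suspect (an assumption on my part) it is the standard message printed by /sbin/nologin, which CentOS 7 assigns as the login shell of the "nobody" account; that would tie it back to the same user problem. This can be checked on a compute node:

```shell
# Show the login shell assigned to "nobody"; on CentOS 7 this is usually
# /sbin/nologin, which prints "This account is currently not available."
getent passwd nobody | cut -d: -f7
```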
Your help is most appreciated.
Regards,
Forde