Hi all,

At NIKHEF we are seeing more and more GPU usage, and often those jobs have quite a long preprocessing stage: reformatting the training inputs or other data files takes considerable time compared to the total runtime of the slot.
As a result we end up with slots that request a GPU but don't actually use it for a large part of the claimed period, which leads to suboptimal GPU utilisation.
Copying this prepared data back to network storage, and then copying it back again to the scratch disk of the slot with a GPU, is rather wasteful of network bandwidth.
So I was wondering whether it would be possible, in a DAG or some other way, to run a CPU-intensive preprocessing job on a node with a GPU and later in the process attach the GPU to that slot, or to have some kind of internal copy between the two jobs.
Any other suggestions that work within the current limitations of Condor are also more than welcome, for example using a node-local scratch area together with constraints so the jobs run one after the other, although then you miss the cleanup that Condor does of the scratch directory.
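To illustrate the ordering part of what I mean, a minimal DAGMan sketch (file names are hypothetical, and the machine name in the requirements expression is just an example; this assumes the target GPU node is known in advance, and it still leaves the local-scratch handoff between the two jobs unsolved):

```
# workflow.dag — preprocessing must finish before the GPU job starts
JOB preprocess preprocess.sub
JOB train train.sub
PARENT preprocess CHILD train
```

```
# preprocess.sub — CPU-only stage, pinned to a specific GPU node
# so the prepared data stays on that node's local disk
executable   = preprocess.sh
request_cpus = 8
request_gpus = 0
requirements = (Machine == "gpu-node01.example.nikhef.nl")
queue
```

```
# train.sub — GPU stage, pinned to the same node
executable   = train.sh
request_cpus = 4
request_gpus = 1
requirements = (Machine == "gpu-node01.example.nikhef.nl")
queue
```

Pinning to a fixed machine like this obviously defeats the scheduler's matchmaking, so it is only a sketch of the dependency structure, not a real solution.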
Emily Kooistra
NIKHEF