Hi,
we have been running Jupyter notebooks in condor slots in production for a while now, so we need to get a bit of monitoring around this.
One of the bigger obstacles to coming up with something decent is that JupyterHub uses condor_rm to end a notebook once it is no longer needed. This results in a condor_history entry with JobStatus == 3 (Removed), which is considered a failed job, although in this case it is not. The other case is that the notebook job runs into the time limit and gets removed by the periodic_remove expression, which is presumably a bit more flexible to tweak.
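As a possible stopgap on the monitoring side, the two removal cases could perhaps be told apart after the fact. The sketch below assumes that removed jobs carry a RemoveReason attribute in their history record and that its text differs between a condor_rm call and a periodic_remove policy removal. The marker strings, the category names, and the function classify_notebook_job are hypothetical and would need to be checked against real condor_history output on the pool.

```python
# JobStatus codes used in HTCondor job ClassAds.
REMOVED = 3
COMPLETED = 4


def classify_notebook_job(ad, rm_marker="via condor_rm", policy_marker="PeriodicRemove"):
    """Classify one history record (a dict-like job ad) for monitoring.

    ASSUMPTION: RemoveReason contains rm_marker when the job was removed
    with condor_rm, and policy_marker when the periodic_remove expression
    fired. Both markers are guesses; verify them on your own pool.
    """
    status = ad.get("JobStatus")
    if status == COMPLETED:
        return "completed"
    if status != REMOVED:
        return "other"

    reason = str(ad.get("RemoveReason", ""))
    if rm_marker in reason:
        # Regular shutdown: the hub ended a notebook that was no longer needed.
        return "stopped_by_hub"
    if policy_marker in reason:
        # The notebook ran into the time limit.
        return "time_limit"
    # Removed, but for a reason this sketch does not recognise.
    return "removed_unknown"


if __name__ == "__main__":
    # Hand-written sample records, not real history output.
    samples = [
        {"JobStatus": 3, "RemoveReason": "via condor_rm (by user jupyterhub)"},
        {"JobStatus": 3, "RemoveReason": "PeriodicRemove evaluated to TRUE"},
        {"JobStatus": 4},
    ]
    for ad in samples:
        print(classify_notebook_job(ad))
```

This only relabels jobs inside the monitoring; the history entry itself still shows JobStatus == 3, so it does not replace the option asked for below.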
I would like to have an option for condor_rm that influences the job state subsequently recorded in the history.