Poring over logs reveals that jobs are trying to start on slots 10
through 32, but get killed immediately due to a 10054 error. It looks
like the user account used to run these jobs ( condor-reuse-slot1_XX )
cannot be created, thus resulting in permission errors. Windows usernames
appear to have a limit of 20 characters, which looks like it's causing the
21-character condor username to fail ( condor-reuse-slot1_X is OK but
condor-reuse-slot1_XX is not ).
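
To make the length arithmetic concrete, here is a minimal sketch in Python. It assumes the account name follows the pattern seen in the logs (condor-reuse-slot<parent>_<dynamic>) and that the Windows limit is 20 characters. The helper names slot_username and fits_windows_limit are made up for illustration and are not part of HTCondor.

```python
# Assumption: Windows account names are capped at 20 characters.
WINDOWS_MAX_USERNAME_LEN = 20


def slot_username(parent_slot: int, dynamic_slot: int) -> str:
    """Build an account name in the pattern seen in the logs (hypothetical helper)."""
    return f"condor-reuse-slot{parent_slot}_{dynamic_slot}"


def fits_windows_limit(name: str) -> bool:
    """Return True if the name is within the assumed 20-character limit."""
    return len(name) <= WINDOWS_MAX_USERNAME_LEN


if __name__ == "__main__":
    # Compare a single-digit dynamic slot with two double-digit ones.
    for dynamic_slot in (9, 10, 32):
        name = slot_username(1, dynamic_slot)
        print(name, len(name), fits_windows_limit(name))
```

Under these assumptions, any dynamic slot number with two digits pushes the name to 21 characters, which matches the failures starting at slot 10.
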
The current workaround we've employed is creating 4 partitionable
slots, each with a 25% share of resources. Of course, this means that
the maximum amount of RAM etc. that any single job can use is more
limited than it would be with a single partitionable slot. Note that
this is not a problem on our Linux machines.
Is this a known bug? Is there a better solution/workaround?