Thanks Mats, Dan, and Greg,
We were trying to count how many files each Condor job transferred or opened and could not justify the massive file descriptor requirement. Now that we understand that each shadow process can open 40-50 file descriptors, it is clear that 65K file descriptors are not enough for 2K concurrently running jobs. The RHEL defaults are an order of magnitude lower than 65K, so it sounds like we will need to raise this by another order of magnitude.
We regularly run 10K+ concurrent jobs from the same submit hosts with Grid Engine, but its master/slave submission model is totally different. We will do some quick research on the pitfalls of raising the file descriptor count even higher and then proceed accordingly (all of our cluster frontends [20+] share the same image, so we need to be careful with any base OS config changes).
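For reference, here is a rough back-of-the-envelope check along the lines described above, plus the places we would expect to raise the limits on a RHEL-style host. The 50-descriptors-per-shadow figure is taken from this thread, and the specific limit values shown are illustrative assumptions, not tested recommendations:

```shell
#!/bin/sh
# Estimate total descriptors needed: ~2,000 concurrent jobs,
# each with a shadow process holding ~50 open descriptors.
JOBS=2000
FDS_PER_SHADOW=50
NEEDED=$((JOBS * FDS_PER_SHADOW))
echo "estimated descriptors needed: $NEEDED"   # well above the 65K limit

# Where we'd raise the limits (values are placeholders to adjust):
#
#   /etc/security/limits.conf (per-user/per-process nofile limit):
#     condor  soft  nofile  262144
#     condor  hard  nofile  262144
#
#   /etc/sysctl.conf (system-wide ceiling):
#     fs.file-max = 524288
#
# Check current values before changing anything:
ulimit -n
```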