[Condor-users] problems using transfer_output_remaps
- Date: Thu, 19 Jan 2006 15:20:11 -0800
- From: Adam Lathers <alathers@xxxxxxxxxxxxxx>
- Subject: [Condor-users] problems using transfer_output_remaps
Hi all,
	I'm having some issues using the transfer_output_remaps option in a
submit file.  Specifically, I'm submitting a DAG as a proof of
concept to work out the bugs before implementing a similar solution
for our big data processing codes.  The layout of our architecture
looks something like this.  Our pool manager host (schedd,
collector, negotiator) exists "outside" our trusted realm, so it has
no access to our shared filesystem.  All the worker nodes exist
inside the trusted realm, and all share a filesystem.  (Yes, I know
there are some security paradigm issues there, but I can't solve
those presently.)  What I do need to deal with is that the data we
will be working with is "big": total input and output data is
currently on the order of 100GB, and it's not segmented into "small"
pieces, so each worker node, if the input data had to be shipped to
it, would have to grab a 20-50GB dataset before processing started.
	My goal in the short term is basically this.  I'd like to rely on
the shared file system, and just "mimic" what I need to on the submit
node.  Thus far, this works, but to make it happen, I need to
duplicate a directory structure on the submit node so that it looks
just like the worker nodes.  What I'd "prefer" to do is leverage the
transfer_output_remaps option, so that when logs, output and such
get shipped back to the submit machine, they just go into a single
large log directory, with some sort of intelligent naming mechanism.
An example submit file that I've tried looks something like this.
(Note: for the transfer_output_remaps, I've also tried using just the
file name, A.err, and so on.  Maybe I just missed the proper
permutation?)
Universe        = vanilla
Executable      = /home/alathers/condor_matlab/condor_test/matlab.sh
InitialDir      = /home/alathers/condor_matlab/condor_test
Error           = /home/alathers/condor_matlab/condor_test_submitdir/A.err
Log             = /home/alathers/condor_matlab/condor_test_submitdir/A.log
transfer_output_remaps = "/home/alathers/condor_matlab/condor_test_submitdir/A.err = /home/alathers/condor_matlab/logs/A.err"
GetEnv          = true
Arguments	= A
Requirements 	= FileSystemDomain == "ncmir.ucsd.edu"
Notification    = Error
Notify_user     = alathers@xxxxxxxxxxxxxx
Queue
	In the end, when the job finishes, the .log and .err files are sent
back to the submit node and put in
/home/alathers/condor_matlab/condor_test_submitdir/
so the remap to the logs directory doesn't seem to take effect.
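For reference, the "single log directory with intelligent naming" idea could be sketched as a small generator for the remaps line. This is only a sketch under assumptions: it relies on the documented "name = newname ; name2 = newname2" form of transfer_output_remaps, and the helper name build_remaps, the example file names, and the "<node>.<file>" naming scheme are all hypothetical, not anything Condor provides.

```python
import posixpath


def build_remaps(node, filenames, log_dir):
    """Build a quoted transfer_output_remaps value (hypothetical helper).

    Each output file name is mapped to log_dir/<node>.<name>, and the
    pairs are joined with " ; " as in "name = newname ; name2 = newname2".
    Whether Condor applies remaps to the Error and Log files themselves
    is NOT assumed here; the names passed in are ordinary output files.
    """
    pairs = []
    for name in filenames:
        target = posixpath.join(log_dir, f"{node}.{name}")
        pairs.append(f"{name} = {target}")
    return '"' + " ; ".join(pairs) + '"'


# Example file names below are placeholders, not files from the real job.
line = "transfer_output_remaps = " + build_remaps(
    "A", ["result.dat", "run.txt"], "/home/alathers/condor_matlab/logs"
)
print(line)
```

The printed line could then be pasted into the per-node submit file, so each DAG node writes into the same directory under a distinct prefix.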
	I'm sure I'm forgetting some vital piece of info, so please feel
free to let me know.  Any thoughts or insight would be REALLY
appreciated.  As noted, I know there are a LOT of problems with the
present approach, but for various reasons my role is to solve this
step first, before redesigning the process.  Thanks, everyone.
_______________________________________________________
Adam Lathers
NCMIR: National Center for Microscopy and Imaging Research
Distributed Systems Engineer
phone: (858) 822-0735
fax:   (858) 822-0828
web:   http://ncmir.ucsd.edu