I have been trying to set up condor with matlab for quite a
while. I have simplified my matlab code that I am trying to run and am still
having difficulty get the results. It seems as if Condor is returning results
before matlab has a chance to finish completing the tasks. I am using the
matlab distributed computing engine but am just asking for help from anyone
that is familiar with matlab. I am going to paste my code below and was
wondering if someone could take a look at it and see if there seems to be any
glaring mistakes that my team and I can not figure out. First, I have a m file named trial.m. this
is the code that I want to run and get results from. function trial() jm = findResource('scheduler','configuration', 'generic'); set(jm,'configuration','generic'); job1 = createJob(jm); createTask(job1, @sum, 1, {[1 1]}); createTask(job1, @sum, 1, {[2 2]}); createTask(job1, @sum, 1, {[3 3]}); createTask(job1, @sum, 1, {[4 4]}); submit(job1); waitForState(job1, 'finished', 60) results = getAllOutputArguments(job1) Next, is my submit function through matlab function submitfcn(scheduler, job, props,
extraCondorSubmitArgs) %#ok Not using job %SUBMITFCN Submit a Matlab job to a Condor scheduler % % See also workerDecodeFunc. % % Assign the relevant values to environment variables,
starting % with identifying the decode function to be run by the
worker: decodeFcn = 'workerDecodeFunc'; if nargin < 4 extraCondorSubmitArgs = ''; end % Ask the workers to print debug messages by default by
setting MDCE_DEBUG to % true. jobEnvVars = {'MDCE_DECODE_FUNCTION',decodeFcn, ...
...%' MDCE_STORAGE_LOCATION',props.StorageLocation, ...
' MDCE_STORAGE_CONSTRUCTOR',props.StorageConstructor, ...
' MDCE_JOB_LOCATION',props.JobLocation, ...
' MDCE_DEBUG','true'}; taskEnvVars = cell(1, numel(props.TaskLocations)); for i = 1:numel(props.TaskLocations) taskEnvVars{i} = {'MDCE_TASK_LOCATION',
props.TaskLocations{i}}; end if isempty(scheduler.ClusterMatlabRoot)
warning('distcomp:condor:NoClusterMatlabRoot', ...
['The scheduler''s ClusterMatlabRoot property is empty.\n', ...
'Using matlabroot instead.']); clusterMatlabRoot = matlabroot; else clusterMatlabRoot =
scheduler.ClusterMatlabRoot; end matlabScript = fullfile(clusterMatlabRoot, 'bin', 'matlab'); % ... Do we need the following ??? ... if ispc matlabScript = [matlabScript, '.bat']; end matlabArgs = strrep(scheduler.matlabCommandToRun, 'matlab ',
''); % Determine where to save the standard output, standard
error and the % Condor log. logFiles = cell(1, props.NumberOfTasks); outFiles = cell(1, props.NumberOfTasks); errFiles = cell(1, props.NumberOfTasks); for i = 1:props.NumberOfTasks taskLoc =
fullfile(scheduler.DataLocation, props.TaskLocations{i}); logFiles{i} = [taskLoc, '.log']; outFiles{i} = [taskLoc, '.out']; errFiles{i} = [taskLoc, '.err']; end % Create one condor submit file for all the tasks. script = createCondorSubmitScript(matlabScript, matlabArgs,
...
jobEnvVars, taskEnvVars, ...
errFiles, outFiles, logFiles); % Submit a Condor job that executes all the tasks: ...%[pathstr, name, ext, versn] = fileparts(script); ...% script2 = name; % Execute the submit command on the
remote host. %copyfile(script, '.') condorSubmitCommand = ['condor_submit ', script, ' ',
extraCondorSubmitArgs]; [s, w] = system(condorSubmitCommand); % Leave behind the necessary debugging information if the submission
failed. if s ~= 0 warning('distcomp:condor:SubmitFailed',
...
['Call to condor_submit failed with the following message:\n\n', ...
' %s\n\n', ...
'The submit command used was:\n\n %s\n\n', ... 'Not
deleting the submission file %s.'], ...
w, condorSubmitCommand, script2); else % Display the Condor job number: disp(w); % Clean up: delete(script); % delete(script2); end function filename = createCondorSubmitScript(matlabScript,
matlabArgs, jobEnvVars, taskEnvVars, errFiles, outFiles, logFiles) %Create a Condor submit script that forwards the correct
environment variables %and executes Matlab. % We assume that the decode function has been put on the
path of the MATLAB % workers, e.g. by putting it into
$MATLABROOT/toolbox/local. % Double all backslashes so fprintf prints out a single
backslash. matlabScript = strrep(matlabScript, '\', '\\'); matlabArgs = strrep(matlabArgs, '\', '\\'); jobEnvVars = strrep(jobEnvVars, '\', '\\'); for i = 1:numel(taskEnvVars) taskEnvVars{i} = strrep(taskEnvVars{i},
'\', '\\'); end outFiles = strrep(outFiles, '\', '\\'); errFiles = strrep(errFiles, '\', '\\'); logFiles = strrep(logFiles, '\', '\\'); condorHeader = [
'Universe =
vanilla\n', ...
'Executable =
condor_exec.bat \n',... %s\n', ...
'Transfer_Executable = true\n', ...
'Requirements = (machine ==
"condor01.integrity-apps.com")\n'
]; taskString =
['Arguments =
%s\n', ...
'Environment =
"%s"\n', ...
...%'input
= matlab_metadata.mat \n', ...
'Error
= %s\n', ...
'Output
= %s\n', ...
'Log
= %s\n', ...
'should_transfer_files = YES \n',...
'transfer_input_files = matlab_metadata.mat, job1.in.mat, job1.common.mat,
job1.state.mat, job1.out.mat \n', ...
'when_to_transfer_output = ON_EXIT \n',...
'notify_user =
jrichardson@xxxxxxxxxxxxxxxxxx \n',...
'Queue\n\n']; filename = tempname; fid = fopen(filename, 'wt'); fprintf(fid, condorHeader, matlabScript); for i = 1:numel(taskEnvVars) % Create a cell-array of all the
environment variables we want to set % for the current task, and transform it
into a string for the Condor % script. envString = createCondorEnvString({jobEnvVars{:},
taskEnvVars{i}{:}}); % Append a clause to the Condor script to
queue the current task. fprintf(fid, taskString, matlabArgs,
envString, errFiles{i}, outFiles{i}, logFiles{i}); end fclose(fid); function envString = createCondorEnvString(envVars) %envStr = createCondorEnvString(envVars) % envVars should be a cell arra of even length.
The even entries are % the environment variables, the odd entries are their
values. % In Condor, environment variables are specified in UNIX as % Environment = var1=val1;var2=val2;...varn=valn % and on Windows, the separator is '|' instead of ';', i.e.
the format is % Environment = var1=val1|var2=val2|...varn=valn if ispc envSep = ' '; else envSep = ';'; end envString = ''; for i = 1:2:numel(envVars) envString = [envString, envVars{i}, '=',
envVars{i + 1}, envSep]; end Now is my executable that is being passed
through condor. There might be a problem here. It calls worker.bat that starts
matlab.bat which starts matlab. My argument is worker.bat @echo off IF EXIST "C:\MATLAB_DISTRIBUTED\R2007a\bin\" (SET
DRIVELETTER=C) SET DL=%DRIVELETTER%:\MATLAB_DISTRIBUTED\R2007a\bin\ IF EXIST "G:\MATLAB_DISTRIBUTED\R2007a\bin\" (SET
DRIVELETTER=G) SET DL=%DRIVELETTER%:\MATLAB_DISTRIBUTED\R2007a\bin\ echo %DL% echo %DRIVELETTER% "is the DRIVELETTER " SET TEMP=%_CONDOR_SCRATCH_DIR% SET %DL%%1 SET Any suggestions will be greatly
appreciated. Currently, I receive a message saying that the job is completed
while tasks are still running and I get a result of an empty 4 x 0 array Josh Richardson Integrity Applications Incorporated Intern (IAI) 703-378-8672 ext 632 |