Discussion of environment variables on the Apple-Xgrid mailing list
I am hoping someone can point me in the right direction. Recently I set up a cluster of 6 XServe nodes. I am trying to perform a series of Monte Carlo simulations on the cluster, submitting the jobs via Xgrid (and am rather new at using Xgrid). The code requires certain user-defined environment variables to be set at run time. I set these manually in /etc/profile and /etc/bashrc on each node. The executable I am trying to run was compiled with g++ 3.3 and takes a series of values at the command line as input. Every time I submit a job, the code throws an exception saying that an environment variable is not set. I am at a loss as to what to do.
 Recently I began using GridStuffer for job submission. In an attempt to bypass the environment variable problem I wrote a simple shell script which first sets the variable and then calls the program with the command line input. Here it is:
 ----------------------
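#!/bin/sh
# Assign to the variable name itself: no leading "$" on the left-hand side.
export G4LEDATA=/usr/local/Geant4.7.0/DataFiles/G4EMLOW2.3
# Print the value so it shows up in the job's stdout.
echo "$G4LEDATA"
# Run the simulation from the job's working directory.
./proton true true false ICRU-49p false false monoenergetic pencil 400 400.0 154.0 150.0 0.0 30.0 20.0 water water 1.0 1.0 1000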
 ----------------------
 The name of the script is then listed in the first and only line of a text file that I use as input to GridStuffer. I do indeed get the value of $G4LEDATA sent to stdout, but the program will not run beyond a certain point. There is nothing written to stderr. If I don't set $G4LEDATA, the exception I mentioned above is sent to stderr.
 Another thing I tried was to copy the executable (proton) to each node, all in the same directory. Then in the script I have /directory/proton true true ..... instead of ./proton true true.... However, as far as I can tell the program never executes.
 My apologies for being rather verbose, but I am really stuck at this point. I openly acknowledge my lack of knowledge of Xgrid and GridStuffer, and think that the problem lies in my not fully understanding how either really works. I have completely turned off all password authentication between the controller and agents, since the cluster is not online (completely stand-alone). A couple of questions:
 (1) How does the controller log into the agent to transfer files? I am assuming it is as a generic user. Shouldn't the user have full access to environment variables defined in /etc/profile?
 (2) With GridStuffer, is there a better way than what I am doing to submit the job? For instance, using the directives -dirs and -files to force certain files to be copied to the agent?
 Any help will be greatly appreciated.
 P.S. Charles, I hope you see this because your help would be very beneficial.
 Thanks,
 Dan
 --------
 Dr. Dan J. Fry
 Physicist
 Henry M. Jackson Foundation For The Advancement of Military Medicine
 Walter Reed Army Medical Center
 6900 Georgia Avenue, NW
 Washington, DC 20307
I did some testing with environment variables to make sure, and it seems quite certain that Xgrid will not load them, or more precisely the shell won't, even if it is explicitly called using /bin/sh somewhere in the text. This is not too surprising, since /etc/profile is only read by login shells, and a shell started non-interactively to run a job is not one.
The shell script approach you propose should work better for this purpose. If you really need to set up the environment from information specific to each agent, you could alternatively read /etc/profile manually in your script to load the environment variables defined there (I am not sure _exactly_ how to do that).
Now, it seems your program still won't run "beyond a certain point" within the script. What exactly happens then? Does your program only need $G4LEDATA, or is another environment variable missing? Look for messages in the Console on the agent. In any case, I would definitely go ahead with the script-wrapping approach and then iron out the remaining problems, which may turn out to be unrelated.
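To narrow down where the program stops, the wrapper script can be made to leave more of a trace. The sketch below is an illustration, not a tested Xgrid recipe: the `run_and_report` helper is a hypothetical name, and the data path and command line are simply the ones from the original script.

```shell
#!/bin/sh
# Set the variable by name (no leading "$") and export it to child processes.
G4LEDATA=/usr/local/Geant4.7.0/DataFiles/G4EMLOW2.3
export G4LEDATA

# Hypothetical helper: run a command, then report its exit status on stderr
# so that a silent failure still shows up in the job's output.
run_and_report() {
    status=0
    "$@" || status=$?
    echo "'$1' exited with status $status on $(uname -n)" >&2
    return "$status"
}

# Warn early if the data directory is missing on this agent.
[ -d "$G4LEDATA" ] || echo "warning: $G4LEDATA not found on $(uname -n)" >&2

if [ -x ./proton ]; then
    run_and_report ./proton true true false ICRU-49p false false \
        monoenergetic pencil 400 400.0 154.0 150.0 0.0 30.0 20.0 \
        water water 1.0 1.0 1000
else
    echo "error: ./proton is not executable in $(pwd)" >&2
fi
```

An exit status above 128 conventionally means the program was terminated by a signal, which would point to a crash rather than a missing variable.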
Regarding the GridStuffer format, you only need -files to explicitly force the addition of a file to the job (and you probably don't need -dirs). If 'proton' is in the same path as the input file, and if you don't need any other files to run the program, then you are fine: no -files is needed, and GridStuffer will figure it out. If you can have everything set up on the agent, even better; in that case use only full paths for the program and files in the job submission.
Finally, the Xgrid agent will usually run as user 'nobody' (unless you are using Kerberos authentication or you manually start the agent as a different user).
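To check which user and environment a job actually gets on an agent, a small diagnostic script can be submitted in place of the real program. This is a sketch using only standard POSIX utilities; the `report_context` name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the identity and environment the job sees.
report_context() {
    echo "user=$(id -un)"
    echo "host=$(uname -n)"
    echo "cwd=$(pwd)"
    echo "--- environment ---"
    env | sort
}

report_context
```

If G4LEDATA is absent from the environment listing in the job's stdout, the variable has to be set in the wrapper script itself.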
hope that helps,
charles

