
[Fwd: status of the Richmond cluster and thanks to Steven James]



hi janice,

   since you were instrumental in the 'remote reboot system' 
for the cluster, i thought you would like to hear that it 
appears to be working.

jerry

-- 
Dr. Gerard P. Gilfoyle
Physics Department                e-mail: ggilfoyl@richmond.edu
University of Richmond, VA 23173  phone:  804-289-8255
USA                               fax:    804-289-8482
--- Begin Message ---
Status of the Richmond cluster:

It works! Yesterday I did the first full analysis using the entire 
cluster. It involved processing 148 data runs consisting of 1440 files. 
This is the E5, 4.232 GeV data set. It took about 12 hours to complete 
and processed 71790968 events. Of those events, 10384406 were ep 
events and 911913 were ep(n) events. I don't know how this compares 
with the performance of the JLab farm, but if anyone knows, please 
tell me. I want to thank Steven James for all his help in upgrading 
the cluster and solving the long string of problems we encountered in 
getting to this point.
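
For anyone who wants to compare against another farm, the numbers above can be turned into rough rates. The sketch below is only arithmetic on the figures quoted in this message; it treats "about 12 hours" as exactly 12, which is an assumption, so the results are approximate (on the order of 1,660 events per second, with ep events about 14.5% and ep(n) events about 1.3% of the total).

```python
# Figures quoted in the message above.
TOTAL_EVENTS = 71790968
EP_EVENTS = 10384406
EPN_EVENTS = 911913
RUNS = 148
FILES = 1440
WALL_HOURS = 12.0  # assumption: "about 12 hours" taken as exactly 12


def events_per_second(events, hours):
    """Average processing rate over the whole run."""
    return events / (hours * 3600.0)


def fraction(part, whole):
    """Share of `whole` made up by `part`."""
    return part / whole


if __name__ == "__main__":
    rate = events_per_second(TOTAL_EVENTS, WALL_HOURS)
    print(f"rate:          {rate:.0f} events/s")
    print(f"ep fraction:   {fraction(EP_EVENTS, TOTAL_EVENTS):.2%}")
    print(f"ep(n) fraction: {fraction(EPN_EVENTS, TOTAL_EVENTS):.2%}")
    print(f"files per run: {FILES / RUNS:.1f}")
```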

I have attached copies of the two perl scripts that I used to perform 
this analysis run. They are heavily commented so I won't bore you with 
the details here. If you see ways to improve them, let me know. One 
important point: with this many experimental runs to process, I had 
to build into my script the ability to wait until a slave node 
becomes available. Our previous work did not account for this. As a 
result, I ended up filling the /var area, which caused the remaining 
jobs to fail or never start. Be aware of this 'feature' if your jobs 
mysteriously disappear.
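
The idea of waiting for a free slave node can be sketched as follows. This is not the attached submit_eod3d.pl; it is a minimal Python illustration under stated assumptions. The names run_all, run_job, and the node and job labels are hypothetical, and run_job stands in for whatever actually launches the analysis on a node.

```python
import queue
import threading
import time


def run_all(jobs, nodes, run_job):
    """Run every job while never using more nodes than are listed.

    `run_job(job, node)` is a hypothetical callable that does the work.
    """
    free_nodes = queue.Queue()
    for node in nodes:
        free_nodes.put(node)

    results = {}
    lock = threading.Lock()
    workers = []

    def worker(job, node):
        try:
            outcome = run_job(job, node)
        except Exception as exc:  # record the failure instead of losing it
            outcome = exc
        with lock:
            results[job] = (node, outcome)
        free_nodes.put(node)  # hand the node back for the next job

    for job in jobs:
        node = free_nodes.get()  # blocks here until some node is idle
        t = threading.Thread(target=worker, args=(job, node))
        t.start()
        workers.append(t)

    for t in workers:
        t.join()
    return results


if __name__ == "__main__":
    def fake_job(job, node):
        time.sleep(0.01)  # stand-in for the real analysis
        return "done"

    finished = run_all([f"run{i}" for i in range(10)],
                       ["node1", "node2", "node3"], fake_job)
    print(len(finished), "jobs finished")
```

Because the submitting loop blocks on free_nodes.get(), at most one job per node is ever outstanding, so unstarted work does not pile up anywhere.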

There are undoubtedly other bugs, problems, etc. to solve. Please start 
doing your analyses so that we can find and fix those problems.

The cluster run I did yesterday is NOT the full 4.232 GeV data set,
but it is a large fraction of it. There are bunches of data files that 
I have yet to move over because they were pulled off the silo after I
first moved this data set to the cluster.

I will be generating a webpage in the next few weeks to keep 
documentation, notes, advice, etc. about using the cluster.


later, 

jerry

-- 
Dr. Gerard P. Gilfoyle
Physics Department                e-mail: ggilfoyl@richmond.edu
University of Richmond, VA 23173  phone:  804-289-8255
USA                               fax:    804-289-8482

Attachment: submit_eod3d.pl
Description: Perl program

Attachment: run_root_on_node3.pl
Description: Perl program


--- End Message ---