Hi Janice, since you played an instrumental part in the 'remote reboot system' for the cluster, I thought you would like to hear that it appears to be working.

jerry

--
Dr. Gerard P. Gilfoyle
Physics Department                 e-mail: ggilfoyl@richmond.edu
University of Richmond, VA 23173   phone:  804-289-8255
USA                                fax:    804-289-8482
--- Begin Message ---
- To: Mike Vineyard <vineyarm@union.edu>, Luminita Tudor <luminita@jlab.org>, steven james <pyro@linuxlabs.com>, Markus Geiger <markus@linuxlabs.com>, Francisco Chinchilla <fchinchi@richmond.edu>
- Subject: status of the Richmond cluster and thanks to Steven James
- From: gilfoyle <ggilfoyl@richmond.edu>
- Date: Thu, 05 Dec 2002 14:18:39 -0500
Status of the Richmond cluster: It works!

Yesterday I did the first full analysis using the entire cluster. It involved processing 148 data runs consisting of 1440 files; this is the E5, 4.232 GeV data set. The run took about 12 hours and processed 71790968 events. Of those, 10384406 were ep events and 911913 were ep(n) events. I don't know how this compares with the performance of the JLab farm, but if anyone knows, please tell me.

I want to thank Steven James for all his help in upgrading the cluster and solving the long string of problems we encountered in getting to this point.

I have attached copies of the two perl scripts that I used to perform this analysis run. They are heavily commented, so I won't bore you with the details here. If you see ways to improve them, let me know.

One important point: with this many experimental runs to process, I had to build into my script the ability to wait until a slave node becomes available. In our previous work we did not pay attention to this, and as a result I ended up filling the /var area with stuff, which caused the remaining jobs to fail or never start at all. Be aware of this 'feature' if your jobs mysteriously disappear.
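To give a feel for that wait-until-a-node-is-free logic, here is a stripped-down sketch of the idea. This is not submit_eod3d.pl itself; the node names, the ssh/uptime load check, the one-minute polling interval, and the runner invocation are all stand-ins for illustration.

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical slave node names; the real cluster's names will differ.
my @nodes = map { "node$_" } 1 .. 8;

# Return the first node whose 1-minute load average is below 1.0,
# or nothing if every node is busy. The ssh/uptime probe is just one
# way to ask "is this slave free?".
sub free_node {
    for my $node (@nodes) {
        my $uptime = `ssh $node uptime 2>/dev/null`;
        next unless defined $uptime and $uptime =~ /load average.?:\s*([\d.]+)/;
        return $node if $1 < 1.0;
    }
    return;
}

# One job per data file, passed on the command line.
for my $job (@ARGV) {
    my $node;
    # Poll until a slave frees up instead of dispatching blindly.
    until ( $node = free_node() ) {
        sleep 60;
    }
    print "dispatching $job to $node\n";
    # Assumed invocation; the real run_root_on_node3.pl may take
    # different arguments.
    system("ssh $node run_root_on_node3.pl $job &");
}

The point is simply to poll before dispatching, so jobs do not pile up on busy slaves and fill /var the way they did in our earlier runs.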
There are undoubtedly other bugs, problems, etc. to solve. Please start doing your analyses so that we can find and fix them.

The cluster run I did yesterday is NOT the full 4.232 GeV data set, but it is a large fraction of it. There are bunches of data files that I have yet to move over, since they were pulled off the silo after I first moved this data set to the cluster.

I will be generating a webpage in the next few weeks to hold documentation, notes, advice, etc. about using the cluster.

later,

jerry

--
Dr. Gerard P. Gilfoyle
Physics Department                 e-mail: ggilfoyl@richmond.edu
University of Richmond, VA 23173   phone:  804-289-8255
USA                                fax:    804-289-8482

Attachment: submit_eod3d.pl
Description: Perl program

Attachment: run_root_on_node3.pl
Description: Perl program
--- End Message ---