Re: even more questions about the Richmond cluster



hi steven,

   thanks for the quick response. a couple more questions. i cc'ed mike
vineyard so he hears the latest.

1. can we just turn off sending mail to the user? it would be painful to
get 160 emails every time i do an analysis run. (see the sketch after
this list.)

2. i have heard that sendmail has some security problems. is that true?
otherwise, beyond point #1, i have no great problem running it.

3. i've already symlinked /var/spool/at/spool to /data3/spool so it has
plenty of space. it seems like an easy fix to do the same with
/var/spool/mqueue (also covered in the sketch after this list).

4. running one job per compute node is fine.
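
for #1 and #3, here's roughly what i have in mind, following your
suggestion below about redirecting output to a scratch file. this is only
a rough sketch: the /data3 paths, the log directory, and the analyze_run
command are placeholders for whatever we actually end up using.

  # move the sendmail queue onto the RAID volume
  # (stop sendmail first if it happens to be running)
  /etc/init.d/sendmail stop
  mkdir -p /data3/spool/mqueue
  chmod 700 /data3/spool/mqueue      # sendmail wants the queue directory private
  mv /var/spool/mqueue/* /data3/spool/mqueue/ 2>/dev/null
  rmdir /var/spool/mqueue
  ln -s /data3/spool/mqueue /var/spool/mqueue

  # in the submission script, send all job output to a scratch log
  # instead of stdout/stderr, so 'at'/'batch' has nothing to mail back
  echo './analyze_run > /data3/logs/run.$$.log 2>&1' | batch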

i've also noticed a couple of other things. using beomap to allocate nodes
creates the following 'problem'. by compiling root we gained a very large
factor (83) in speed. this means that many of the runs get analyzed in a
few minutes, freeing up that node. with the long sleep time i'm using
between job submissions, the low-number slave nodes are usually done
before we get to the high-number nodes. on the analysis run this morning
i did not use any node above 14. this also means there is lots of data
dumped on the first few nodes (about 11 GBytes in several cases), which
risks filling up the slave's disk (which holds 18 GBytes). perhaps
picking the slave nodes 'by hand', as i do in one of my scripts, is a
better way to distribute the jobs?
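
the by-hand loop i'm playing with looks roughly like the sketch below. it
is only an illustration: runlist.txt, the 60-second pause, the node count,
and the analyze_run wrapper are all placeholders, and i'm using bpsh to
run directly on a chosen slave rather than going through the batch queue,
which may or may not be what we want.

  # spread the runs over the slaves round-robin, so the output
  # (~11 GB in some cases) doesn't pile up on the low-numbered nodes
  NNODES=16                          # however many slaves we have
  i=0
  for run in `cat runlist.txt`; do
      node=`expr $i % $NNODES`
      echo "run $run -> node $node"
      bpsh $node ./analyze_run $run &
      i=`expr $i + 1`
      sleep 60                       # keep a pause between submissions
  done
  wait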

Steven James wrote:
> 
> Greetings,
> 
> Just for variety, I'll answer the questions from easy to hard order :-)
> 
> /var/spool/mqueue filling:
> /var/spool/mqueue is where outgoing mail goes for sendmail to process (even if
> it's going from and to a local user).
> 
> Currently, sendmail isn't running on pscm1, so the mail piles up in mqueue
> rather than being delivered to user mailbox files in /var/spool/mail.
> 
> /var/spool/at is where the batch system holds job output until completion. Upon
> completion, it will mail the output to the job's owner.
> 
> Probably, sendmail should be running on the master.
> 
> To avoid overfilling /var, /var/spool can either be symlinked to a larger volume
> (/usr or to one of the RAID volumes) or have your batch scripts redirect the
> output to a scratch file and then move (or bpcp) it to the user's home directory
> when the run is complete.
> 
> I am currently tracking down a bug in the batch queue that pushes jobs off to
> the b queue and leaves them there when they can't be immediately dispatched. I
> believe there is a workaround for that, I'll dig that up and send it to you. The
> ultimate solution will be an update to the at rpm.
> 
> Changing the loadaverage threshold won't do anything here. The bproc mods to the
> batch queue system use a 'slot allocation' rather than loadavg. The slot
> allocator assumes that the best allocation is 1 job per CPU on a compute node.
> 
> That is done for three reasons. First, that's generally a better allocation for
> any scientific computation, while loadavg is better suited for general-purpose
> servers. Second, loadavg is more expensive to compute in that it generates
> extra net traffic, disrupts the cache, and takes extra locks on the compute
> nodes (which hurts performance). Third, compute jobs often have bursts of I/O
> where the job sleeps waiting, followed by long periods of pure computation. That
> can cause a brief dip in apparent loadavg and lead to bad allocations.
> 
> G'day,
> sjames
> 
> Quoting gilfoyle <ggilfoyl@mindspring.com>:
> 
> > Hi Steven,
> >
> >    Happy New Year and yet another question about the Richmond cluster.
> > I have been experimenting with different ways of running the cluster
> > and I have run into a problem with the batch system. I'm submitting
> > jobs in two different ways; one uses the beomap command to allocate
> > slave nodes and the other just picks the slave nodes `by hand'. I'm
> > using this second method because the limiting factor now is the
> > ability to transfer the data files to the slave nodes. I was thinking
> > that I could transfer the data on the first pass and leave it there
> > for later passes to speed things up. The problem now is that after
> > many jobs are submitted (from 60-100 or so) the remaining jobs get
> > sent to the `b' batch queue and never run. This has happened even when
> > the /var/spool area is not full. My thoughts are the following.
> >
> > 1. Can we reset the average cpu load with the 'atd -l' command? I've
> > tried this and it seems to have little effect.
> >
> > 2. Can we restart the jobs in the queue? Now they just sit there and
> > never get started.
> >
> > 3. In some of the recent analysis runs, the /var/spool/mqueue area has
> > filled up and hung things up. Before it was the /var/spool/at or
> > /var/spool/mail areas. Do you have any idea what would cause that?
> > Should we make a link from /var/spool/mqueue to one of the RAID disks so
> > there is plenty of space?
> >
> > Let me know what you think.
> >
> > Thanks-in-advance,
> >
> > jerry
> >
> > --
> > Dr. Gerard P. Gilfoyle
> > Physics Department                e-mail: ggilfoyl@richmond.edu
> > University of Richmond, VA 23173  phone:  804-289-8255
> > USA                               fax:    804-289-8482
> >
> 
> ----------------------------steven james, director of research, linux labs
> 
> LinuxBIOS Cluster Solutions                   230 peachtree st nw ste 2705
> 
> High-Speed Colocation, Hosting,                        atlanta.ga.us 30303
> 
> Linux Hardware, Development & Support             http://www.linuxlabs.com
> 
> * Visit us at SuperComputing 2002, Booth 1441 *   office/fax 404.577.7747/3
> 
> --------------------------------------------------------------------------

-- 
Dr. Gerard P. Gilfoyle
Physics Department                e-mail: ggilfoyl@richmond.edu
University of Richmond, VA 23173  phone:  804-289-8255
USA                               fax:    804-289-8482