
Re: Richmond cluster status



Greetings,

This is excellent!

The reason cp rather than bpcp matters is that bpcp is unaware of NFS
mounts, so the data was going from the fileserver to the master via NFS,
then from the master to the slave via bpcp, doubling the network traffic
required.
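
To make the difference concrete, here is a minimal Perl sketch of the two
ways of staging a file. It is only an illustration: the helper names, the
node number, and the file name run.root are hypothetical, and it assumes
the slave node mounts /data2 over NFS itself, so that a plain cp run there
via bpsh pulls the file straight from the fileserver.

```perl
use strict;
use warnings;

# Hypothetical helper: copy through the master with bpcp.
# If $src sits on an NFS mount, the data crosses the network twice
# (fileserver -> master, then master -> node).
sub bpcp_command {
    my ($node, $src, $dst) = @_;
    return ("bpcp", $src, "$node:$dst");
}

# Hypothetical helper: run cp on the node itself via bpsh.
# Assumes the node has the same NFS mount, so the data crosses the
# network once (fileserver -> node). Both paths are local to the node.
sub node_cp_command {
    my ($node, $src, $dst) = @_;
    return ("bpsh", $node, "cp", $src, $dst);
}

# Example values are made up for illustration only.
my @cmd = node_cp_command(3,
    "/data2/e5/root/4.232/run.root",
    "/scratch/gilfoyle/e5/1234/run.root");
print join(" ", @cmd), "\n";

# On a real cluster one would then run it, for example:
# system(@cmd) == 0 or die "copy failed: $?";
```

Passing the command to system() as a list rather than one string avoids
shell quoting problems if a file name ever contains odd characters.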

NFS can behave badly on a busy network. The sleep was also intended to
reduce the number of simultaneous transfers, and appears to have helped a
great deal.
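
The staggering idea can be sketched as a small Perl loop. This is an
illustration under assumptions, not the actual script: the function name
submit_staggered and the callback interface are hypothetical, and the
45-second delay and the "batch -f" command are simply the ones quoted
below.

```perl
use strict;
use warnings;

# Hypothetical helper: submit each job, pausing between submissions so
# that the file transfers at the start of each job do not all begin at
# once. $submit is a code reference that does the actual submission.
sub submit_staggered {
    my ($jobs, $delay, $submit) = @_;
    my $count = 0;
    for my $i (0 .. $#$jobs) {
        $submit->($jobs->[$i]);
        $count++;
        # No need to wait after the last job.
        sleep $delay if $delay > 0 && $i < $#$jobs;
    }
    return $count;
}

# Dry run with no delay: only prints what would be submitted.
submit_staggered(["run_job"], 0, sub { print "would submit $_[0]\n" });

# On a real cluster, something like:
# submit_staggered(\@job_files, 45,
#     sub { system("batch", "-f", $_[0]) == 0 or warn "submit failed: $?" });
```

A longer delay lowers the number of overlapping transfers further, at the
cost of a slower ramp-up, so the right value depends on file size and
network load.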

I hope you have a wonderful Thanksgiving.

G'day,
sjames



On Wed, 27 Nov 2002, gilfoyle wrote:

> hi steven,
> 
>    i made the following changes to my scripts.
> 
> 1. i had the script 'sleep' for 45 seconds after submitting a 
> batch job with the following command.
> 
>     system("batch -f run_job");
>     sleep 45;
> 
> 2. i changed the way i copied the data files over from this perl command
> 
>    system("bpcp /data2/e5/root/4.232/$data_filename $NODE:/scratch/gilfoyle/e5/$RUNNO/$data_filename");
> 
> to this one.
> 
>    system("bpsh $NODE cp /data2/e5/root/4.232/$data_filename /scratch/gilfoyle/e5/$RUNNO/$data_filename");
> 
> does this really make a difference??
> 
> i tested things in the following ways.
> 
> 1. i ran my scripts on two data runs for a few events in each run. things
> worked.
> 
> 2. i ran them on 12 data runs for 250,000 events in each run. this is where
> pscm1 hung last week. THIS WORKED! I MAY BE ON THE VERGE OF BEING A HAPPY GUY.
> 
> 3. i ran my scripts on 50 data runs for 250,000 events in each run. this worked!
> 
> 4. i ran my scripts on 120 data runs for all the events in each run. this will
> take many hours, but it looks good so far.
> 
> 5. went to chicago for thanksgiving.
> 
> i'll be in touch next week.
> 
> 
> jerry
> 
> 

-- 
-------------------------steven james, director of research, linux labs
... ........ ..... ....                     230 peachtree st nw ste 701
the original linux labs                             atlanta.ga.us 30303
      -since 1995                              http://www.linuxlabs.com
                                   office 404.577.7747 fax 404.577.7743
-----------------------------------------------------------------------