Re: thanks and a long question
hi steven,
the latest.
1. i put the scripts you sent in /usr/lib/beoboot/bin and restarted beowulf.
2. i powered down the slaves 0-2 (actually our admin person did this. i'm
at jefferson lab today and tomorrow.) and brought them back up.
3. slaves 0,2 came up with an error and slave 1 did not come back at all.
i don't know if this is related to the scripts or not. i will be back at
richmond tomorrow evening and will try to bring the slaves back up myself.
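
for reference, a rough sketch of what step 1 amounts to, so it is easy to check whether anything was missed (the chmod in particular). the source directory SRC_DIR and the two function names are hypothetical, only the destination path and the restart command come from this thread, and this is not necessarily exactly what was run:

```shell
#!/bin/sh
# Sketch only. SRC_DIR is a hypothetical place where the attached
# scripts were saved; DEST_DIR is the directory named in this thread.
SRC_DIR="${SRC_DIR:-$HOME/beoboot-scripts}"
DEST_DIR="${DEST_DIR:-/usr/lib/beoboot/bin}"

# Copy every regular file from SRC_DIR into DEST_DIR and mark it executable.
install_scripts() {
    for f in "$SRC_DIR"/*; do
        [ -f "$f" ] || continue
        cp "$f" "$DEST_DIR"/ || return 1
        chmod +x "$DEST_DIR/$(basename "$f")" || return 1
    done
}

# Restart the beowulf service on the master (needs root).
# Note: this crashes any jobs currently running on the cluster.
restart_beowulf() {
    /etc/init.d/beowulf restart
}

# Typical use on the master, as root:
#   install_scripts && restart_beowulf
# then power-cycle the slaves.
```

nothing runs until the two functions are called, so the file can be looked over first.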
you said you would modify the node_up script. is that the one you sent me?
jerry
steven james wrote:
>
> Greetings,
>
> The reboot of the master is not actually necessary. Instead, you can just
> do:
>
> /etc/init.d/beowulf restart
> on the master and reboot the slaves. Note that the restart command will
> crash any running jobs on the cluster (of course, so does rebooting the
> master :-)
>
> For item 4, it may be the size of the library at issue, or it may be
> confused by the number of library paths. I have seen that before (in
> particular w/ the Intel compiler libraries). It may be that I will need to
> modify the node_up script to preload /usr/root/PRO/lib. I will be happy to
> take care of that.
>
> Alternatively, placing the attached scripts into
> /usr/lib/beoboot/bin (make sure to chmod +x the scripts) should cause the
> nodes to preload the needed library and make sure they can find them.
>
> The instructions for running X should not be necessary. I suppose since
> the X libs are linked against, they get loaded even when the command
> options say don't use X.
>
> Hope the evening beer was good (he says over the half-pot sized cup of
> morning coffee).
>
> G'day,
> sjames
>
--
Dr. Gerard P. Gilfoyle
Physics Department e-mail: ggilfoyl@richmond.edu
University of Richmond, VA 23173 phone: 804-289-8255
USA fax: 804-289-8482