
Re: thanks and a long question



Greetings,

I am happy I could help.

A long question deserves a long answer, so here goes :-)

You were on the right track by putting the Xlib in with the regular libs.

The issue is that slave nodes receive their library files from the
master's /lib and /usr/lib directories.

That list of directories is a configuration option in /etc/beowulf/config.

Adding /usr/X11R6/lib there and then restarting the cluster should let
the node find the library.
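
As a rough sketch, the change might look like the fragment below. The
keyword name (libraries) is an assumption here, not something confirmed
for this particular install; look for whichever line in the file already
lists /lib and /usr/lib and append the X11 directory to it.

```
# /etc/beowulf/config (fragment; the "libraries" keyword is assumed)
libraries /lib /usr/lib /usr/X11R6/lib
```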

It is normally not included since it's unusual for a cluster app to want
to use X (other than the root process of a parallel visualization
app, that is). 

This is not necessarily a problem, just an unusual situation that needs
configuring. Depending on exactly how root does its thing, you may also
need to set the DISPLAY environment variable explicitly to 192.168.1.1:0.

It may also be necessary to use xhost to permit the node to use the
X server on the master. For your example of node 3, you would want 
xhost +n3 

before running root. (A useful note: the nss libs are patched so that
n<node_number> will correctly resolve to the node's IP address.)
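
Put together, the xhost and DISPLAY steps could be wrapped up roughly as
follows. This is only a sketch in sh syntax: run_root_on_node is a made-up
helper name, and it assumes bpsh passes the caller's environment through
to the node. The node number, display address and script path come from
this thread.

```shell
# Hypothetical helper (the name is illustrative, not part of any tool).
# Usage: run_root_on_node <node_number> <root_script>
run_root_on_node() {
    node=$1
    script=$2

    # On the master: allow the node to connect to the master's X server.
    # n<node_number> resolves to the node's IP address.
    xhost "+n${node}"

    # Run root on the node with DISPLAY pointing back at the master.
    # Assumption: bpsh carries this environment variable over to the node.
    DISPLAY=192.168.1.1:0 bpsh "${node}" root -b -q "${script}"
}
```

For the node 3 example this would be called as

    run_root_on_node 3 /scratch/gilfoyle/e5/24028/run_eod3.C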

If you want the X connection redirected to a workstation somewhere
else, we'll need to set the master up to forward outgoing connections
from compute nodes so the X connection can get through.

G'day,
sjames
 


On Wed, 6 Nov 2002, Gerard P. Gilfoyle wrote:

> Hi Steven,
> 
>    Thanks for all your help on Monday with the upgrade of the Richmond 
> cluster. I have spent yesterday and today getting all our software 
> tools up and running and I have run into a problem. We use a code 
> called root to analyze our physics data both interactively and in 
> batch. It was written at CERN (a large, international particle physics 
> lab in Europe). I can run root on the master (pscm1) in interactive 
> mode and in batch with no problems. However, when I try to run it in 
> batch on a cluster node it can't find a library. The commands and 
> error message are below.
> 
> running root in batch on master: 
> 
>        root -b -q run_eod3.C  <-- this works
> 
> The '-b' means batch and '-q' means the next thing is a file
> containing commands for the data analysis.
> 
> running root in batch on slave node 3: 
> 
>        bpsh 3 root -b -q /scratch/gilfoyle/e5/24028/run_eod3.C
> 
> error message from the previous command:
> 
> root: error while loading shared libraries: libXpm.so.4: cannot open 
> shared object file: No such file or directory
> 
> The library libXpm.so.4 is located in /usr/X11R6/lib/ on pscm1 so 
> presumably this is an environment variable problem. I have tried 
> various fixes, but all have failed. Some of the things I tried are 
> listed below.
> 
> 1. root uses a library whose location is defined by the environment 
> variable LD_LIBRARY_PATH which will point to an area like 
> /usr/root/lib/. I have tried adding /usr/X11R6/lib/ to this path and 
> even putting libXpm.so.4 in with the normal root libraries, but I get 
> the same failure.
> 
> 2. After the upgrade on Monday, we created user directories and 
> accounts in the /home area, but I realized later the disk partition 
> containing /home was too small. I moved the home directories to 
> /usr/home. I speculated that the slave was not finding the correct 
> .cshrc file so I created a temporary /home/gilfoyle
> area, put all the files in there (including the .cshrc file), and 
> tried running root on the slave from that new directory. I get the 
> same error message. 
> 
> Do you have any thoughts on what a solution could be???
> 
> I will also contact the root developers to see if they have run into 
> this problem.
> 
> thanks-in-advance,
> 
> jerry
> 
> 

-- 
-------------------------steven james, director of research, linux labs
... ........ ..... ....                     230 peachtree st nw ste 701
the original linux labs                             atlanta.ga.us 30303
      -since 1995                              http://www.linuxlabs.com
                                   office 404.577.7747 fax 404.577.7743
-----------------------------------------------------------------------