Link errors trying to build OpenSees on

This forum is for issues related to parallel processing
and OpenSees using the new interpreters OpenSeesSP and OpenSeesMP

Moderator: selimgunay

suhovecky
Posts: 9
Joined: Mon Dec 06, 2010 5:52 am
Location: University of Notre Dame

Link errors trying to build OpenSees on

Post by suhovecky » Tue Jan 04, 2011 12:49 pm

Hi-

I am trying to build OpenSees with MPI for use on a Linux system with SGE parallel job submission:

The OS is Red Hat Enterprise Linux 5.

I'm using the Intel Compiler Suite, version 11.1.

Our MPI package is mvapich2 version 1.6.

Tcl & Tk 8.4 libraries are available in /usr/lib64.

I chose the RANGER Makefile.def file, which met most of my criteria, and modified it. The modified version is here:

http://www.nd.edu/~msuhovec/files/Makefile.def

The libraries build, but I get link errors:

http://www.nd.edu/~msuhovec/files/makekerrs.txt

There are two groups of them: the first set wants "Machinebrokers" and "The Channels"; the second (dmavtec, dlsolve)
may come from missing or mismatched numerical libraries.

I am trying to link to a locally built version of the Intel MKL libraries, but I'm willing to give that up for now if I can get it built.

Any help is appreciated.

Thanks,

Mark
Mark Suhovecky
suhovecky@nd.edu

Re: Link errors trying to build OpenSees on

Post by suhovecky » Wed Jan 05, 2011 11:41 am

OK, my first set of errors all come from
these lines in SRC/tcl/command.cpp:

#ifdef _PARALLEL_INTERPRETERS
#include <MachineBroker.h>
extern MachineBroker *theMachineBroker;
extern Channel **theChannels;
extern int numChannels;
extern int rank;
extern int np;
#endif

MachineBroker already has member functions that return the channel, rank, and np, so I'm not sure why these are also declared as externs here.

I'd be interested in knowing if anyone else building OpenSees in _PARALLEL_INTERPRETERS mode got past these errors. It looks like one of the posters right before me had the same problem.
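
For anyone hitting the same kind of undefined-reference errors: an extern line only declares a symbol, and the link fails unless exactly one object file that gets linked in also defines it. Below is a minimal sketch of that rule, not the actual OpenSees code. Where OpenSees really defines these globals is not stated in this thread, and describeProcess is a hypothetical helper added only for illustration.

```cpp
// Sketch of the declaration/definition rule behind "undefined reference"
// link errors. Not OpenSees code; names mirror the externs quoted above.
#include <cassert>
#include <string>

// Declarations only: they tell the compiler the symbols exist somewhere.
// With just these two lines, any use of rank or np compiles but fails at
// link time unless some linked object file provides the definitions.
extern int rank;
extern int np;

// Definitions: exactly one translation unit in the final link must supply
// these. ASSUMPTION: in a real build they would live in whichever source
// file sets up the parallel interpreter; they are placed here only so the
// sketch links on its own.
int rank = 0;
int np = 1;

// Hypothetical helper (not part of OpenSees) that uses the globals.
std::string describeProcess() {
    assert(rank >= 0 && rank < np);  // sanity check on the process index
    return "process " + std::to_string(rank) + " of " + std::to_string(np);
}
```

If the two definition lines are removed and describeProcess() is called from main, the compile step still succeeds and the linker reports undefined references to rank and np, which is the same class of error as the first group above. Running nm on the built libraries is one way to check which object file, if any, defines a missing symbol.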

Mark

fmk
Site Admin
Posts: 5866
Joined: Fri Jun 11, 2004 2:33 pm
Location: UC Berkeley

Re: Link errors trying to build OpenSees on

Post by fmk » Wed Jan 05, 2011 2:10 pm

Can you obtain the latest code using svn and try that? I just updated the code and compiled it on a Linux machine with the Intel 10.1 compiler.

Re: Link errors trying to build OpenSees on

Post by suhovecky » Wed Jan 05, 2011 6:47 pm

Frank-

Thanks, I will try that. Do I need to download everything under SRC to pick up all the changes
since the 2.2.2 release? (If this is a silly question, it's because I'm still a CVS guy, and my Subversion
knowledge is pretty minimal :) )

Mark

Re: Link errors trying to build OpenSees on

Post by suhovecky » Thu Jan 06, 2011 1:29 pm

I am able to build an OpenSeesMP executable using revision 4399 of the code. The interpreter, however, generates a floating point exception
when I try executing it.

Re: Link errors trying to build OpenSees on

Post by fmk » Fri Jan 07, 2011 10:06 am

Is it when it starts running the script or beforehand? Put some puts commands in the script to see where.

Re: Link errors trying to build OpenSees on

Post by suhovecky » Tue Jan 11, 2011 1:53 pm

Well, it appears to be MPI-related.

This machine has InfiniBand network cards in it. If I use mvapich2 as my MPI package,
OpenSeesMP fails without ever getting to a shell prompt. (mvapich2 on my system is
configured to use InfiniBand.)

With the latest version of mvapich2 (1.6), all I get is a floating point exception.

If I try it with version 1.4, it sits for 30 seconds, then gives me these errors:

]$ ./OpenSeesMP
dqcneh033.crc.nd.edu.19268ipath_wait_for_device: The /dev/ipath device failed to appear after 30.0 seconds: Connection timed out
dqcneh033.crc.nd.edu.19268PSM Could not find an InfiniPath Unit on device /dev/ipath (30s elapsed) (err=21)
psm_ep_open failed with error PSM Could not find an InfiniPath Unit
Fatal error in MPI_Init: Internal MPI error!, error stack:
MPIR_Init_thread(311): Initialization failed
MPID_Init(191).......: channel initialization failed
(unknown)(): Internal MPI error![

I decided to try building it without InfiniBand and used Open MPI; this works fine. I can get an OpenSees shell prompt, and source in and run
a command.

I need to run down the above error to figure out what's going on.

Mark

Re: Link errors trying to build OpenSees on

Post by suhovecky » Thu Jan 20, 2011 12:48 pm

After quite a bit of hair pulling, it appears that our problems with InfiniBand are in the InfiniBand drivers,
and not the OpenSees software. I'll be able to verify this after we reload the InfiniBand stuff this weekend.

Thanks for all the help,

Mark

Hannek
Posts: 2
Joined: Wed Oct 19, 2011 3:15 am

Re: Link errors trying to build OpenSees on

Post by Hannek » Sun Oct 23, 2011 10:21 pm

suhovecky wrote:
> After quite a bit of hair pulling, it appears that our problems with
> InfiniBand are in the InfiniBand drivers,
> and not the OpenSees software. I'll be able to verify this after we reload
> the InfiniBand stuff this weekend.
>
> Thanks for all the help,
>
> Mark
Hi,
I wonder what the verdict was? It's a rather complicated issue to deal with.

Kevinstan
Posts: 2
Joined: Tue Jan 03, 2012 9:52 pm

Re: Link errors trying to build OpenSees on

Post by Kevinstan » Wed Jan 04, 2012 2:35 am

I am able to build an OpenSeesMP executable using revision 4399 of the code. The interpreter, however, generates a floating point exception. Is it when it starts running the script or beforehand?
