race conditions

This forum is for issues related to parallel processing
and OpenSees using the new interpreters OpenSeesSP and OpenSeesMP

Moderator: selimgunay

brag006
Posts: 173
Joined: Wed Feb 15, 2012 1:26 pm
Location: University of Auckland

race conditions

Post by brag006 » Mon Mar 03, 2014 2:01 pm

I am running a medium-sized model through approximately 100 earthquake motions, using OpenSeesMP to run multiple processes. I used code similar to the example provided with the source code. After many days of random memory-related errors (glibc errors), it appears I have a race condition. I read that a mutex can be used to lock a section of code while it is being executed. Can you please help me with how to implement that in OpenSees?

fmk
Site Admin
Posts: 5883
Joined: Fri Jun 11, 2004 2:33 pm
Location: UC Berkeley

Re: race conditions

Post by fmk » Mon Mar 03, 2014 4:43 pm

If you are having memory problems that appear only after many analyses have completed successfully, it could be due to memory leaks in the program. If so, a better solution might be to exec OpenSees jobs so that each simulation runs in its own process space.
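As a rough illustration of the one-process-per-simulation idea, here is a minimal Python driver sketch. It is not taken from OpenSees: the `MOTION_ID` environment variable and the `run_job` / `run_all` helpers are hypothetical names chosen for the example, and it assumes your Tcl script picks up the motion index from the environment (for example via `$env(MOTION_ID)`).

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_job(job):
    """Run one simulation as its own OS process and return (motion_id, exit_code)."""
    motion_id, cmd = job
    # Hypothetical convention: the script reads its motion index from MOTION_ID.
    env = dict(os.environ, MOTION_ID=str(motion_id))
    # Each job gets a fresh process, so memory leaked during one analysis
    # is returned to the operating system when that process exits.
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    return motion_id, result.returncode


def run_all(jobs, max_workers=4):
    """Run all jobs, at most max_workers at a time; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, jobs))
```

A call might look like `run_all([(i, ["OpenSees", "runMotion.tcl"]) for i in range(100)])`, where the executable and script names are placeholders for your own. A non-zero exit code then points at the motion that failed without taking the other runs down with it.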

If it is indeed a race condition and you want to use a mutex, the only way to use mutexes with OpenSees is through the file system, relying on the fact that it will only allow one process to open a file for writing at any one time. Write whatever is needed to the file, or do whatever needs the mutex's protection, while the process has the file open.
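A minimal sketch of a file-system mutex, written in Python for illustration rather than as OpenSees code. Note that it swaps in a slightly different technique from the one described above: instead of relying on exclusive open-for-write, it uses atomic exclusive creation of a lock file (`O_CREAT | O_EXCL`), which fails if the file already exists. The `file_mutex` name and its parameters are hypothetical.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def file_mutex(lock_path, poll_interval=0.05, timeout=None):
    """Hold an exclusive lock file for the duration of a with-block."""
    start = time.monotonic()
    while True:
        try:
            # Exclusive create: raises FileExistsError if another process
            # already holds the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if timeout is not None and time.monotonic() - start > timeout:
                raise TimeoutError("could not acquire " + lock_path)
            time.sleep(poll_interval)  # wait, then try again
    try:
        # Record the owner's process id to help diagnose a stale lock.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # release the lock
```

Usage would be `with file_mutex("results.lock"): ...` around the critical section, with the lock path being whatever shared location all processes can see. Two caveats: a process that crashes while holding the lock leaves a stale file that has to be removed by hand, and exclusive creation may not be reliable on some network file systems.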

brag006
Posts: 173
Joined: Wed Feb 15, 2012 1:26 pm
Location: University of Auckland

Re: race conditions

Post by brag006 » Tue Mar 04, 2014 3:34 pm

I only get memory corruption problems when I use Mumps. I changed to UmfPack and no longer have memory problems. Any thoughts on why that could be? I tried using the -intcl14 option but had no luck.

fmk
Site Admin
Posts: 5883
Joined: Fri Jun 11, 2004 2:33 pm
Location: UC Berkeley

Re: race conditions

Post by fmk » Fri Mar 07, 2014 9:07 am

Sounds like a memory-related bug in Mumps. These are typically nasty, nasty things to track down. I will need the example. Are you running this on a Windows or a Linux machine?
