Parallel job error on Neeshub

This forum is for issues related to parallel processing
and OpenSees using the new interpreters OpenSeesSP and OpenSeesMP

Moderator: selimgunay

ozgura
Posts: 36
Joined: Mon Apr 19, 2010 9:46 am
Location: Virginia Tech

Post by ozgura » Fri Mar 16, 2012 7:59 am

Hello,

I submitted a parallel job (8 processors) using the OpenSees NEES resources. The analysis started and ran well for about 15 minutes, then my job stopped. In the *.stderr file, I found this:

mpirun noticed that process rank 1 with PID 663 on node NEEShub exited on signal 24 (CPU time limit exceeded).

I believe this occurs because of the wall time limit on my job request. I can't control it, though, since we don't submit a qsub file.
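
For reference, the signal number in the mpirun message can be looked up on any Unix machine. On most Linux systems signal 24 is SIGXCPU, which is delivered when a process exceeds its CPU-time resource limit (RLIMIT_CPU), so the limit being hit may be a per-process CPU-time limit rather than the scheduler's wall time. Below is a minimal Python 3 sketch of that check. The helper names signal_name and describe_cpu_limit are made up for illustration and are not part of OpenSees or NEEShub.

```python
import resource
import signal


def signal_name(number):
    """Map a signal number to its symbolic name on this machine."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"unknown signal {number}"


def describe_cpu_limit():
    """Return the soft and hard CPU-time limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)

    def fmt(value):
        # RLIM_INFINITY means no limit is set for this process.
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} s"

    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    # 24 is the number reported in the *.stderr file above.
    print("signal 24 is", signal_name(24))
    soft, hard = describe_cpu_limit()
    print("CPU-time limit: soft =", soft, ", hard =", hard)
```

Note that describe_cpu_limit() only reports the limits of the shell it runs in, which may differ from the limits applied to the batch job itself.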

Please let me know about this problem.
Thanks,
Ozgur Atlayan
Virginia Tech

fmk
Site Admin
Posts: 5883
Joined: Fri Jun 11, 2004 2:33 pm
Location: UC Berkeley

Re: Parallel job error on Neeshub

Post by fmk » Tue Mar 20, 2012 4:42 pm

The default wall time limit is set to 4 hours. Can you check whether it works with the hansen resource option?

bmobashe
Posts: 11
Joined: Wed Aug 10, 2011 4:25 pm

Re: Parallel job error on Neeshub

Post by bmobashe » Thu Jun 14, 2012 11:16 am

Hi Frank,
I have the same problem. I switched to the hansen resource option, but my status is always "Submitted" and the job never starts running. Do you have any suggestions?
I am using OpenSeesSP, so I just changed the System command to System Mumps.

Thank you so much for your help,
Bahareh

