Author Topic: rdgrad crashes in parallel  (Read 5256 times)

jan

rdgrad crashes in parallel
« on: September 01, 2009, 10:52:00 am »
Sorry for posting a second question in a row, but I have encountered serious problems running a geometry optimization in parallel mode.
I think the problem is with the rdgrad module, because when I start a calculation in parallel it crashes with the following message in slave1.output:

  this is node-proc. number 1 running on node
 bx10                                                                           
 
  the total number of node-proc. spawned is            7
  parallel platform: SMP or SMP like systems
  call pa_barr(90)

 data group $actual step is not empty
 due to the abend of ridft


 check reason for abend ...

 use the command  'actual -r'  to get rid of that


  CONTRL dead = actual step
 rdgrad ended abnormally


The calculation starts just fine and the first ridft step is performed in parallel without error (I checked the running tasks on the node just after submitting the job),
but then it crashes with the message above. Strangely, I have other calculations running with 3 or 4 CPU cores that seem to run just fine. In the case above
I used 7 CPU cores.
Does anyone know where my problem is?

Regards Jan

jan

Re: rdgrad crashes in parallel
« Reply #1 on: September 03, 2009, 09:45:53 am »
In addition:
I tested the same job as above with 4 cores (3 + 1 server task) and there was no error. But when I select more than 4 cores, it always crashes
with the message above.
Regards Jan