Channel: Intel® Software - Intel® Fortran Compiler for Linux* and macOS*

Distributed Coarray Fortran: misunderstanding/bug?


Dear all,

Below is a Coarray Fortran program that is giving me trouble:

A large vector (x) is updated in two different ways. For large sizes of x, the updated values of x are wrong when each image runs on a different node. When all images run on the same node, the results are always correct, whatever the size of x.

When size(x)=10^6, exchanging the full array between images on different nodes gives wrong results, whereas exchanging small subsets of x gives correct results.

When size(x)>2*10^7, exchanging the full array between images on different nodes gives wrong results, and exchanging subsets of x (size(subset) > 6*10^6) gives wrong results too.

The problem seems to depend on the size of the array section that is exchanged between images on different nodes. Am I doing something wrong, or could this be a bug?

I use ifort 17.0.0 with -coarray=distributed.

Here is a program that mimics the problem (it may look naive, with too many sync all statements and so on, but its only purpose is to reproduce my issue):

program testcoarray
 implicit none
 integer(kind=4)::i,j,k,neq
 integer(kind=4)::startrow[*],endrow[*]
 real(kind=8)::val[*]
 real(kind=8),allocatable::x(:)[:]

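 neq=1000000    ! small test case
 neq=25806732   ! large test case (overrides the line above; comment it out to run the small case)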

 if(this_image().eq.1)then
  write(*,'(/a,i0)')' Size of the array: ',neq
  write(*,'(a,i0/)')' Number of images : ',num_images()
 endif

 !INITIALISATION
 i=neq/num_images()
 startrow=(this_image()-1)*i+1
 endrow=this_image()*i
 if(this_image().eq.num_images())endrow=neq

 allocate(x(neq)[*])

 sync all

 !FIRST UPDATE
 x=0.d0
 x(startrow:endrow)=real(this_image(),8)

 sync all

 if(this_image().eq.1)then
  do i=2,num_images()
   x=x+x(:)[i]
  enddo
  write(*,*)' First update : ',sum(x)
 endif

 sync all

 !SECOND UPDATE
 x=0.d0
 x(startrow:endrow)=real(this_image(),8)

 sync all

 if(this_image().eq.1)then
  do i=2,num_images()
   j=startrow[i]
   k=endrow[i]
   x(j:k)=x(j:k)+x(j:k)[i]
  enddo
  write(*,*)' Second update: ',sum(x)
 endif

 sync all

 !CORRECT ANSWER
 x=0.d0
 x(startrow:endrow)=real(this_image(),8)
 val=sum(x)

 sync all

 if(this_image().eq.1)then
  do i=2,num_images()
   val=val+val[i]
  enddo
  write(*,*)' Correct value: ',val
 endif

 sync all

end program
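
Since small transfers come back correct while large ones do not, one possible diagnostic is to fetch the remote array in fixed-size pieces and see whether the first update then matches the correct value. The module below is only a sketch of that idea and has not been run on the affected cluster; the names chunked_fetch, num_chunks, add_remote_in_chunks and the argument chunk are hypothetical and do not appear in the program above.

```fortran
module chunked_fetch
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  private
  public :: num_chunks, add_remote_in_chunks
contains

  ! Number of pieces needed to cover n elements with at most chunk elements each.
  pure integer function num_chunks(n, chunk)
    integer, intent(in) :: n, chunk
    num_chunks = (n + chunk - 1) / chunk
  end function num_chunks

  ! Add image img's copy of x to the local x, reading at most chunk
  ! elements per remote access. Meant to be called by a single image,
  ! between two "sync all" statements, like the loops in testcoarray.
  subroutine add_remote_in_chunks(x, img, chunk)
    real(real64), intent(inout) :: x(:)[*]
    integer, intent(in) :: img, chunk
    integer :: c, lo, hi
    do c = 1, num_chunks(size(x), chunk)
      lo = (c - 1) * chunk + 1
      hi = min(c * chunk, size(x))
      x(lo:hi) = x(lo:hi) + x(lo:hi)[img]
    end do
  end subroutine add_remote_in_chunks

end module chunked_fetch
```

With "use chunked_fetch" added to testcoarray, the statement x=x+x(:)[i] in the first update would become "call add_remote_in_chunks(x, i, 100000)". The chunk size 100000 is an arbitrary choice, picked to stay below the 250000-element slices that were exchanged correctly in the second update for neq=1000000.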

 

Here is the output of the testcoarray program for neq=1000000:

*With all images on the same node:

 Size of the array: 1000000
 Number of images : 4

  First update :    2500000.00000000     
  Second update:    2500000.00000000     
  Correct value:    2500000.00000000

*With each image on a different node:

 Size of the array: 1000000
 Number of images : 4

  First update :    750000.000000000     
  Second update:    2500000.00000000     
  Correct value:    2500000.00000000 

Here is the output of the testcoarray program for neq=25806732:

*With all images on the same node:

 Size of the array: 25806732
 Number of images : 4

  First update :    64516830.0000000     
  Second update:    64516830.0000000     
  Correct value:    64516830.0000000 

*With each image on a different node:

 Size of the array: 25806732
 Number of images : 4

  First update :    19355049.0000000     
  Second update:    6451727.00000000     
  Correct value:    64516830.0000000     
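
The "Correct value" lines can be cross-checked without any coarray communication: each image p fills its own block of x with the value p, so the total is the sum over p of p times the block length. The function below is a small sketch of that arithmetic, using the same block partition as testcoarray; the names expected_value and expected_sum are hypothetical helpers, not part of the program.

```fortran
module expected_value
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
contains

  ! Sum that image 1 should print, for neq elements split over nimg images
  ! with the same startrow/endrow rule as testcoarray.
  pure function expected_sum(neq, nimg) result(total)
    integer, intent(in) :: neq, nimg
    real(real64) :: total
    integer :: p, blk, first, last
    blk = neq / nimg
    total = 0.0_real64
    do p = 1, nimg
      first = (p - 1) * blk + 1
      last = p * blk
      if (p == nimg) last = neq   ! the last image takes the remainder
      total = total + real(p, real64) * real(last - first + 1, real64)
    end do
  end function expected_sum

end module expected_value
```

For 4 images this gives 2500000 for neq=1000000 and 64516830 for neq=25806732, in agreement with the "Correct value" lines. The block length is 250000 and 6451683 respectively, so the wrong first-update values (750000 and 19355049) are both exactly 3 block lengths, and the wrong second-update value (6451727) is one block length plus 44.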

 

Thank you in advance for your help.

 

Jeremie

 

 

 

 

 

