Dear all,
currently I am working on the parallelization of a Monte Carlo code for particle transport written in FORTRAN 77, using OpenMP and the Intel Fortran Compiler (version 2017). I have successfully implemented the parallelization, and in several tests I generally obtain better performance (up to 30% faster) than with the original parallelization scheme of the code, which relies on running several processes in parallel using a batch queuing system.
However, my problem is that when the particle information (energy, position, direction) is read from a file, the performance of the OpenMP code drops severely and becomes similar to that of the original parallel implementation. Each line of this file contains the information of one particle, so many read operations are performed during the simulation. I am therefore essentially losing the advantage of the OpenMP implementation whenever a file is used as input for the simulation.
I really do not know if I am using the best approach and therefore I would like to ask you for advice. My approach is that each thread opens the file with its own unit number, as shown in the following snippet:
C$OMP0PARALLEL DEFAULT(SHARED)
      omp_iam = OMP_GET_THREAD_NUM()
      omp_size = OMP_GET_NUM_THREADS()
      nnphsp = INT(omp_iam*nshist/omp_size)+1
      UNIT_PHSP = 44 + omp_iam
      OPEN(UNIT=UNIT_PHSP,FILE=NAME_PHSP,FORM='UNFORMATTED',
     *     ACCESS='DIRECT',RECL=PHSP_RECL,STATUS='OLD',
     *     IOSTAT=IERR_PHSP)
C$OMP0END PARALLEL
The file containing the particle information is divided among the threads through the nnphsp variable, so no two threads read the same line during the simulation. To read a line, the following code is used:
      READ(UNIT_PHSP,REC=nnphsp,IOSTAT=IERR_PHSP) latchi,ESHORT,
     *     X_PHSP_SHORT,Y_PHSP_SHORT,U_PHSP_SHORT,V_PHSP_SHORT,
     *     WT_PHSP_SHORT
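To make the partition through nnphsp concrete, here is a minimal standalone sketch that only prints the first record assigned to each thread. The values nshist = 1000 and 4 threads are made up for illustration, integer arithmetic is assumed, and a plain loop over omp_iam stands in for the OpenMP threads:

```fortran
      PROGRAM PARTITION_SKETCH
C     Illustrative only: nshist and omp_size are made-up values,
C     and the DO loop stands in for the OpenMP threads.
      IMPLICIT NONE
      INTEGER nshist, omp_size, omp_iam, nnphsp
      nshist = 1000
      omp_size = 4
      DO omp_iam = 0, omp_size-1
C        Same expression as in the parallel region: integer
C        division gives each thread a contiguous block of records.
         nnphsp = INT(omp_iam*nshist/omp_size)+1
         WRITE(*,*) 'thread', omp_iam, ' starts at record', nnphsp
      END DO
      END
```

With these assumed values the four threads start at records 1, 251, 501 and 751, so each one works through its own block of 250 records.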
I would like to know if there is a better approach than the one described above. A distinctive characteristic of this code is that, due to the stochastic nature of the Monte Carlo simulation, the threads do not access the file at the same time or in any particular order. Each time a new particle is about to be simulated, its information is read from the file. Thanks for your help!