The purpose of this example is to find out how a program behaves when both the upper layer (Python) and the lower layer (Fortran) use their threading capabilities.
The Python3 caller script launches 2 concurrent instances of the same Fortran routine using the Thread class.
The Fortran callee is OpenMP parallel with OMP_NUM_THREADS set to 4.
Looking at the printout it looks like the 2 concurrent OpenMP instances don't interfere with each other, both have their threads numbered from 0 to 3. This is nice as it allows to put a level of parallelism on top of OpenMP. But the question is: Is such nested parallelism a supported feature or is this just an accidental observation ?
-------------------- Output printed --------------------
>python3 pydot.py Thread 2 Thread 3 Thread 0 Thread 1 Thread 0 Thread 1 1000.0 Thread 2 Thread 3 500.0
-------------------- Fortran code ----------------------
! Compile to shared library doing ! ifort -g -fPIC -shared -openmp -o ddot.so ddot.f90 real(8) function ddot(n,a,b) bind(c) use, intrinsic :: iso_c_binding use omp_lib integer(4), value, intent(in) :: n real(8), dimension(1:n), intent(in) :: a real(8), dimension(1:n), intent(in) :: b integer :: i ddot = 0.0 !$OMP PARALLEL print *,'Thread', omp_get_thread_num() !$OMP DO REDUCTION(+:ddot) do i = 1, n ddot = ddot + a(i) * b(i) end do !$OMP END PARALLEL end function
------------------ Python code ----------------------
from ctypes import * from threading import Thread # Load shared library flib = cdll.LoadLibrary('./ddot.so') # Prototype for external function veclen = 1000 flib.ddot.argtypes = [c_int, ARRAY(c_double, veclen), ARRAY(c_double, veclen)] flib.ddot.restype = c_double # Array construction rvect = c_double * veclen a = rvect() b = rvect() c = rvect() for i in range(0,veclen): a[i] = 0.5 b[i] = 2.0 c[i] = 1.0 # Python wrapper def pydot(a,b): pydot.argtypes = [ARRAY(c_double, veclen), ARRAY(c_double, veclen)] res = flib.ddot(c_int(veclen), a, b) print(res) # Do it using 2 threads # Create t1 = Thread(target=pydot, args=(a,b,)) t2 = Thread(target=pydot, args=(a,c,)) # Launch t1.start() t2.start() # Wait for completion t1.join() t2.join() # End