Hi all, I'm using mpiifort to compile a set of Fortran source files that make a few LAPACK and BLAS calls. I've been able to use automatic offloading for the ZGETRF LAPACK routine (LU factorization) by increasing my problem size so that the matrices passed to ZGETRF are larger than 8192 x 8192. However, the code also calls other LAPACK and BLAS routines that are not supported for automatic offloading. I'm wondering whether explicitly offloading some of those routines would be worth it, because in my past experience explicit offloading has only increased computation time.
Note that I'm using mpiifort so that the MPICH2 calls in the code stay intact. If that could be the source of the slowdown, please let me know.
Thanks.