
Performance of offloaded MKL FFTs on the MIC, anyone?


My initial experiments offloading MKL FFTs to the MIC (using C on Linux) give me approximately 9.3 GFLOPS, judging by the [MIC Time] numbers reported when I set the environment variable OFFLOAD_REPORT to 1 (or 2). This is about 0.46% of the advertised peak performance of 2 TFLOPS. In fact, the effective rate is lower still if I take into account the time for the data movement inside the offload section ([CPU Time] in the report).

Am I missing something?

I am curious to know if my numbers are way off or consistent with other benchmarks (I could not find any).

I would appreciate it if someone could point me to related information, or share whether they have had a different (or similar) experience.

The bottom line is that I hope there is something I can do to drastically improve the performance, but I have run out of ideas. Any help will be appreciated.

Thanks!

Fernando

 

