PPT-OpenACC

Author : jane-oiler | Published Date : 2017-06-30

cache Directive: Opportunities and Optimizations. Ahmad Lashgar & Amirali Baniasadi, ECE Department, University of Victoria. November 14, 2016. Directive-based

The PPT/PDF document "OpenACC" is the property of its rightful owner. Permission is granted to download and print the materials on this website for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.

OpenACC: Transcript


cache Directive: Opportunities and Optimizations. Ahmad Lashgar & Amirali Baniasadi, ECE Department, University of Victoria. November 14, 2016. Directive-based Accelerator Programming Models.

CUDA Platform. CUDA Parallel Computing Platform. Hardware Capabilities: GPUDirect, SMX, Dynamic Parallelism, HyperQ. Programming Approaches: Libraries ("Drop-in" Acceleration).

Co-Design. Virginia Tech. AFOSR-BRI Workshop, July 20-21, 2014. Keyur Joshi, Long He & Danesh Tafti. Collaborators: Xuewen Cui, Hao Wang, Wu-chun Feng & Eric de Sturler.

OpenACC Directives.

    subroutine saxpy(n, a, x, y)
      real :: x(:), y(:), a
      integer :: n, i
      !$acc kernels
      do i=1,n
        y(i) = a*x(i)+y(i)
      enddo
      !$acc end kernels
    end subroutine saxpy

openacc. Ebad Salehi, Ahmad Lashgar and Amirali Baniasadi. Electrical and Computer Engineering Department, University of Victoria. This Work. Motivation: Effective memory bandwidth usage is needed to achieve high performance in GPUs.

Quentin Ochem. October 4th, 2018. What is GPGPU? GPUs were traditionally dedicated to graphical rendering, but their real capability is vectorized computation. Enter general-purpose GPU programming (GPGPU).

Mark Taylor, mataylo@sandia.gov. E3SM Machine Roadmap. CPUs: BER purchased dedicated systems. Anvil, CompyMcNodeFace: 6.2M node-hours per year. Great machines for moderate resolutions; fastest performance.

Performance Portability: An Implementer's Perspective. Versatility of HPE Cray Programming Environment (CPE). Platforms from HPE/Cray supported by CPE: AMD CPUs and NVIDIA GPUs; AMD CPUs and AMD GPUs.
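For readers more familiar with C, the saxpy fragment in the transcript can be sketched with the equivalent OpenACC pragma. This is a minimal illustration and not taken from the slides: the explicit data clauses (copyin, copy) and the restrict qualifiers are assumptions added to make the data movement and non-aliasing explicit. A compiler without OpenACC support ignores the pragma and runs the loop serially on the host.

```c
/* saxpy: y = a*x + y over n elements.
 * Hypothetical C counterpart of the Fortran fragment in the transcript. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* Assumed data clauses: x is only read on the device,
     * y is read and written back to the host. */
    #pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A call such as saxpy(4, 2.0f, x, y) updates y in place. Whether the loop is actually offloaded depends on the compiler and flags used to build it.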
