Mapping Irregular Computations to GPUs
Graphics processors (GPUs) and other “wide SIMD” multiprocessors are becoming a dominant force in high-performance computing.
How do we effectively use them for streaming applications that do not “fit” their very regimented style of parallelism?
- sensor integration
- machine learning
- bioinformatics (our focus)
These computations are organized as multi-stage cascades or tree traversals. We have developed dynamic mapping strategies to parallelize and manage such computations entirely on the GPU with low overhead.
Application to short DNA read mapping, a key task in bioinformatics, yields an efficient implementation:
- equivalent to 10+ fast CPU cores running BWA (widely used software for the DNA read mapping problem)
- 2x as fast as “naïve” GPU code without our improvements
Our work opens the door to more advanced remapping techniques, using polyhedral analysis, to automatically find efficient SIMD mappings of streaming applications.
S. Cole, J. Gardner, and J. Buhler. “WOODSTOCC: Extracting Latent Parallelism from a DNA Sequence Aligner on a GPU.” Proc. 13th IEEE Int. Symp. Parallel and Distributed Computing, Aix-Marseille, France, 2014.
[Figure: short DNA reads (CGA, CCATCGT, ACATCT, TCAGT) shown against a genome fragment (… CCGATCAGTGCGCTACAGCTACA …)]
Short DNA read mapping identifies approximate matches to experimentally derived DNA strings in a large genome.
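As a rough illustration of what "approximate match" can mean here, the sketch below scores a read against every substring of a genome using unit-cost edit distance (one point per mismatch, insertion, or deletion). The scoring scheme, the function name best_approximate_match, and the GENOME constant are assumptions made for this example, with the strings borrowed from the figure; the slide does not say which scoring the actual aligner uses.

```python
# Genome fragment taken from the figure; used here only as toy input.
GENOME = "CCGATCAGTGCGCTACAGCTACA"


def best_approximate_match(read: str, genome: str) -> tuple[int, int]:
    """Return (edit_distance, end) for the best match of `read` in `genome`.

    `end` is the 1-based genome position where the first best-scoring
    match ends. Unit-cost edit distance is an assumption of this sketch.
    """
    # prev[i] = fewest edits aligning read[:i] to some genome substring
    # ending at the previous genome position.
    prev = list(range(len(read) + 1))
    best = (prev[-1], 0)
    for end, base in enumerate(genome, start=1):
        cur = [0]  # a match may start at any genome position at no cost
        for i, r in enumerate(read, start=1):
            cur.append(min(prev[i - 1] + (r != base),  # match or mismatch
                           prev[i] + 1,                # extra genome base
                           cur[i - 1] + 1))            # extra read base
        if cur[-1] < best[0]:
            best = (cur[-1], end)
        prev = cur
    return best
```

With these inputs, the read TCAGT occurs exactly in the fragment, while ACATCT differs from the closest genome substring (ACAGCT) by a single substitution, so it is found only as an approximate match.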
Sketch of our mapping implementation. The genome is indexed as a virtual search trie. GPU-based search explores this trie, incrementally comparing sets of DNA reads to trie nodes in parallel using dynamic programming. Read sets are managed using parallel worklist primitives.
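The traversal described above can be sketched on a CPU as follows. This is a simplified stand-in under several assumptions, not the authors' GPU code: an explicit depth-bounded suffix trie replaces the virtual trie, a Python list replaces the parallel worklist, the inner loop over worklist items plays the role of SIMD lanes, and scoring is unit-cost edit distance. All names (Node, build_trie, extend_row, map_reads) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # base -> child Node
    starts: list = field(default_factory=list)    # genome offsets reaching this node


def build_trie(genome: str, max_depth: int) -> Node:
    """Explicit trie of all genome substrings up to max_depth bases long."""
    root = Node()
    for start in range(len(genome)):
        node = root
        for base in genome[start:start + max_depth]:
            node = node.children.setdefault(base, Node())
            node.starts.append(start)
    return root


def extend_row(row: list, read: str, base: str) -> list:
    """Advance one edit-distance DP row by one trie edge labelled `base`."""
    new = [row[0] + 1]
    for i, r in enumerate(read, start=1):
        new.append(min(row[i - 1] + (r != base), row[i] + 1, new[i - 1] + 1))
    return new


def map_reads(genome: str, reads: list, max_edits: int) -> dict:
    """Map each read to a set of (start, end) half-open genome intervals."""
    root = build_trie(genome, max(map(len, reads)) + max_edits)
    hits = {read: set() for read in reads}
    # One worklist item per (trie node, read) pair still worth exploring.
    worklist = [(root, 0, read, list(range(len(read) + 1))) for read in reads]
    while worklist:
        next_worklist = []
        for node, depth, read, row in worklist:  # on a GPU: one item per lane
            for base, child in node.children.items():
                new = extend_row(row, read, base)
                if new[-1] <= max_edits:  # whole read aligned within budget
                    hits[read].update((s, s + depth + 1) for s in child.starts)
                if min(new) <= max_edits:  # some prefix is still viable
                    next_worklist.append((child, depth + 1, read, new))
        worklist = next_worklist  # pruned items simply vanish (compaction)
    return hits
```

Each pass of the while loop moves every surviving (node, read) pair one level deeper into the trie. Pairs whose DP row exceeds the edit budget are dropped, so the worklist stays dense even though different reads die out at different depths. Keeping the list dense is the kind of bookkeeping that, in a GPU setting, parallel worklist primitives would be expected to handle.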