We present a high-performance hybrid CPU-GPU implementation that accelerates Fourier-based stitching of 2D optical microscopy images to under 1 minute of end-to-end execution time. The implementation exploits coarse-grained parallelism and organizes the computation into a pipeline that spans the available CPU and GPU resources and overlaps computation with data movement. It achieves a nearly 10x performance improvement over a simple approach to GPU-based acceleration. It stitches a 59 x 42 grid of images in 43 s. It also scales with the number of available GPUs, processing the same workload in 26 s on a system with two GPUs. For comparison, an optimized single-threaded reference implementation takes nearly 10 min for the same workload, while ImageJ/Fiji takes more than 3.5 hours.
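The abstract does not say which Fourier-based method is used to register neighboring images. As a minimal illustration only, the sketch below uses phase correlation, a common Fourier-based technique for estimating the translation between two images, and should not be read as the authors' implementation. The function name `phase_correlation_shift` is hypothetical, and the sketch assumes two equally sized grayscale arrays related by a circular integer shift, running on the CPU with NumPy.

```python
import numpy as np


def phase_correlation_shift(reference, moved):
    """Estimate the integer (dy, dx) circular shift that maps `reference` onto `moved`.

    Illustrative sketch only: assumes same-shape 2D arrays and a pure translation.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)

    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)  # avoid division by zero

    # The inverse transform peaks at the displacement between the two images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= size // 2 else int(p) - size
        for p, size in zip(peak, correlation.shape)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tile = rng.random((64, 64))
    shifted = np.roll(tile, shift=(5, -3), axis=(0, 1))
    print(phase_correlation_shift(tile, shifted))  # (5, -3)
```

In a grid-stitching setting, a computation of this kind would be repeated for each pair of adjacent tiles. Those pairwise tasks are largely independent of one another, which is the sort of coarse-grained parallelism a CPU-GPU pipeline can exploit.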
Proceedings Title: Sixth Workshop on General Purpose Processing Using GPUs
Conference Dates: March 16-20, 2013
Conference Location: Houston, TX
Pub Type: Conferences
Keywords: GPU Computing, Parallel Computing, Image Stitching, Optical Microscopy