11th Workshop on General Purpose Processing Using GPU
Held in conjunction with PPoPP 2018
Vösendorf, Austria, 25th February 2018
The goal of this workshop is to provide a forum to discuss new and emerging general-purpose programming architectures, environments and platforms, as well as to evaluate applications that have been able to harness the horsepower provided by these platforms. This year's workshop is particularly interested in new heterogeneous architectures/platforms, new forms of concurrency, and novel/irregular applications that can leverage these platforms. Papers are being sought on many aspects of GPUs, including (but not limited to):
GPU Programming Environments
GPU Runtime Systems
GPU Power Efficiency
Paper Submission Instructions
Papers Due: December 5, 2017 (deadline extended)
Author Notification: January 3, 2018
Camera Ready Paper: January 15, 2018
Full paper submissions must be in PDF formatted for US letter-size paper. They must not exceed 10 pages (all inclusive) in standard ACM two-column conference format (preprint mode, with page numbers). Templates for ACM format are available for Microsoft Word and LaTeX at:
http://www.sigplan.org/authorInformation.html (use the 9 pt template). All accepted papers will be published in the ACM Online Conference Proceedings Series. Please refer to the
submission website to submit your papers.
Registration for the conference can be completed through this link.
Initial Steps toward Making GPU a First-Class Computing Resource: Sharing and Resource Management
Jun Yang, University of Pittsburgh
GPUs have evolved from traditional graphics accelerators into core compute engines for a broad class of general-purpose applications. However, current commercial offerings fall short of the great potential of GPUs, largely because they cannot be managed as easily as the CPU. The enormous amount of hardware resources is often greatly underutilized. We developed new architecture features to enable fine-grained sharing of GPUs, termed Simultaneous Multi-kernel (SMK), in a way similar to how the CPU achieves sharing via simultaneous multithreading (SMT). With SMK, different applications can co-exist in every streaming multiprocessor of a GPU, in a fully controlled way. High resource utilization can be achieved by exploiting the heterogeneity of different application behaviors. Resource apportioning schemes among sharers are developed for fairness, throughput, and quality of service. We also envision that SMK can enable better manageability of GPUs and new features such as more efficient synchronization mechanisms within an application.
Generating High Performance GPU Code using Rewrite Rules with Lift
Christophe Dubach, University of Edinburgh
Graphics processors (GPUs) are the cornerstone of modern heterogeneous systems. GPUs exhibit tremendous computational power but are notoriously hard to program. High-level programming languages and domain-specific languages have been proposed to address this issue. However, they often rely on complex analysis in the compiler or on device-specific implementations to achieve maximum performance. This means that compilers and software implementations need to be re-written and re-tuned continuously as new hardware emerges.
In this talk, I will present Lift, a novel high-level data-parallel programming model. The language is based on a surprisingly small set of functional primitives which can be combined to define higher-level, hardware-agnostic algorithmic patterns. A system of rewrite rules is used to derive device-specific optimized low-level implementations of the algorithmic patterns. The rules encode both algorithmic choices and low-level optimizations in a unified system and let the compiler explore the optimization space automatically. Our results show that the generated code matches the performance of highly tuned implementations of several computational kernels from the linear algebra and stencil domains across various classes of GPUs.
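The rewrite-rule idea in the abstract can be illustrated with a toy sketch. The example below is in Python rather than Lift itself, and the term encoding, rule, and function names are purely illustrative (not Lift's actual API); it shows the core principle that a rule rewrites an algorithmic expression into an equivalent form (here, classic map fusion) without changing the computed result:

```python
# Toy sketch of rewrite-rule-driven program transformation.
# Terms: ('map', f, expr) applies f elementwise; ('input', data) is a leaf.

def fuse_maps(term):
    """Rewrite rule: map f (map g xs)  ->  map (f . g) xs."""
    if term[0] == 'map' and term[2][0] == 'map':
        f, (_, g, inner) = term[1], term[2]
        # Fuse the two traversals into one, then keep rewriting.
        return fuse_maps(('map', lambda x: f(g(x)), inner))
    return term

def evaluate(term):
    """Reference interpreter used to check that rewrites preserve semantics."""
    if term[0] == 'input':
        return term[1]
    if term[0] == 'map':
        return [term[1](x) for x in evaluate(term[2])]
    raise ValueError(f"unknown term: {term[0]}")

# map (+1) (map (*2) [1, 2, 3]) fuses into a single traversal.
prog = ('map', lambda x: x + 1, ('map', lambda x: x * 2, ('input', [1, 2, 3])))
fused = fuse_maps(prog)
assert evaluate(prog) == evaluate(fused) == [3, 5, 7]
```

In a real system such as Lift, many rules of this kind (covering both algorithmic choices and low-level, device-specific optimizations) are applied automatically, and the compiler searches the space of rewritten programs for the best-performing variant.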
The full program for GPGPU11 is available here
8:00-8:30 – Breakfast Available
8:30 – Welcome: The Organizers
8:30-9:30 – Keynote 1 – Initial Steps toward Making GPU a First-Class Computing Resource: Sharing and Resource Management
Jun Yang, University of Pittsburgh
9:30-10:00 – Session: Persistent Data Structures
A Case For Persist Barriers in GPUs: Dibakar Gope, Arkaprava Basu, Sooraj Puthoor and Mitesh Meswani, ARM Research and AMD Research
10:00-10:30 – Coffee Break
10:30-12:00 – Session: Applications/Frameworks
Overcoming the Difficulty of Large-scale CGH Generation on multi-GPU Cluster: Takanobu Baba, Shinpei Watanabe, Boaz Jessie Jackin, Takeshi Ohkawa, Kanemitsu Ootsu, Takashi Yokota, Yoshio Hayasaki and Toyohiko Yatagai, Utsunomiya University and National Institute of Information and Communications Technology
Transparent Avoidance of Redundant Data Transfer on GPU-enabled Apache Spark: Ryo Asai, Masao Okita, Fumihiko Ino and Kenichi Hagihara, Osaka University
GPU-based Acceleration of Detailed Tissue-Scale Cardiac Simulations: Neringa Altanaite and Johannes Langguth, Simula Research Laboratory, Norway
12:00-13:30 – Lunch (on your own)
13:30-14:30 – Keynote 2 – Generating High Performance GPU Code using Rewrite Rules with Lift: Christophe Dubach, University of Edinburgh
14:30-15:00 – Coffee Break
15:30-16:30 – Session: Concurrent Kernels
MaxPair: Enhance OpenCL Concurrent Kernel Execution by Weighted Maximum Matching: Yuan Wen, Michael O’Boyle and Christian Fensch, Trinity College Dublin, University of Edinburgh, Heriot-Watt University
Oversubscribed Command Queues in GPUs: Sooraj Puthoor, Xulong Tang, Joseph Gross and Bradford Beckmann, AMD Research, Penn St. University and ARM