The goal of this workshop is to provide a forum to discuss new and emerging general-purpose programming architectures, environments, and platforms, as well as to evaluate applications that have been able to harness the horsepower provided by these platforms. This year's workshop is particularly interested in new heterogeneous architectures/platforms, new forms of concurrency, and novel/irregular applications that can leverage these platforms. Papers are sought on many aspects of GPUs, including (but not limited to):

Paper Submission Instructions

Papers Due: December 5, 2017 (deadline extended)

Author Notification: January 3, 2018

Camera Ready Paper: January 15, 2018

Full paper submissions must be in PDF, formatted for US letter-size paper. They must not exceed 10 pages (all inclusive) in standard ACM two-column conference format (preprint mode, with page numbers). Templates for ACM format are available for Microsoft Word and LaTeX at http://www.sigplan.org/authorInformation.html (use the 9 pt template). All accepted papers will be published in the ACM Online Conference Proceedings Series. Please refer to the submission website to submit your papers.

Registration for the conference can be completed through this link.


Keynote 1

Initial Steps toward Making GPU a First-Class Computing Resource: Sharing and Resource Management
Jun Yang, University of Pittsburgh

GPUs have evolved from traditional graphics accelerators into core compute engines for a broad class of general-purpose applications. However, current commercial offerings fall short of the great potential of GPUs, largely because they cannot be managed as easily as the CPU, and their enormous hardware resources are often greatly underutilized. We developed new architecture features to enable fine-grained sharing of GPUs, termed Simultaneous Multi-kernel (SMK), in a way similar to how the CPU achieves sharing via simultaneous multithreading (SMT). With SMK, different applications can co-exist in every streaming multiprocessor of a GPU in a fully controlled way. High resource utilization can be achieved by exploiting the heterogeneity of different application behaviors. Resource-apportionment schemes among sharers have been developed for fairness, throughput, and quality of service. We also envision that SMK can enable better manageability of GPUs and new features such as more efficient synchronization mechanisms within an application.

Keynote 2

Generating High Performance GPU Code using Rewrite Rules with Lift
Christophe Dubach, University of Edinburgh

Graphics processors (GPUs) are the cornerstone of modern heterogeneous systems. GPUs exhibit tremendous computational power but are notoriously hard to program. High-level programming languages and domain-specific languages have been proposed to address this issue. However, they often rely on complex analyses in the compiler or on device-specific implementations to achieve maximum performance. This means that compilers and software implementations need to be rewritten and re-tuned continuously as new hardware emerges. In this talk, I will present Lift, a novel high-level data-parallel programming model. The language is based on a surprisingly small set of functional primitives which can be combined to define higher-level, hardware-agnostic algorithmic patterns. A system of rewrite rules is used to derive device-specific, optimized low-level implementations of the algorithmic patterns. The rules encode both algorithmic choices and low-level optimizations in a unified system and let the compiler explore the optimization space automatically. Our results show that the generated code matches the performance of highly tuned implementations of several computational kernels from the linear algebra and stencil domains across various classes of GPUs.
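To give a flavor of the rewrite-rule idea the abstract describes, here is a toy Python sketch (not Lift's actual primitives, syntax, or compiler; the tuple encoding and the `map_fusion` rule are illustrative assumptions). It shows one classic algorithmic rewrite: composing two maps into a single fused map, which removes an intermediate array.

```python
def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))

def map_fusion(expr):
    """Rewrite ('map', f, ('map', g, xs)) into ('map', f . g, xs).

    This is one example of a semantics-preserving rewrite rule: a
    compiler can apply such rules repeatedly to explore equivalent,
    differently optimized versions of the same program.
    """
    if expr[0] == 'map' and isinstance(expr[2], tuple) and expr[2][0] == 'map':
        f = expr[1]
        _, g, xs = expr[2]
        return ('map', compose(f, g), xs)
    return expr

def evaluate(expr):
    """Interpret the tiny expression language over Python lists."""
    if expr[0] == 'lit':
        return expr[1]
    if expr[0] == 'map':
        _, f, xs = expr
        return [f(x) for x in evaluate(xs)]
    raise ValueError(f"unknown node: {expr[0]!r}")

inc = lambda x: x + 1
dbl = lambda x: x * 2

# High-level program: map(inc) applied after map(dbl).
prog = ('map', inc, ('map', dbl, ('lit', [1, 2, 3])))
fused = map_fusion(prog)

# Both forms compute the same result; the fused one needs no
# intermediate list, which matters for GPU memory traffic.
print(evaluate(prog))   # [3, 5, 7]
print(evaluate(fused))  # [3, 5, 7]
```

Because each rule preserves semantics, the choice of which rules to apply (and in what order) becomes a search over an optimization space rather than a hand-written, device-specific code path.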

The full program for GPGPU11 is available here.


8:00-8:30 – Breakfast Available

8:30 – Welcome: The Organizers

8:30-9:30 – Keynote 1 – Initial Steps toward Making GPU a First-Class Computing Resource: Sharing and Resource Management
Jun Yang, University of Pittsburgh

9:30-10:00 – Session: Persistent Data Structures
  • A Case For Persist Barriers in GPUs
    Dibakar Gope, Arkaprava Basu, Sooraj Puthoor and Mitesh Meswani, ARM Research and AMD Research

10:00-10:30 – Coffee Break

10:30-12:00 – Session: Applications/Frameworks
  • Overcoming the Difficulty of Large-scale CGH Generation on multi-GPU Cluster
    Takanobu Baba, Shinpei Watanabe, Boaz Jessie Jackin, Takeshi Ohkawa, Kanemitsu Ootsu, Takashi Yokota, Yoshio Hayasaki and Toyohiko Yatagai, Utsunomiya University and National Institute of Information and Communications Technology

  • Transparent Avoidance of Redundant Data Transfer on GPU-enabled Apache Spark
    Ryo Asai, Masao Okita, Fumihiko Ino and Kenichi Hagihara, Osaka University

  • GPU-based Acceleration of Detailed Tissue-Scale Cardiac Simulations
    Neringa Altanaite and Johannes Langguth, Simula Research Laboratory, Norway

12:00-13:30 – Lunch (on your own)

13:30-14:30 – Keynote 2 – Generating High Performance GPU Code using Rewrite Rules with Lift
Christophe Dubach, University of Edinburgh

14:30-15:00 – Coffee Break

15:30-16:30 – Session: Concurrent Kernels
  • MaxPair: Enhance OpenCL Concurrent Kernel Execution by Weighted Maximum Matching
    Yuan Wen, Michael O’Boyle and Christian Fensch, Trinity College Dublin, University of Edinburgh, Heriot-Watt University
  • Oversubscribed Command Queues in GPUs
    Sooraj Puthoor, Xulong Tang, Joseph Gross and Bradford Beckmann, AMD Research, Penn St. University and ARM
Organizers



Program Committee
