Publications (Habanero, 2007-present)






  • Topkapi: Parallel and Fast Sketches for Finding Top-K Frequent Elements.  Ankush Mandal, He Jiang, Anshumali Shrivastava, Vivek Sarkar.  Advances in Neural Information Processing Systems 31 (NeurIPS), December 2018.
  • Detecting MPI usage anomalies via partial program symbolic execution.  Fangke Ye, Jisheng Zhao, Vivek Sarkar.  The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), November 2018.
  • A Preliminary Study of Compiler Transformations for Graph Applications on the Emu System.  Prasanth Chatarasi, Vivek Sarkar.  Proceedings of the Workshop on Memory Centric High Performance Computing (MCHPC, co-located with SC18), November 2018.
  • A Unified Runtime for PGAS and Event-Driven Programming.  Sri Raj Paul, Kun Chen, Akihiro Hayashi, Max Grossman, Vivek Sarkar.  Fourth International IEEE Workshop on Extreme Scale Programming Models and Middleware (ESPM2, co-located with SC18), November 2018.
  • Cost-driven thread coarsening for GPU kernels.  Prithayan Barua, Jun Shirako, Vivek Sarkar. 27th International Conference on Parallel Architectures and Compilation Techniques (PACT), November 2018.
  • In-Register Parameter Caching for Dynamic Neural Nets with Virtual Persistent Processor Specialization.  Farzad Khorasani, Hodjat Asghari Esfeden, Nael Abu-Ghazaleh, Vivek Sarkar.  The 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2018.
  • Using Dynamic Compilation to Achieve Ninja Performance for CNN Training on Many-Core Processors.  Ankush Mandal, Raj Barik, Vivek Sarkar.  25th International European Conference on Parallel and Distributed Computing (Euro-Par), August 2018.
  • GT-Race: graph traversal based data race detection for asynchronous many-task parallelism.  Lechen Yu, Vivek Sarkar.  25th International European Conference on Parallel and Distributed Computing (Euro-Par), August 2018.
  • Implementation and Evaluation of OpenSHMEM Contexts Using OFI Libfabric.  Workshop on OpenSHMEM and Related Technologies: Big Compute and Big Data Convergence, August 2018.
  • RegMutex: Inter-Warp GPU Register Time-Sharing.  Farzad Khorasani, Hodjat Asghari Esfeden, Amin Farmahini-Farahani, Nuwan Jayasena, Vivek Sarkar.  2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), June 2018.
  • Porting DMRG++ Scientific Application to OpenPOWER.  Arghya Chatterjee, Gonzalo Alvarez, E. D’Azevedo, Wael Elwasif, Oscar Hernandez, Vivek Sarkar.  International Workshop on OpenPOWER for HPC (IWOPH, co-located with ISC’18), June 2018.
  • Parallel Sparse Flow-Sensitive Points-to Analysis. Jisheng Zhao, Michael G. Burke, Vivek Sarkar. Proceedings of the 2018 International Conference on Compiler Construction (CC 2018), February 2018.
  • Modeling the Conflicting Demands of Multi-Level Parallelism and Temporal/Spatial Locality in Affine Scheduling. Oleksandr Zinenko, Chandan Reddy, Sven Verdoolaege, Jun Shirako, Tobias Grosser, Vivek Sarkar, Albert Cohen. Proceedings of the 2018 International Conference on Compiler Construction (CC 2018), February 2018.
  • PIPES: A Language and Compiler for Task-Based Programming on Distributed-Memory Clusters. Martin Kong, Louis-Noël Pouchet, P. Sadayappan, Vivek Sarkar.  The Conference on High Performance Computing, Networking, Storage and Analysis (SC16), November 2016.
  • Static Cost Estimation for Data Layout Selection on GPUs. Yuhan Peng, Max Grossman, Vivek Sarkar. 7th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS16, co-located with SC16), November 2016.
  • Fine-grained parallelism in probabilistic parsing with Habanero Java. Matthew Francis-Landau (Johns Hopkins University), Bing Xue (Rice University), Jason Eisner (Johns Hopkins University), and Vivek Sarkar (Rice University). In Proceedings of the Sixth Workshop on Irregular Applications: Architectures and Algorithms (IA3, co-located with SC16), November 2016 [slides]. 
  • Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator Model on a POWER8+GPU Platform. Akihiro Hayashi, Jun Shirako, Ettore Tiotto, Robert Ho, Vivek Sarkar. Third Workshop on Accelerator Programming Using Directives (WACCPD, co-located with SC16), November 2016.
  • Optimized Distributed Work-Stealing. Vivek Kumar, Karthik Murthy, Vivek Sarkar, Yili Zheng. Sixth Workshop on Irregular Applications: Architectures and Algorithms (IA3, co-located with SC16), November 2016 [slides].
  • Automatic Parallelization of Pure Method Calls via Conditional Future Synthesis. Rishi Surendran and Vivek Sarkar. 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016), November 2016.
  • Pedagogy and Tools for Teaching Parallel Computing at the Sophomore Undergraduate Level. Max Grossman, Maha Aziz, Heng Chi, Anant Tibrewal, Shams Imam, Vivek Sarkar. Journal of Parallel and Distributed Computing Special Issue on Parallel, Distributed, and High Performance Computing Education. 2016. 

Acknowledgment

This material is based upon work supported by the National Science Foundation under Grants No. 0833166, 0938018, 0926127, 0964520, 1302570. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).