Hello, my name is Hariharan

I am an I/O Researcher

I am a Computer Scientist at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL). I work on uncovering and solving portability and performance bottlenecks in I/O for multi-stage workflows on large-scale supercomputers. I received my PhD in Computer Science from Illinois Tech (formerly known as Illinois Institute of Technology). My research interests are in scalable scientific data management. More specifically, I am interested in parallel I/O, data management systems for scientific data, and heterogeneous computing. I am also interested in the convergence of Big Data and HPC storage systems.

Fields of Interest

Research

Scientific I/O research

I have a deep understanding of various HPC and Cloud storage systems. I have architected several storage platforms such as ChronoLog, LABIOS, Hermes, etc.

Development

HPC tools and software

I have built several HPC software packages such as the Hermes Container Library, the Hierarchical Prefetching and Compression Software, the Intelligent Compression framework, the DLIO Benchmark, etc.

Projects

Current Active Research Projects

IOPP

The goal of this project is to improve I/O performance by systematically understanding and optimizing large-scale workflows executing on supercomputers.

IMAI

The goal of this project is to accelerate the I/O of scientific AI applications that use DL frameworks such as LBANN, TensorFlow, and PyTorch on supercomputers.

FRACTALE

The goal of the project is to perform multi-cluster scheduling. My personal interest in the project is to explore the role of I/O intents in the smart scheduling of hardware and software accelerators in HPC.

Code repositories

GitHub

Bitbucket

Publications

- Yiheng Xu, Pranav Sivaraman, Hariharan Devarajan, Kathryn Mohror, and Abhinav Bhatele. "ML-based Modeling to Predict I/O Performance on Different Storage Sub-systems." In 2024, 31st edition of the IEEE International Conference on High Performance Computing, Data, and Analytics. IEEE, 2024.
- Hariharan Devarajan, Adam Moody, Donglai Dai, Cameron Stanavige, Elsa Gonsiorowski, Marty McFadden, Olaf Faaland, Greg Kosinovsky, and Kathryn Mohror. "The impact of asynchronous I/O in checkpoint-restart workloads." In 2024, 5th Workshop on Extreme-Scale Storage and Analysis (ESSA). IEEE, 2024.
- Ian Lumsden, Hariharan Devarajan, Jack Marquez, Stephanie Brink, David Boehme, Olga Pearce, Jae-Seung Yeom, and Michela Taufer. "Empirical Study of Molecular Dynamics Workflow Data Movement: DYAD vs. Traditional I/O Systems." In 2024, 23rd IEEE International Workshop on High Performance Computational Biology (HICOMB). IEEE, 2024.
- Hariharan Devarajan, Loic Pottier, Kaushik Velusamy, Huihuo Zheng, Izzet Yildirim, Olga Kogiou, Weikuan Yu, Anthony Kougkas, Xian-He Sun, Jae Seung Yeom, and Kathryn Mohror. “DFTracer: An Analysis-Friendly Data Flow Tracer for AI-Driven Workflows,” in SC24: International Conference for High Performance Computing, Networking, Storage and Analysis. Atlanta, GA: IEEE, Dec. 2024.
- Hariharan Devarajan and Kathryn Mohror. “Mimir: Extending I/O Interfaces to Express User Intent for Complex Workloads in HPC.” 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS'23), St. Petersburg, Florida, USA: IEEE, May 2023.

- Hariharan Devarajan and Kathryn Mohror. "Extracting and characterizing I/O behavior of HPC workloads". The 2022 IEEE International Conference on Cluster Computing (CLUSTER'22), September 6-9, 2022, Heidelberg, Germany.

- Hariharan Devarajan, Anthony Kougkas, Huihuo Zheng, Venkatram Vishwanath, and Xian-He Sun, "Stimulus: Accelerate Data Management for Scientific AI applications in HPC," In the proceedings of the 2022 IEEE/ACM International Symposium in Cluster, Cloud, and Internet Computing (CCGrid'22), Taormina, Italy, May 16-19, 2022.

- Hariharan Devarajan, Huihuo Zheng, Anthony Kougkas, Xian-He Sun, and Venkatram Vishwanath. "DLIO: A Data-Centric Benchmark for Scientific Deep Learning Applications". In 2021 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). IEEE. Best Paper Award

- Jaime Cernuda, Hariharan Devarajan, Luke Logan, Neeraj Rajesh, Jie Ye, Anthony Kougkas, X.-H. Sun, “HFlow: A Dynamic and Elastic Multi-Layered Data Forwarder”, The 2021 IEEE International Conference on Cluster Computing (CLUSTER'2021), September 7-10, 2021, Virtual Meeting, pp. 114-124, DOI: 10.1109/Cluster48925.2021.00064.

- Neeraj Rajesh, Hariharan Devarajan, Jaime Cernuda Garcia, Keith Bateman, Luke Logan, Jie Ye, Anthony Kougkas, and Xian-He Sun. 2021. "Apollo: An ML-assisted Real-Time Storage Resource Observer". In Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing (HPDC '21). Association for Computing Machinery, New York, NY, USA, 147–159. DOI:https://doi.org/10.1145/3431379.3460640

- Hariharan Devarajan, Anthony Kougkas, and Xian-He Sun. "HReplica: A Dynamic Data Replication Engine with Adaptive Compression for Multi-Tiered Storage." 2020 IEEE International Conference on Big Data (Big Data), Atlanta, Georgia, USA, 2020.

- Hariharan Devarajan, Anthony Kougkas, Keith Bateman, and Xian-He Sun. "HCL: Distributing Parallel Data Structures in Extreme Scales." In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2020.

- Anthony Kougkas, Hariharan Devarajan, Keith Bateman, Jaime Cernuda, Neeraj Rajesh, and Xian-He Sun. "ChronoLog: A Distributed Shared Tiered Log Store with Time-based Data Ordering." Proceedings of the 36th International Conference on Massive Storage Systems and Technology (MSST 2020).

- Hariharan Devarajan, Anthony Kougkas, Luke Logan, and Xian-He Sun. "HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments," 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, Louisiana, USA, 2020.

- Hariharan Devarajan, Anthony Kougkas, Luke Logan, and Xian-He Sun. "HCompress: Hierarchical Data Compression for Multi-Tiered Storage Environments," 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, Louisiana, USA, 2020.

- Anthony Kougkas, Hariharan Devarajan, Jay Lofstead, and Xian-He Sun. "LABIOS: A Distributed Label-Based I/O System", In Proceedings of the ACM 28th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'19) Best Paper Award

- Hariharan Devarajan, Anthony Kougkas, and Xian-He Sun. "An Intelligent, Adaptive, and Flexible Data Compression Framework", In Proceedings of the IEEE/ACM International Symposium in Cluster, Cloud, and Grid Computing (CCGrid'19)

- Hariharan Devarajan, Anthony Kougkas, Prajwal Challa, and Xian-He Sun. "Vidya: Performing Code-Block I/O Characterization for Data Access Optimization", In Proceedings of the IEEE International Conference on High Performance Computing, Data, and Analytics 2018 (HiPC'18)

- Anthony Kougkas, Hariharan Devarajan, Xian-He Sun, and Jay Lofstead. "Harmonia: An Interference-Aware Dynamic I/O Scheduler for Shared Non-Volatile Burst Buffers", In Proceedings of the IEEE International Conference on Cluster Computing 2018 (Cluster'18)

- Anthony Kougkas, Hariharan Devarajan, and Xian-He Sun. "Hermes: A Heterogeneous-Aware Multi-Tiered Distributed I/O Buffering System", In Proceedings of the ACM 27th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'18)

- Anthony Kougkas, Hariharan Devarajan, and Xian-He Sun. "IRIS: I/O Redirection via Integrated Storage", In Proceedings of the 32nd ACM International Conference on Supercomputing (ICS'18)

- Anthony Kougkas, Hariharan Devarajan, and Xian-He Sun. "Enosis: Bridging the Semantic Gap between File-based and Object-based Data Models", In Proceedings of the ACM SIGHPC Datacloud'17, 8th International Workshop on Data-Intensive Computing in the Clouds in conjunction with SC'17. 

- Anthony Kougkas, Hariharan Devarajan, and Xian-He Sun. "Syndesis: Mapping Objects to Files for a Unified Data Access System", In Proceedings of the ACM SIGHPC MTAGS'17, 10th International Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers, in conjunction with SC'17.

- Hariharan Devarajan, Anthony Kougkas, Hsing-Bung Chen, and Xian-He Sun. "Open Ethernet Drive: Evolution of Energy-Efficient Storage Technology", In Proceedings of the ACM SIGHPC Datacloud'17, 8th International Workshop on Data-Intensive Computing in the Clouds in conjunction with SC'17.

Projects

01

I/O Portability and Performance

The goal of this project is to improve I/O performance by systematically understanding and optimizing large-scale workflows executing on supercomputers.
- Hariharan Devarajan and Kathryn Mohror. "Extracting and characterizing I/O behavior of HPC workloads". The 2022 IEEE International Conference on Cluster Computing (CLUSTER'22), September 6-9, 2022, Heidelberg, Germany.
- Hariharan Devarajan and Kathryn Mohror. “Mimir: Extending I/O Interfaces to Express User Intent for Complex Workloads in HPC.” 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS'23), St. Petersburg, Florida, USA: IEEE, May 2023.

02

IMAI

A user-space distributed library that provides an efficient I/O pipeline for deep learning applications. It enables a decoupled and asynchronous data pipeline paradigm.

- Hariharan Devarajan, Huihuo Zheng, Anthony Kougkas, Xian-He Sun, and Venkatram Vishwanath. "DLIO: A Data-Centric Benchmark for Scientific Deep Learning Applications". In 2021 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). IEEE. Best Paper Award
- Hariharan Devarajan, Anthony Kougkas, Huihuo Zheng, Venkatram Vishwanath, and Xian-He Sun, "Stimulus: Accelerate Data Management for Scientific AI applications in HPC," In the proceedings of the 2022 IEEE/ACM International Symposium in Cluster, Cloud, and Internet Computing (CCGrid'22), Taormina, Italy, May 16-19, 2022.

03

Hermes

A new, heterogeneous-aware, dynamic, and distributed I/O buffering system. Hermes enables, manages, supervises, and extends I/O buffering to fully integrate into the deep memory and storage hierarchy (DMSH).

- Jaime Cernuda, Hariharan Devarajan, Luke Logan, Neeraj Rajesh, Jie Ye, Anthony Kougkas, X.-H. Sun, “HFlow: A Dynamic and Elastic Multi-Layered Data Forwarder”, The 2021 IEEE International Conference on Cluster Computing (CLUSTER'2021), September 7-10, 2021, Virtual Meeting, pp. 114-124, DOI: 10.1109/Cluster48925.2021.00064.
- Neeraj Rajesh, Hariharan Devarajan, Jaime Cernuda Garcia, Keith Bateman, Luke Logan, Jie Ye, Anthony Kougkas, and Xian-He Sun. 2021. "Apollo: An ML-assisted Real-Time Storage Resource Observer". In Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing (HPDC '21). Association for Computing Machinery, New York, NY, USA, 147–159. DOI:https://doi.org/10.1145/3431379.3460640
- Hariharan Devarajan, Anthony Kougkas, and Xian-He Sun. "HReplica: A Dynamic Data Replication Engine with Adaptive Compression for Multi-Tiered Storage." 2020 IEEE International Conference on Big Data (Big Data), Atlanta, Georgia, USA, 2020.
- Hariharan Devarajan, Anthony Kougkas, Keith Bateman, and Xian-He Sun. "HCL: Distributing Parallel Data Structures in Extreme Scales." In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2020.
- Hariharan Devarajan, Anthony Kougkas, Luke Logan, and Xian-He Sun. "HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments," 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, Louisiana, USA, 2020.
- Hariharan Devarajan, Anthony Kougkas, Luke Logan, and Xian-He Sun. "HCompress: Hierarchical Data Compression for Multi-Tiered Storage Environments," 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, Louisiana, USA, 2020.
- Hariharan Devarajan, Anthony Kougkas, and Xian-He Sun. "An Intelligent, Adaptive, and Flexible Data Compression Framework", In Proceedings of the IEEE/ACM International Symposium in Cluster, Cloud, and Grid Computing (CCGrid'19)
- Anthony Kougkas, Hariharan Devarajan, and Xian-He Sun. "Hermes: A Heterogeneous-Aware Multi-Tiered Distributed I/O Buffering System", In Proceedings of the ACM 27th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'18)

04

LABIOS

A new, distributed, label-based I/O system that uses asynchronous I/O and supports heterogeneous storage resources, elasticity, and in-situ analytics.

- Anthony Kougkas, Hariharan Devarajan, Jay Lofstead, and Xian-He Sun. "LABIOS: A Distributed Label-Based I/O System", In Proceedings of the ACM 28th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'19) Best Paper Award

How to Find me

Contacts

We'll answer soon

United States

Address:
Suite 1014, B315,
Lawrence Livermore National Laboratory
Livermore, CA 94550

E-mail:
hariharandev1@llnl.gov