Dhruv Kumar

Email: dhruv@umn.edu

Curriculum Vitae

Research Motto: In theory, there is no difference between theory and practice. But, in practice, there is.

I am currently a Post-Doc Researcher at Microsoft Research. Prior to this, I completed my PhD under Prof. Abhishek Chandra in the Computer Science Department at the University of Minnesota-Twin Cities. During my PhD, I also collaborated with Prof. Ramesh Sitaraman at the University of Massachusetts, Amherst. Broadly, I am interested in Distributed Systems and their applications in Data Analytics and Machine Learning. Currently, I am working on problems related to Geo-distributed Analytics, and I am also looking to apply data-driven techniques to optimize systems.

I graduated in 2014 with a Bachelor's in Computer Science from BITS-Pilani, India. At BITS-Pilani, I was part of the Advanced Data Analytics and Parallel Technologies Lab, where I worked on parallel and distributed algorithms for data mining.

Selected Publications

HACCS: Heterogeneity-Aware Clustered Client Selection for Accelerated Federated Learning [Paper]

Joel Wolfrath, Nikhil Sreekumar, Dhruv Kumar, Yuanli Wang, Abhishek Chandra

in IEEE IPDPS 2022


Towards WAN-Aware Join Sampling over Geo-Distributed Data [Paper]

Dhruv Kumar, Joel Wolfrath, Abhishek Chandra, Ramesh Sitaraman

in ACM EdgeSys 2022


AggNet: Cost-Aware Aggregation Networks for Geo-distributed Streaming Analytics [Paper]

Dhruv Kumar, Sohaib Ahmad, Abhishek Chandra, Ramesh Sitaraman

in ACM/IEEE SEC 2021


Accelerated Training via Device Similarity in Federated Learning [Paper] [Video]

Yuanli Wang, Joel Wolfrath, Nikhil Sreekumar, Dhruv Kumar, Abhishek Chandra

in ACM EdgeSys 2021


Exploiting Data Heterogeneity for Performance and Reliability in Federated Learning [Poster]

Yuanli Wang, Dhruv Kumar, Abhishek Chandra

in ACM/IEEE SEC 2020


DeCaf: Iterative Collaborative Processing over the Edge [Paper]

Dhruv Kumar, Aravind Alagiri Ramkumar, Rohit Sindhu, Abhishek Chandra

in USENIX HotEdge 2019


A TTL-based Approach for Data Aggregation in Geo-distributed Streaming Analytics [Paper]

Dhruv Kumar, Jian Li, Abhishek Chandra, Ramesh Sitaraman

in ACM SIGMETRICS 2019


TTL-based Approach for Data Aggregation in Geo-Distributed Streaming Analytics [Poster]

Dhruv Kumar, Jian Li, Abhishek Chandra, Ramesh Sitaraman

Poster at USENIX OSDI 2018

Education

University of Minnesota-Twin Cities, Minneapolis, United States

PhD in Computer Science (September 2017 - Current)

CGPA: 4.0/4.0

3M Science and Technology Fellowship.

Birla Institute of Technology and Science, Pilani, India

B.E.(Hons.) in Computer Science (August 2010 - May 2014)

CGPA: 9.92/10

Rank 1 in Class of 2014 of Computer Science, comprising 120 students.

Rank 3 in Class of 2014 of BITS-Pilani, comprising 800 students.

Work Experience

Google Cloud, Sunnyvale, California

PhD Intern (June 2019 - August 2019)

Project: Learning to prefetch data for disk servers

  • Identified an efficient strategy for constructing ground truth and proposed a deep-neural-network-based architecture for predicting which data to prefetch.
  • Initial evaluation showed an improvement of up to 20% in cache hit rates with the proposed model over the existing approach.


Goldman Sachs, Bengaluru, India

Quantitative Analyst and Software Engineer (November 2014 - April 2016)

  • Improved the efficiency of the risk-management system by suggesting improvements to the SQL queries sent to the Sybase IQ database.
  • Assisted in migrating from the Sybase IQ database to the MemSQL database for faster access.
  • Implemented an H2-database-based server to allow real-time updates to the tables residing on the servers.
  • Learned about real-world use cases of databases.


Other Publications

Simulating Landscape Hydrologic Connectivity in a Precise Manner Using Hydro-Conditioning

Rallapalli Srinivas, Matt Drewitz, Joe Magner, Ajit Pratap Singh, Dhruv Kumar, Yashwant Bhaskar Katpatal

In Advances in Computational Modeling and Simulation. Lecture Notes in Mechanical Engineering. Springer, Singapore. 2022


Hydro-climatic optimized machine learning modeling for water quality assessment of a river based on uncertainty and sensitivity analysis

Srinivas Rallapalli, Ananya Jain, Dhruv Kumar

Under submission in Journal of Cleaner Production, Elsevier, 2021.


Grid-R-tree: a data structure for efficient neighborhood and nearest neighbor queries in data mining

Poonam Goyal, Jagat Sesh Challa, Dhruv Kumar, Anuvind Bhat, Navneet Goyal, Sundar Balasubramaniam

In International Journal of Data Science and Analytics (JDSA), Springer, 2020.


An Efficient Method for Batch Updates in OPTICS Cluster Ordering

Dhruv Kumar, Poonam Goyal, Navneet Goyal.

In International Journal of Data Analysis Techniques and Strategies, 2018.

Exact, Fast and Scalable Parallel DBSCAN for Commodity Platforms

Poonam Goyal, Sonal Kumari, Ankit Sood, Dhruv Kumar, Sundar Balasubramaniam, Navneet Goyal

In International Conference on Distributed Computing and Networking (ICDCN), 2017


A Fast, Scalable SLINK Algorithm for Commodity Cluster Computing Exploiting Spatial Locality

Poonam Goyal, Sonal Kumari, Sumit Sharma, Dhruv Kumar, Vivek Kishore, Sundar Balasubramaniam, Navneet Goyal.

In IEEE International Conference on High Performance Computing and Communications (HPCC), 2016.


Parallelizing OPTICS for Commodity Clusters

Poonam Goyal, Sonal Kumari, Dhruv Kumar, Sundar Balasubramaniam, Navneet Goyal, Saiyedul Islam, Jagat Sesh Challa.

In International Conference on Distributed Computing and Networking (ICDCN), 2015.


Parallelizing OPTICS for multicore systems

Poonam Goyal, Sonal Kumari, Dhruv Kumar, Sundar Balasubramaniam, Navneet Goyal.

In ACM India Computing Conference (ACM COMPUTE), 2014.