Kunal Pai

Kunal Pai is a computer scientist pursuing his M.S. at UC Davis, where he conducts research in the DArchR Lab. His work spans computer architecture, machine learning, and software engineering, with projects ranging from the simulation of cryogenic semiconductor and superconductor computing to language model enhancements. He has presented his findings at conferences such as ICSE and workshops such as ModSim, and has gained practical experience through roles at SiTime and humanID. With a background in a range of programming languages and frameworks, he enjoys tackling challenges in areas like LLMs and high-performance computing. Kunal is always eager to learn and contribute to the ever-evolving field of computer science.

Education

University of California, Davis
Master of Science
Computer Science
2023 - Ongoing
  • GPA: 4.0
  • Relevant Coursework: Machine Learning, Computer Security, Information Visualization, Software Engineering, Theory of Computation
University of California, Davis
Bachelor of Science
Computer Science and Engineering
2019 - 2023
  • GPA: 3.83
  • Provost Scholar, Graduated with Honors

Work Experience

    DArchR Lab @ University of California, Davis
    Graduate Student Researcher
    June 2023 - Present
    Davis, CA
    • Leading a team to deliver 10x acceleration in the simulation of cryogenic semiconductor and superconductor computing.
    • Leading a team to develop the first add-on to the gem5 simulator for quantum error correcting codes (QECC).
    • Collaborating with a team to develop an autotuning methodology to deliver 90% correlation between gem5 simulation results and hardware profiling metrics.
    • Mentoring 5 undergraduate students in the Davis Computer Architecture Lab to prepare them for graduate research.
    University of California, Davis
    Teaching Assistant
    September 2023 - December 2023
    Davis, CA
    • Assisted 180 students in a senior-level Probability & Statistical Modeling class.
    humanID
    Tech Team Lead
    January 2022 - June 2022
    Davis, CA
    • Delivered 10 completed projects with global teams, including:
      • Documentation of a Discord bot that combats spam and fake users.
      • A Django-based web application for permission management for 100 users.
    SiTime Corp.
    Technical Product Marketing Intern
    July 2021 - September 2021
    Santa Clara, CA
    • Presented a strategy to improve distributor margin management that increased profits by $250,000.
    • Conducted a market survey on optical transceivers used in AI networking to identify customers for MEMS timing chips.
    • Created Visio diagrams for the product requirements document (PRD) of a timing chip.

    Academic Projects

    Automated Frameworks of Semantic Augmentation to Improve Mathematical Word Problem Solving
    April 2024 - June 2024
    NLP, Prompting, Machine Learning
    • Improved PaLM 2 prompting accuracy on math word problems (MWPs) by 10% and fine-tuned TinyLlama accuracy by 60% through a one-shot digit-level semantics framework.
    • Introduced a novel demonstration selection model to improve the accuracy of LLMs. The model uses BLEU scores and Levenshtein distance to identify the most similar equations for one-shot examples.
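As a rough illustration of the demonstration selection idea above, the sketch below picks the closest equation from a candidate pool using Levenshtein distance alone. It is not the project's actual model: the BLEU component is omitted, and the function names and sample equations are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def select_demonstration(query_eq: str, pool: list[str]) -> str:
    """Return the pool equation with the smallest edit distance to the query."""
    return min(pool, key=lambda eq: levenshtein(query_eq, eq))


# Hypothetical equations, made up for illustration only.
pool = ["x = 3 + 4", "x = 12 * 5", "x = (8 - 2) / 3"]
best = select_demonstration("x = 7 + 9", pool)
```

Here the query differs from the first pool entry only in its two operands, so that entry would be chosen as the one-shot example.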
    The Effects of Toxicity on Disengagement in Open Source Projects
    January 2024 - March 2024
    Open Source, GitHub Mining, Data Analysis
    • Found a strong correlation ($R^2 = 0.76$) between high developer engagement in FAANG projects with larger codebases and lower levels of toxicity, offering actionable insights for community management.
    • Quantified toxic behavior by applying sentiment analysis to mined corporate and non-profit repositories, revealing that toxicity affects new developers up to 1.3x more than experienced ones.
    What is the behavior of Spectre, a speculative prediction exploit, on the various branch predictors available in the computer architecture simulator gem5?
    October 2023 - December 2023
    gem5, Spectre, Computer Security
    Collaborators: Yuyi Li, Frank Gomez
    • Demonstrated up to a 55% reduction in susceptibility to speculative execution attacks by validating design enhancements like longer training periods and minimizing biased branches for Spectre-resistant branch predictors.
    • Investigated the vulnerability of x86-based in-order and out-of-order processors to Spectre V1 attacks, revealing a strong correlation between branch predictor training periods and attack effectiveness.
    gem5 Vision
    January 2023 - June 2023
    NextJS, MongoDB, Python, JSON Schema
    • Boosted resource discovery speed by 20x with optimized search functionality across 1,200+ resources.
    • Enabled faster retrieval of resources across 20+ categories by introducing categorization and semantic versioning.
    • Enhanced accessibility for 500+ industry and academic users by integrating local/remote JSON files and MongoDB with gem5.
    QuixFolio
    March 2023 - April 2023
    ReactJS, NextJS, Material UI, GitHub Pages, GitHub Actions
    Collaborators: Parth Shah, Harshil Patel
    • Developed a portfolio website template with a focus on modularity and ease of use.
    • Implemented a dark mode toggle and a customizable theme switcher to personalize the website.
    • Deployed the website on GitHub Pages and automated the deployment process with GitHub Actions.
    OLED Paint
    May 2022
    C, Python
    Collaborators: Steven To
    • Utilized SPI and I2C to create a tilt-based paint application between the CC3200 and Adafruit OLED.
    • Created a webserver and implemented a compression algorithm to transfer 128x128 bitmaps from a board to a computer 2x faster.
    • Awarded Best Lab Project for Spring 2022.
    • Strengthened skills in using datasheet information to interact with hardware.
    UNIfy - Course Assistant
    January 2022
    Discord Bot, Python, JavaScript
    • Utilized the UC Davis Schedule Builder API to extract class timings and professors.
    • Designed a class-based hierarchical dictionary to maintain the schedules of over 100 server members across five Discord servers.
    • Extracted data from the Rate My Professor and Google Calendar APIs to add features to the bot.
    • Strengthened software design skills for understanding and solving problems in a given domain.

    Publications / Talks

    Calibration and Correctness of Language Models for Code
    conference
    Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Sushmit Jha, Premkumar Devanbu, Toufique Ahmed
    ICSE 2025
    Machine learning models often produce incorrect outputs, making reliable confidence measures essential for determining the trustworthiness of these outputs. This paper introduces a framework to evaluate and improve the calibration of code-generating models, finding that these models are generally poorly calibrated initially but can be improved using methods like Platt scaling, thereby enhancing decision-making in software engineering.
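For readers unfamiliar with Platt scaling, the sketch below shows the basic idea: fit a sigmoid `p = 1 / (1 + exp(-(a*s + b)))` that maps a raw confidence score `s` to a calibrated probability of correctness. It is a simplified illustration, not the paper's implementation: it uses plain gradient descent on log loss, and the scores, labels, and function names are made up.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            grad_a += (p - y) * s   # d(log loss)/da for this sample
            grad_b += (p - y)       # d(log loss)/db for this sample
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score: float, a: float, b: float) -> float:
    """Map a raw score to a calibrated probability of correctness."""
    return sigmoid(a * score + b)


# Hypothetical raw confidence scores and whether each output was correct (1) or not (0).
scores = [2.0, 1.5, 1.0, 0.5, 0.0, -0.5, -1.0, -1.5]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
a, b = fit_platt(scores, labels)
p_high = calibrate(2.0, a, b)
p_low = calibrate(-1.5, a, b)
```

In practice the parameters would be fit on a held-out set, so that calibration is not measured on the same examples used to fit it.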
    Potential and Limitation of High-Frequency Cores and Caches
    poster
    Kunal Pai, Anusheel Nand, Jason Lowe-Power
    ModSim 2024: Workshop on Modeling & Simulation of Systems and Applications
    The poster presentation explores the potential and limitations of high-frequency in-order and out-of-order cores and caches in modern processors, highlighting the trade-offs between speedups and bandwidth.
    Automatic semantic augmentation of language model prompts (for code summarization)
    conference
    Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
    ICSE 2024
    Adding explicit semantic facts as prompts to Large Language Models improves their performance in code summarization tasks, with notable improvements exceeding 2 BLEU and, in some cases, even surpassing 30 BLEU, demonstrating the effectiveness of this approach in enhancing code analysis and extraction of essential information.
    Validating Hardware and SimPoints with gem5: A RISC-V Board Case Study
    poster
    Kunal Pai, Zhantong Qiu, Jason Lowe-Power
    ISCA 2023: gem5 Workshop
    The poster discusses the development of a RISC-V board model (RISCVMatched) in gem5, along with a methodology for fine-tuning gem5 configurations to closely match real-life systems, resulting in more accurate hardware validation and simulation capabilities.
    gem5 Vision
    poster
    Parth Shah, Kunal Pai, Harshil Patel, Arslan Ali
    ISCA 2023: gem5 Workshop
    The gem5 Vision Project seeks to improve user-friendliness and accessibility by introducing advanced search functionality, comprehensive resource categorization, and expanded database support within the gem5 ecosystem for researchers and developers.

    Skills

    Programming Languages

    Python, C++, Java, JavaScript

    Frameworks

    React, Next.js, TensorFlow, PyTorch, Django, Flask, scikit-learn, pandas, NumPy, Matplotlib

    Tools and Technologies

    Git, Docker, MongoDB, gem5, Unix/Linux, LaTeX

    Languages

    English, Gujarati, Hindi, Spanish

    Awards

    Dean's List

    UC Davis College of Engineering
    Fall 2019

    Dean's List

    UC Davis College of Engineering
    Fall 2020

    Dean's List

    UC Davis College of Engineering
    Winter 2022

    Dean's List

    UC Davis College of Engineering
    Spring 2022

    Provost Award

    UC Davis
    2019-2023