About me

Hi, I’m Avani Gupta.

I take inspiration from human learning to build autonomous systems. My research aim is to build efficient (both sample-efficient and generalizable), interpretable, and aligned autonomous systems. I want to solve problems to make this world a better place for everyone and advance human understanding and knowledge.

Apart from work, I am a fitness enthusiast and try to go to the gym regularly. In my free time, you will find me reading philosophy and scientific studies, and watching documentaries and series. I also used to draw and paint [will post my works soon].

Here is my CV if you need one: Curriculum Vitae


News

2025

  • October 2025 — Paper on Controllable Concept-Guided Style Transfer accepted to ICVGIP 2025.
  • September 2025 — Work on Building Trust in Clinical LLMs accepted to EMNLP 2025.
  • July 2025 — Paper on Prototype-Guided Backdoor Defense accepted to ICCV 2025.

2024

  • January 2024 — Paper Predicting Business Process Events Under Anomalous IT Errors published at CODS-COMAD 2024.
  • December 2023 — Contributed to release of Med42 (70B Clinical LLM) on HuggingFace.

2023

  • September 2023 — Paper Concept Distillation: Leveraging Human-Centered Explanations accepted to NeurIPS 2023.
  • September 2023 — Successfully defended my Master’s Thesis at IIIT Hyderabad.

2022

  • December 2022 — Received Best Paper Award and Oral Presentation at ICVGIP 2022 for work on concept-based disentanglement.
  • October 2022 — Paper CitRet: Cited Text Span Retrieval accepted to COLING 2022.

Work experience

  • Jun 2025 - Present: AI Engineer, Research Office
    • MBZUAI | Abu Dhabi, UAE

      • Proposed a novel multi-agent reinforcement learning (MARL) training framework (ongoing).
      • Translated real-world industry challenges into research problems; identified and connected with faculty collaborators to combine academic and practical insights.
      • Built intelligent agents for the Research Office, including an Email Assistant (handling prioritization, information extraction, drafting, and calendar integration) and automated newsletter-generation tools.
      • Designed LLM-judge pipelines for evaluating QA (correctness, safety, bias) and developed persona-conditioned synthetic data generation systems for scalable, multi-persona evaluations.
  • April 2024 - Jun 2025: AI Engineer
    • Stealth AI Startup | Abu Dhabi, UAE

      • Trained an LLM end to end from scratch to advance the state of the art, from data curation through pre-training to post-training (supervised fine-tuning and alignment using RLHF).
      • Designed an LLM with a novel attention mechanism and built pipelines for data generation.
      • Built an AI Assistant with various tools, including advanced multi-modal retrieval, utilizing RAPTOR for document summaries and self-re-RAG (using LangChain and FastAPI).
      • Productionized the AI Assistant with Azure OpenAI and Google Cloud.
      • Tackled multiple challenges in the AI Assistant and built components such as content moderation and jailbreak-attempt flagging to ensure a robust deployed system.
  • March 2023 - March 2024: Research Associate
    • G42 Healthcare | Abu Dhabi, UAE

      • Trained a foundation model from scratch in a novel setting to predict procedures, diagnoses, and medications for patients given their medical history and demographics.
      • Used it for chronic disease identification, mortality prediction, re-admission prediction, and personalised medicine.
      • Orchestrated the training dataset (from 10M+ articles) and the evaluation of a clinical LLM. Outcomes: Med42 released on HuggingFace, and an authored paper.
  • 2021, 2022: Research Intern
    • IBM Research | Bangalore, India

      • Worked on forecasting and handling IT errors in business processes: paper
      • Research project on building a system for goal-oriented next-best-action prediction in business processes using deep reinforcement learning.
      • Submitted a paper and a U.S. patent (currently in the last stage after signing).
  • May 2020 - March 2023: Researcher
    • CVIT, IIIT Hyderabad

      • Worked on ML interpretability applied to computer vision and graphics under Professor P. J. Narayanan.
      • Developed novel interpretability-based model evaluation and training methods.
      • Used human-centered abstract concepts for model disentanglement evaluation and fine-tuning via a proposed loss function.
      • Concepts helped align the model with human understanding, thereby improving model generalization.
      • Used concepts to remove complex biases, such as age bias in gender classification, and to induce prior knowledge in a real-world reconstruction problem.
      • Also worked on neural rendering, ray tracing, and 3D reconstruction of objects and scenes. Studied the NeRF (Neural Radiance Fields) line of work. Implemented and reproduced the results of several papers in neural rendering.
  • Jan 2020 - Jan 2021: Independent Study Researcher
    • CVIT, IIIT Hyderabad

      • Worked with Prof. Avinash Sharma on 3D human reconstruction from images to create realistic digital avatars (PeelHuman).
      • Worked on animation and video generation for 3D human avatars, aiming to ensure temporal consistency and good transitions in loose clothing.
  • June - July 2020: Crew Member and Mentee
    • Microsoft | Mars Colonization Program

      • Worked on an automated Mars rover web game.
      • Developed the game in an agent-centric way.
      • Used shortest-path-finding algorithms such as Collaborative Learning Agents, A*, Dijkstra, Best-First Search, IDA*, Jump-Point Finders, and their bi-directional forms to make the AI rover navigate Mars.
      • Applied a travelling salesman algorithm so that the AI agent visits multiple destinations along the shortest path while avoiding all obstacles.
      • Built using object-oriented programming concepts, with jQuery, Raphael.js, HTML, CSS, and JavaScript.
  • Jan 2020 - May 2020: Applied Deep Learning and Software Engineering Intern
    • Scrapshut | Hyderabad

      • Developed a web app using Angular and Django where users can check the genuineness of any site by providing its URL and get other users’ reviews along with predictions from a DL model.
      • Trained various models such as LSTM, XGBoost, and CNN on three datasets: Kaggle Fake News Net, Kaggle Getting Real about Fake News, and Kaggle Fake News Prediction.
      • Also trained a passive-aggressive classifier (an online learning algorithm) and incorporated user-rated scraped reviews for real-time prediction.
  • Nov 2019 - Jan 2020: RL Researcher
    • Robotics Research Centre

      • Worked with Professor Madhav Krishna to explore several SOTA RL algorithms in Robotics and Control.
      • Implemented algorithms such as PPO, TRPO, and DDPG from scratch.
      • Also explored OpenAI Gym, RLlib, Vowpal Wabbit, and simulation engines such as Gazebo and MuJoCo for control in robotics.