About me

Hi there!
This is Avani Gupta.
Welcome to my webpage.

I am an AI Engineer at MBZUAI, working on multi-agent learning, LLM evaluation, and agentic systems. My research interests lie at the intersection of machine learning interpretability, reinforcement learning, and natural language processing.

======

News

2025

  • March 2025 - Paper on Prototype-Guided Backdoor Defense accepted to ICCV 2025.
  • February 2025 - Paper on Controllable Concept-Guided Style Transfer accepted to ICVGIP 2025.
  • January 2025 - Work on Building Trust in Clinical LLMs accepted to EMNLP 2025.

2024

  • December 2024 - Presented work on multi-agent learning and evaluation pipelines at the MBZUAI Research Office.
  • February 2024 - Paper "Predicting Business Process Events Under Anomalous IT Errors" published at CODS-COMAD 2024.
  • January 2024 - Contributed to the release of Med42 (70B clinical LLM) on Hugging Face.

2023

  • September 2023 - Paper "Concept Distillation: Leveraging Human-Centered Explanations" accepted to NeurIPS 2023.
  • September 2023 - Successfully defended M.S. Thesis at IIIT Hyderabad.

2022

  • December 2022 - Received the Best Paper Award and an oral presentation at ICVGIP 2022.
  • October 2022 - Paper "CitRet: Cited Text Span Retrieval" accepted to COLING 2022.

======

Work Experience

Jun 2025 – Present: AI Engineer, Research Office - MBZUAI (Abu Dhabi, UAE)

  • Proposed a novel multi-agent reinforcement learning (MARL) training framework (ongoing).
  • Translated real-world industry challenges into research problems and connected them with faculty collaborators.
  • Built intelligent agents for the Research Office: Email Assistant (prioritization, extraction, drafting, calendar integration) and automated newsletter-generation tools.
  • Designed LLM-as-judge pipelines for QA evaluation (correctness, safety, bias) and developed persona-conditioned synthetic data generation systems for multi-persona evaluations.

Apr 2024 – Jun 2025: AI Engineer - Stealth AI Startup (Abu Dhabi, UAE)

  • Trained an end-to-end LLM from scratch: data curation, pre-training, supervised fine-tuning, RLHF.
  • Designed a novel attention mechanism and created pipelines for large-scale synthetic data generation.
  • Built a production-grade AI Assistant with advanced multimodal retrieval (RAPTOR summaries, self-re-RAG using LangChain + FastAPI).
  • Integrated Azure OpenAI + Google Cloud for scalable deployment.
  • Implemented robustness features including content moderation and jailbreak detection.

Mar 2023 – Mar 2024: Research Associate - G42 Healthcare (Abu Dhabi, UAE)

  • Trained a foundation model for patient procedure, diagnosis, and medication prediction.
  • Achieved strong results in chronic disease identification, personalised medicine, mortality and readmission prediction.
  • Built and orchestrated the training dataset for Med42 (10M+ clinical documents).
  • Outcomes: Med42 released on Hugging Face; co-authored the accompanying arXiv paper.

2021 & 2022: Research Intern - IBM Research (Bangalore, India)

  • Worked on forecasting and handling IT errors in business processes - published at CODS-COMAD 2024.
  • Built a system for Goal-Oriented Next Best Action Prediction using Deep RL.
  • Submitted a paper and filed a US patent application (now in the final approval stage).

May 2020 – Mar 2023: Researcher - CVIT, IIIT Hyderabad

  • Worked on ML interpretability with Prof. P. J. Narayanan.
  • Developed concept-based model evaluation and training methods.
  • Proposed losses for aligning deep models with human-centered abstract concepts.
  • Applied interpretability to debiasing (age–gender classification) and to reconstruction tasks.
  • Worked on Neural Rendering, NeRF, ray tracing, and 3D scene reconstruction.

Jan 2020 – Jan 2021: Independent Study Researcher - CVIT, IIIT Hyderabad

  • Worked on realistic human body reconstruction and temporal stability with Prof. Avinash Sharma.

Jun – Jul 2020: Crew Member and Mentee - Microsoft Mars Colonization Program

  • Built an automated Mars rover game with an agent-centric design.
  • Implemented multiple path-finding algorithms (A*, Dijkstra, IDA*, JPS) and collaborative learning agents.
  • Applied TSP-style optimization for multi-target navigation.

Jan 2020 – May 2020: Applied Deep Learning & Software Engineering Intern - Scrapshut

  • Built a URL authenticity checker using Angular + Django.
  • Trained multiple ML models (LSTM, CNN, XGBoost, Passive-Aggressive Classifier) on fake-news datasets.

Nov 2019 – Jan 2020: RL Researcher - Robotics Research Centre

  • Implemented RL algorithms (MC, PPO, TRPO, DDPG) from scratch.
  • Used OpenAI Gym, RLlib, and Vowpal Wabbit, along with simulators such as Gazebo and MuJoCo.