About me
Hi there!
This is Avani Gupta.
Welcome to my webpage.
I am an AI Engineer at MBZUAI, working on multi-agent learning, LLM evaluation, and agentic systems. My research interests lie at the intersection of machine learning interpretability, reinforcement learning, and natural language processing.
======
News
2025
- March 2025 - Paper on Prototype-Guided Backdoor Defense accepted to ICCV 2025.
- February 2025 - Paper on Controllable Concept-Guided Style Transfer accepted to ICVGIP 2025.
- January 2025 - Work on Building Trust in Clinical LLMs accepted to EMNLP 2025.
2024
- December 2024 - Presented work on multi-agent learning and evaluation pipelines at the MBZUAI Research Office.
- February 2024 - Paper Predicting Business Process Events Under Anomalous IT Errors published at CODS-COMAD 2024.
- January 2024 - Contributed to release of Med42 (70B Clinical LLM) on HuggingFace.
2023
- September 2023 - Paper Concept Distillation: Leveraging Human-Centered Explanations accepted to NeurIPS 2023.
- September 2023 - Successfully defended my M.S. thesis at IIIT Hyderabad.
2022
- December 2022 - Received Best Paper Award and Oral Presentation at ICVGIP 2022.
- October 2022 - Paper CitRet: Cited Text Span Retrieval accepted to COLING 2022.
======
Work Experience
Jun 2025 – Present: AI Engineer, Research Office - MBZUAI (Abu Dhabi, UAE)
- Proposed a novel multi-agent reinforcement learning (MARL) training framework (ongoing).
- Translated real-world industry challenges into research problems and connected them with faculty collaborators.
- Built intelligent agents for the Research Office: Email Assistant (prioritization, extraction, drafting, calendar integration) and automated newsletter-generation tools.
- Designed LLM-judge pipelines for QA evaluation (correctness, safety, bias) and developed persona-conditioned synthetic data generation systems for multi-persona evaluations.
Apr 2024 – Jun 2025: AI Engineer - Stealth AI Startup (Abu Dhabi, UAE)
- Trained an end-to-end LLM from scratch: data curation, pre-training, supervised fine-tuning, RLHF.
- Designed a novel attention mechanism and created pipelines for large-scale synthetic data generation.
- Built a production-grade AI Assistant with advanced multimodal retrieval (RAPTOR summaries, self-re-RAG using LangChain + FastAPI).
- Integrated Azure OpenAI + Google Cloud for scalable deployment.
- Implemented robustness features including content moderation and jailbreak detection.
Mar 2023 – Mar 2024: Research Associate - G42 Healthcare (Abu Dhabi, UAE)
- Trained a foundation model for patient procedure, diagnosis, and medication prediction.
- Achieved strong results in chronic disease identification, personalised medicine, mortality and readmission prediction.
- Built and orchestrated the training dataset for Med42 (10M+ clinical documents).
- Outcomes: Med42 released on HuggingFace; co-authored the accompanying arXiv paper.
2021 & 2022: Research Intern - IBM Research (Bangalore, India)
- Worked on forecasting and handling IT errors in business processes - published at CODS-COMAD 2024.
- Built a system for Goal-Oriented Next Best Action Prediction using Deep RL.
- Submitted a paper and filed a US patent application (now in the final approval stage).
May 2020 – Mar 2023: Researcher - CVIT, IIIT Hyderabad
- Worked on ML interpretability with Prof. P. J. Narayanan.
- Developed concept-based model evaluation and training methods.
- Proposed losses for aligning deep models with human-centered abstract concepts.
- Applied interpretability to debiasing (age–gender classification) and to reconstruction tasks.
- Worked on Neural Rendering, NeRF, ray tracing, and 3D scene reconstruction.
Jan 2020 – Jan 2021: Independent Study Researcher - CVIT, IIIT Hyderabad
- Worked on realistic human body reconstruction and temporal stability with Prof. Avinash Sharma.
Jun – Jul 2020: Crew Member and Mentee - Microsoft Mars Colonization Program
- Built an automated Mars rover game with an agent-centric design.
- Implemented multiple path-finding algorithms (A*, Dijkstra, IDA*, JPS) and collaborative learning agents.
- Applied TSP-style optimization for multi-target navigation.
Jan 2020 – May 2020: Applied Deep Learning & Software Engineering Intern - Scrapshut
- Built a URL authenticity checker using Angular + Django.
- Trained multiple ML models (LSTM, CNN, XGBoost, Passive-Aggressive Classifier) on fake-news datasets.
Nov 2019 – Jan 2020: RL Researcher - Robotics Research Centre
- Implemented RL algorithms (MC, PPO, TRPO, DDPG) from scratch.
- Used OpenAI Gym, RLlib, and Vowpal Wabbit, with simulators such as Gazebo and MuJoCo.
