I am an AI Engineer, passionate about transforming consumer experiences with the aid of computer graphics, computer vision, and deep learning.
Hi there! I’m all about bringing ideas to life with computer vision, 3D graphics, and deep learning. From addressing real-world challenges in healthcare to creating impactful solutions, I’ve had a blast blending tech and creativity through projects and internships. I’m a big fan of AI, designing, and coding—basically, anything that lets me innovate and make life a little cooler!
University of Toronto
2023 - 2025
Vellore Institute of Technology
2019 - 2023
Feb 2025 - Current
Leading the deployment of an ML pipeline for a realistic, interactive 3D anatomical modeling system built from patient-specific CT/MRI scans. Contributing to the development of an audio-to-text component for an emergency response website using Large Language Models.
May 2024 - Dec 2024
Led the development of an interactive 3D anatomical modeling system from patient-specific CT scans to revolutionize surgical planning. Implemented an nnU-Net-based segmentation algorithm to accurately generate 3D visualizations of segmented organs. Enhanced the realism of rendered anatomical models by optimizing and applying a GAN-based texture generation algorithm.
Sep 2023 - Apr 2024
Mentored students in Python programming across diverse disciplines, simplifying complex concepts through relatable analogies and personalized problem-solving strategies. Adapted teaching methods to accommodate varied backgrounds, fostering a clear understanding for learners in management, psychology, computer science, and beyond.
May 2022 - Jul 2022
Designed a lead scoring prediction model using Random Forests, Logistic Regression, and Deep Neural Networks, achieving high accuracy in identifying potential buyers from website activity. Performed exploratory data analysis and feature engineering, and presented the key customer conversion factors visually to stakeholders.
An automated system for manga colorization and style conversion to enhance readability and ease artists' workload. Implements a Pix2Pix conditional GAN in PyTorch with a CNN-based discriminator and U-Net generator for colorizing black-and-white manga pages. Fine-tunes a pre-trained Stable Diffusion model for manga style transfer across four distinct art styles.
A Python-based 3D Gaussian Splatting segmentation model that leverages LangSAM for text-driven 3D segmentation. Incorporates an optimized prompt initialization strategy using K-means clustering for efficient view selection and point sampling. Reduces computational requirements by achieving near-optimal results with only 50% of the input data.
A computer vision system for tracking player movements and classifying badminton strokes in broadcast videos. Utilizes Particle Filter and custom jersey color detection for player tracking with 99% accuracy. Predicts badminton strokes using CNNs with 81% accuracy. Detects court boundaries through image binarization, edge detection, Hough Lines, and K-Means clustering.
A Flask-based web application with HTML and CSS for solving image-based handwritten polynomial equations. Performs image segmentation and preprocessing to isolate numerical values and symbols. Implements a CNN model using TensorFlow-Keras and OpenCV to detect handwritten numbers and symbols with 98% accuracy, enhancing usability for students.
Conference: Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) 2025
Author(s): Hoseok Seo, Anannya Popat, Caterina Masino, Sojung Kim, Han Hong Lee, Kyo Young Song, Amin Madani
Fine-tuned a DenseNet201 model to classify histologic types in early gastric cancer from endoscopic images. Preprocessed a dataset of 2,944 labeled images, achieving 93.4% training accuracy and 74.0% internal validation accuracy on default and ROI-cropped images.
To Be Published
Conference: International Conference on Machine Learning and Data Engineering (ICMLDE) 2022
Author(s): Anannya Popat, Lakshya Gupta, Gowri Namratha Meedinti, Dr. Boominathan Perumal
An image-based movie genre classification algorithm leveraging Federated Learning to ensure data privacy in the graphics industry. Designed a decentralized architecture that trains CNNs locally on distributed data and reaches 81% accuracy, reducing storage requirements and preserving privacy.
Elsevier
Journal: Multimedia Tools and Applications 2023
Author(s): Sudha SenthilKumar, K. Brindha, Jyotir Moy Chatterjee, Anannya Popat, Lakshya Gupta, Abhimanyu Verma
This paper introduces a web-based system that uses an enhanced Inception V4 CNN to recognize and solve handwritten polynomial equations (cubic, quadratic, and quintic) by determining the value of x. The model is trained on data from MathNet (arithmetic symbols), MNIST (digits), and EMNIST (alphabet characters).
Springer
Conference: Advances in Data-Driven Computing and Intelligent Systems (ADCIS) 2022
Author(s): Gowri Namratha Meedinti, Anannya Popat, Lakshya Gupta, Boominathan Perumal
Proposed a privacy-preserving approach using Federated Learning and auto-encoding to train a camera filter for generating sketched representations of images. The method ensures data security while leveraging the CUFS database for training, addressing privacy concerns in applications like medical imaging, remote sensing, and e-commerce.
Springer
Conference: Information Systems for Intelligent Systems, Proceedings of ISBM 2022
Author(s): Lakshya Gupta, Gowri Namratha Meedinti, Anannya Popat, Boominathan Perumal
Implemented a Federated Learning (FL) approach for privacy-preserving music genre classification using CNNs and the GTZAN dataset. The method ensures data discretion and copyright protection for music corporations in large-scale collaborative machine learning projects.
Springer