Projects
![submission_Q3_A.png](https://static.wixstatic.com/media/f02714_d0abd7b50f214411b2f3b677660141ea~mv2.png/v1/fill/w_167,h_125,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/submission_Q3_A.png)
ViT Explainability and Token Analysis for Image Classification
Attention maps generated with the Chefer et al. relevance method and attention rollout illustrate how ViTs attend to different image regions. Introducing register tokens into a ViT-B model enhances feature representation, achieving over 94% accuracy on ImageNet-100. Token norm distributions are analyzed across transformer layers to assess their impact on classification performance. A comparative study between standard and modified ViT models highlights the role of register tokens in improving feature map smoothness and classification reliability.
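As a rough illustration of the rollout half of this analysis, the sketch below multiplies head-averaged attention matrices across layers after mixing in the identity for the residual path. It is a minimal sketch on random toy attention, not the project's code: the function name `attention_rollout`, the toy shapes, and the choice to average heads are assumptions, and the Chefer method (which also uses gradients) is not covered here.

```python
import numpy as np

def attention_rollout(attentions):
    """Roll out attention across layers.

    attentions: list of arrays shaped (heads, tokens, tokens), one per layer,
    where each row is a softmax distribution over tokens.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for layer_attn in attentions:
        # Average over heads (an assumption; min or max fusion is also used).
        a = layer_attn.mean(axis=0)
        # Add the identity to account for the residual connection, then
        # re-normalize so every row is still a distribution.
        a = a + np.eye(num_tokens)
        a = a / a.sum(axis=-1, keepdims=True)
        rollout = a @ rollout
    return rollout

# Toy input: 3 layers, 2 heads, 5 tokens (index 0 plays the role of CLS).
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 2, 5, 5))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

rollout = attention_rollout(list(attn))
cls_to_patches = rollout[0, 1:]  # how much CLS draws on each patch token
```

In a real ViT, `cls_to_patches` would be reshaped to the patch grid and upsampled to the image size to produce the heat map.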
​
![WhatsApp Image 2025-02-02 at 10.52.33 AM.jpeg](https://static.wixstatic.com/media/f02714_8925efa9208042fbb829a5606e8e31c4~mv2.jpeg/v1/fill/w_148,h_168,al_c,q_80,usm_0.66_1.00_0.01,enc_avif,quality_auto/WhatsApp%20Image%202025-02-02%20at%2010_52_33%20AM.jpeg)
Autonomous Navigation of a Differential Drive Robot
This project focuses on the development and programming of the MBot, a mobile robot equipped with sensors and a Jetson Nano compute module. The robot's perception capabilities were enhanced through AprilTag detection and SLAM, enabling it to map and interact with its environment.
​
![Screenshot from 2025-02-01 19-24-00.png](https://static.wixstatic.com/media/f02714_d45a04546b22471ca84c75432fb8634b~mv2.png/v1/fill/w_167,h_130,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/Screenshot%20from%202025-02-01%2019-24-00.png)
Denoising Diffusion on Two-Pixel Images
This project explores the fundamentals of Denoising Diffusion Probabilistic Models (DDPM) in a simplified two-pixel image space, which makes the learned generative distribution fully visualizable. A lightweight conditional UNet is trained to predict noise in the reverse diffusion process, incorporating sinusoidal beta scheduling and classifier-free guidance for controlled image generation.
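Two pieces of this pipeline are easy to show in isolation: the closed-form forward noising step and the classifier-free guidance blend of conditional and unconditional noise predictions. The sketch below is illustrative only. The linear beta schedule is a placeholder and not the sinusoidal schedule used in the project, the guidance convention shown is one of several in common use, and all function names are hypothetical.

```python
import numpy as np

def make_alpha_bars(betas):
    # Cumulative product of (1 - beta_t), as in the standard DDPM derivation.
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bars, noise):
    # Forward process in closed form:
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise
    a = alpha_bars[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def guided_noise(eps_cond, eps_uncond, guidance_scale):
    # Classifier-free guidance: move from the unconditional prediction
    # toward the conditional one. Scale 1.0 returns eps_cond unchanged.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Placeholder linear schedule (assumption, not the project's schedule).
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bars = make_alpha_bars(betas)

rng = np.random.default_rng(0)
x0 = np.array([0.5, -0.5])       # one two-pixel "image"
noise = rng.normal(size=2)
x_t = q_sample(x0, 999, alpha_bars, noise)  # almost pure noise at the last step
```

Because each image is just a point in the plane, scattering many `x_t` samples at different `t` shows directly how the data distribution dissolves into a Gaussian.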
​
![blockDetection.png](https://static.wixstatic.com/media/f02714_825caf97e4e54657ae90fa994f38a3c3~mv2.png/v1/fill/w_167,h_94,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/blockDetection.png)
Vision-Guided Robotic Manipulation
This project explores the development of a robotic arm capable of detecting and manipulating objects using a combination of kinematic control and computer vision. The control system is built on forward and inverse kinematics, utilizing the product of exponentials approach for precise motion planning.
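For reference, product-of-exponentials forward kinematics composes one matrix exponential per joint and then applies the home pose. The sketch below is a minimal version for revolute joints only, shown on a hypothetical planar two-link arm with unit link lengths. The screw axes, home pose `M`, and function names are assumptions for illustration, not the arm used in the project.

```python
import numpy as np

def skew(w):
    # 3x3 skew-symmetric matrix such that skew(w) @ x == np.cross(w, x).
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def twist_exp(screw, theta):
    """exp([S] * theta) for a revolute screw axis S = (w, v) with |w| = 1."""
    w = np.asarray(screw[:3], dtype=float)
    v = np.asarray(screw[3:], dtype=float)
    W = skew(w)
    # Rodrigues' formula for the rotation part.
    R = np.eye(3) + np.sin(theta) * W + (1 - np.cos(theta)) * (W @ W)
    # Translation part of the SE(3) exponential.
    G = np.eye(3) * theta + (1 - np.cos(theta)) * W + (theta - np.sin(theta)) * (W @ W)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = G @ v
    return T

def fk_poe(M, screws, thetas):
    # Space-frame PoE: T = exp([S1] t1) @ ... @ exp([Sn] tn) @ M
    T = np.eye(4)
    for S, theta in zip(screws, thetas):
        T = T @ twist_exp(S, theta)
    return T @ M

# Hypothetical planar 2R arm: both joints rotate about z, links of length 1.
screws = [
    [0, 0, 1, 0, 0, 0],    # joint 1 at the origin
    [0, 0, 1, 0, -1, 0],   # joint 2 at (1, 0, 0); v = -w x q
]
M = np.eye(4)
M[:3, 3] = [2.0, 0.0, 0.0]  # end effector at full extension

T = fk_poe(M, screws, [np.pi / 2, -np.pi / 2])
print(np.round(T[:3, 3], 6))  # end-effector position
```

Inverse kinematics can then be solved numerically on top of `fk_poe`, for example with a Jacobian-based iteration, although that step is not shown here.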
![WhatsApp Image 2025-02-03 at 5.19.31 PM.jpeg](https://static.wixstatic.com/media/f02714_cf8eee9327734a4ca08d2921e716f473~mv2.jpeg/v1/fill/w_167,h_168,al_c,q_80,usm_0.66_1.00_0.01,enc_avif,quality_auto/WhatsApp%20Image%202025-02-03%20at%205_19_31%20PM.jpeg)
Point Cloud Processing and Segmentation with PointNet
PointNet is a deep learning architecture designed for processing and segmenting 3D point cloud data. In this project, a custom dataset loader efficiently handles LiDAR-based point clouds, implementing preprocessing techniques such as random downsampling and batch collation. The network extracts feature representations using a PointNet encoder and a segmentation module that integrates local and global features for improved accuracy.
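The preprocessing and feature-fusion ideas can be sketched without a deep learning framework. The example below assumes each sample is an `(N, 3)` point array with per-point labels; the function names, the fixed size of 128 points, and the toy data are hypothetical. It shows random downsampling to a fixed size, stacking samples into a batch, and the PointNet-style concatenation of per-point features with a max-pooled global feature.

```python
import numpy as np

def random_downsample(points, labels, num_points, rng):
    # Pick num_points indices at random; sample with replacement only when
    # the cloud has fewer points than requested.
    n = points.shape[0]
    idx = rng.choice(n, size=num_points, replace=n < num_points)
    return points[idx], labels[idx]

def collate(samples, num_points, rng):
    # Bring every cloud to the same size so the batch can be stacked.
    pts, lbls = zip(*(random_downsample(p, l, num_points, rng) for p, l in samples))
    return np.stack(pts), np.stack(lbls)

def concat_local_global(point_feats):
    # point_feats: (N, C). Max-pool over points for a global descriptor,
    # then append it to every point's local feature -> (N, 2C).
    global_feat = point_feats.max(axis=0, keepdims=True)
    tiled = np.repeat(global_feat, point_feats.shape[0], axis=0)
    return np.concatenate([point_feats, tiled], axis=1)

rng = np.random.default_rng(0)
# Two toy clouds of different sizes with integer class labels.
samples = [
    (rng.normal(size=(500, 3)), rng.integers(0, 4, size=500)),
    (rng.normal(size=(50, 3)), rng.integers(0, 4, size=50)),
]
batch_points, batch_labels = collate(samples, num_points=128, rng=rng)
print(batch_points.shape, batch_labels.shape)
```

In a PyTorch pipeline the same `collate` logic would typically be passed to a `DataLoader` through its `collate_fn` argument.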
​
![bev_edited.jpg](https://static.wixstatic.com/media/f02714_2541b341065e4b3681dab9a0842234fa~mv2.jpg/v1/fill/w_167,h_165,al_c,q_80,usm_0.66_1.00_0.01,enc_avif,quality_auto/bev_edited.jpg)
Obstacle-Aware Planning on BEVFormer-Generated Environments with MPC-CBF
This work builds on BEVFormer, a state-of-the-art framework for generating a 2D bird's-eye-view representation of the environment around a vehicle. We use its outputs as the inputs to our planner, a model predictive controller augmented with control barrier functions (hence MPC-CBF) for obstacle avoidance.
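To make the barrier idea concrete, the sketch below evaluates a common discrete-time CBF condition, h(x_{k+1}) >= (1 - gamma) * h(x_k), along candidate trajectories near a circular obstacle. This is an assumed formulation for illustration: the barrier function, the value of `gamma`, the obstacle, and the function names are all hypothetical, and in an actual MPC these margins would enter the optimizer as inequality constraints instead of being checked after the fact.

```python
import numpy as np

def barrier(p, center, radius):
    # h(p) = ||p - center||^2 - r^2, positive outside the obstacle.
    d = np.asarray(p, dtype=float) - np.asarray(center, dtype=float)
    return float(d @ d - radius ** 2)

def cbf_margins(traj, center, radius, gamma):
    # Discrete-time CBF condition: h_{k+1} - (1 - gamma) * h_k >= 0.
    # Returns the left-hand side for every step; negative means violated.
    h = np.array([barrier(p, center, radius) for p in traj])
    return h[1:] - (1.0 - gamma) * h[:-1]

center, radius, gamma = (5.0, 0.0), 1.0, 0.5  # hypothetical obstacle and rate

# One path drives straight at the obstacle, the other passes 2 m to the side.
straight = [(x, 0.0) for x in np.linspace(0.0, 4.5, 10)]
detour = [(x, 2.0) for x in np.linspace(0.0, 4.5, 10)]

print("straight ok:", bool(np.all(cbf_margins(straight, center, radius, gamma) >= 0)))
print("detour ok:  ", bool(np.all(cbf_margins(detour, center, radius, gamma) >= 0)))
```

A smaller `gamma` makes the constraint more conservative, forcing the vehicle to start slowing or steering away from the obstacle earlier in the horizon.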
​