Portfolio

School Projects

FastFrida

January 2025 - Present

Reinforcement Learning Robot Painter
Master's Student Research
CMU Robot Intelligence Group (BIG)

Skills applied:

Reinforcement Learning
Curriculum Learning
Soft Actor Critic
Computer Vision

FastFRIDA is a reinforcement learning-based system that translates images and text descriptions into expressive brushstroke sequences for robotic painting. The project builds on the FRIDA framework, replacing its slow optimization loop with a learned policy trained via imitation learning and Soft Actor-Critic (SAC). The model observes a canvas, target image, prompt, color palette, and stroke count, and outputs a probabilistic stroke distribution at each timestep. A custom reward function balances CLIP similarity, pixel accuracy, and paint efficiency. While SAC requires careful initialization from imitation learning, the project demonstrates early-stage results and lays the foundation for real-time robotic painting with future improvements including transformer-based models and causality-aware architectures.

The Actor structure is on the left and the SAC training structure is on the right.
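The reward described above can be sketched as a weighted sum. This is a minimal illustration only: the function name, the weight values, and the assumption that each term is already a scalar are hypothetical and are not taken from the FastFRIDA code.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical weights chosen for illustration, not the project's values.
    clip: float = 1.0
    pixel: float = 0.5
    paint: float = 0.1


def stroke_reward(clip_similarity, pixel_error, paint_used, weights=None):
    """Combine the three terms into one scalar reward.

    Higher CLIP similarity raises the reward; pixel error and paint
    usage lower it.
    """
    w = weights or RewardWeights()
    return (w.clip * clip_similarity
            - w.pixel * pixel_error
            - w.paint * paint_used)
```

With weights like these, two strokes that match the target equally well are ranked by how much paint they use.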

Bittle MPC

Resource Constrained Model Predictive Control for Gait Design Without Force Sensing

January 2025 - May 2025

Optimal Control and Reinforcement Learning
Class Project

Skills applied:

Model Predictive Control
MuJoCo
URDF/MJCF Debugging
Quadrupedal Locomotion

We developed a model predictive control framework for quadrupedal gait generation on the Bittle robot, a palm-sized platform powered by an ESP32 microcontroller. Our approach avoids reliance on force sensors, instead using simulation-based optimization via MuJoCo’s MJPC framework to generate stable gaits offline. These trajectories are adapted for real-time onboard execution using TinyMPC, a lightweight solver designed for resource-constrained systems. This pipeline demonstrates that efficient, stable locomotion can be achieved on low-cost hardware with limited sensing and computation.

Stable gaits were optimized using MJPC and then successfully adapted to TinyMPC in MuJoCo. The graphs show a walking gait from MJPC and a standing trajectory from TinyMPC, respectively.

NeurMPC

A Data-Driven Closed Loop Model Predictive Control Method for Selective Neural Stimulation

April 2025 - May 2025

Data Driven AI for Modeling and Control of Dynamical Systems with Applications to Neural Data
Class Project

Skills applied:

Model Predictive Control
NEURON Simulator
Interdisciplinary applications of previous knowledge

This project explored the use of Model Predictive Control (MPC) for closed-loop neurostimulation to treat treatment-resistant depression by targeting dysregulated frontal cortex activity. Motivated by research showing increased firing rates of Pyramidal (Pyr) and Parvalbumin (PV) neurons in depressive states, we developed an algorithm that runs on the NEURON simulator and optimizes biphasic stimulation waveforms to restore healthy firing patterns. Using MPC, the system predicts future neuronal states and computes optimal stimulation inputs to minimize deviation from desired activity. Results demonstrated the feasibility of this approach, with identified limitations and future directions focusing on improved biophysical modeling, real-time adaptation, and multi-neuron interactions.

The antagonistic behavior was simulated by using Bayesian optimization to find the parameters that produced the unwanted behavior in the neurons. MPC was then run to correct for this behavior.
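The closed-loop idea (predict future states, pick the stimulation that minimizes deviation from the desired activity, apply it, repeat) can be sketched on a toy model. Everything here is an assumption for illustration: the first-order firing-rate model, its parameters, and the grid search over constant inputs stand in for the NEURON simulator and the project's actual optimizer.

```python
def step(rate, u, dt=0.1, tau=1.0, baseline=20.0, gain=5.0):
    """Toy first-order firing-rate model (hypothetical, not NEURON)."""
    return rate + dt * (-(rate - baseline) / tau + gain * u)


def mpc_action(rate, target, horizon=10, candidates=None):
    """Pick the stimulation amplitude with the lowest predicted cost.

    Each candidate is held constant over the horizon, which keeps the
    search deterministic and small.
    """
    if candidates is None:
        candidates = [i / 10 for i in range(-20, 21)]  # -2.0 .. 2.0
    best_u, best_cost = 0.0, float("inf")
    for u in candidates:
        r, cost = rate, 0.0
        for _ in range(horizon):
            r = step(r, u)
            # Penalize deviation from the target rate and stimulation effort.
            cost += (r - target) ** 2 + 0.01 * u ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u


def run_closed_loop(rate0=20.0, target=12.0, steps=50):
    """Apply only the first action each step, then re-plan (receding horizon)."""
    rate, history = rate0, [rate0]
    for _ in range(steps):
        rate = step(rate, mpc_action(rate, target))
        history.append(rate)
    return history
```

Calling `run_closed_loop()` drives the toy firing rate from the elevated baseline toward the target and holds it there.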

While the results from this project were not what we had hoped for, we believe the approach has potential given some future work:

1. Model multi-neuron interactions
Better capture PV and Pyr coupling dynamics.
Modeling interactions between the types of neurons could reveal crucial control information.

2. Upgrade simulator state representation
Having a simulator that can output continuous soma voltage would allow for finer control.

Robot Learning

Unitree H1 using transformers

March 2024 - May 2024

Class Project for Intro to Robot Learning

Skills applied:

Gymnasium
Reinforcement Learning
TDMPC/DDQN/Dreamer
Transformers

In response to challenges in teaching humanoid robots complex tasks, our project aims to enhance robot performance through hierarchical learning and decision transformers. By integrating a different low-level control policy and leveraging decision transformers' ability to learn from entire trajectories, we aim to achieve successful dexterous manipulation tasks or complex whole-body control maneuvers, advancing humanoid robot capabilities.

TDMPC:
TDMPC (Temporal Difference learning for Model Predictive Control) plans over sampled trajectories in a learned latent dynamics model and uses a value function trained with temporal-difference learning to estimate returns beyond the planning horizon. This combines short-horizon planning with learned long-term value estimates.
Dreamer:
Dreamer is a model-based reinforcement learning algorithm that learns a latent dynamics model and uses it to imagine future trajectories, optimizing policies entirely in the latent space for sample-efficient learning.
DDQN:
DDQN improves upon the standard DQN by decoupling action selection and evaluation, reducing overestimation bias and enabling more stable value-based learning in discrete action spaces.
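The decoupling in DDQN comes down to how the bootstrap target is computed. Below is a minimal sketch; the function name and the use of plain lists for Q-values are illustrative choices, not code from the project.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Compute the Double DQN target for one transition.

    next_q_online: Q-values for the next state from the online network.
    next_q_target: Q-values for the next state from the target network.
    """
    if done:
        return reward
    # Select the action with the online network...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...but evaluate it with the target network.
    return reward + gamma * next_q_target[best_action]
```

Standard DQN would instead take the maximum of `next_q_target` directly, which is where the overestimation bias comes from.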

Poker AI

Poker Bot

March 2024, March 2025

Coded in 24 hours for a CMU Hackathon

Skills applied:

Gymnasium
Reinforcement Learning
Generalized Advantage Estimation (GAE)

During a hackathon hosted by the CMU Poker club, I developed a Python bot trained using reinforcement learning, specifically with Generalized Advantage Estimation (GAE), to play a variant of poker. The bot beat a baseline bot that used basic probability logic by $20,000 over 1000 rounds. This outcome underscores the efficacy of reinforcement learning in mastering strategic decision-making in dynamic and uncertain environments like poker. Moving forward, further enhancements could optimize the bot's performance for diverse gaming scenarios.


The games of poker were played using the following limited rule set:

1000 rounds of 1v1 played
$500 starting pot which reset each round
3 suits with 9 cards each, making a 27 card deck
2 cards in hand with 2 shared cards in the river

Generalized Advantage Estimation (GAE):

Generalized Advantage Estimation is used here in an actor-critic setup with two neural networks: one chooses which action is best to take given a state, and the other evaluates that action. The algorithm
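For reference, the advantage calculation that gives GAE its name can be sketched as follows. This is a generic single-episode version with illustrative names and default `gamma` and `lam` values; it is not the hackathon bot's code.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Return one advantage estimate per timestep of a single episode.

    rewards[t] and values[t] are the reward received and the critic's
    value estimate at step t. last_value is the critic's estimate for
    the state after the final step (0.0 if the episode ended there).
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages
```

Setting `lam=0` reduces this to the one-step TD error, while `lam=1` gives the full discounted return minus the value baseline.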

Network Attack Prediction

Big Data ML Project using MQTT Dataset

September 2023 - December 2023

Final Course Project for Systems and Toolchains for AI

Skills applied:

DynamoDB
SQL Database
PySpark data pre-processing
Machine Learning Algorithms

In this group project, I used the MQTT dataset from Kaggle and integrated it into a Postgres table. I then used Python to do some initial analysis on the table via PySpark and preprocessed the data to prepare it for use in a machine learning model. I then compared two separate models, Logistic Regression and Random Forest, to analyze the data. Each of the models was run and tuned using both PySpark and TensorFlow. Finally, I ran all of the models using Google Cloud Compute and compared their effectiveness at predicting malicious network traffic.

MQTTset (kaggle.com)

"The [data set] is related to a smart home environment where sensors retrieve information about temperature, light, humidity, CO-Gas, motion, smoke, door and fan with different time interval since the behaviour of each sensor is different with the others."

This dataset was chosen for its applications to the real world, where information from various sensors or inputs will need to be interpreted to glean some insight not readily available to the human eye.

Data preprocessing was done using a custom transformer pipeline in PySpark that consisted of several stages in the following order:

Imputing
Type casting
One hot encoding + string indexing
Output classification/numbering
Vector assembly for input features
Scaling
Any remaining NaNs were dropped
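The stage order above can be sketched in plain Python. This is not the PySpark pipeline itself: it is written without Spark so it runs anywhere, and the function name, argument names, and mean imputation are assumptions made for illustration.

```python
import math


def preprocess(rows, numeric_cols, categorical_col, label_col):
    """Return (features, labels), following the stage order listed above."""
    def missing(v):
        return v is None or v == ""

    # Imputing + type casting: fill gaps with the column mean, cast to float.
    means = {}
    for col in numeric_cols:
        present = [float(r[col]) for r in rows if not missing(r[col])]
        means[col] = sum(present) / len(present)
    numeric = [[means[c] if missing(r[c]) else float(r[c]) for c in numeric_cols]
               for r in rows]

    # String indexing, then one-hot encoding of the categorical column.
    categories = sorted({r[categorical_col] for r in rows})
    one_hot = [[1.0 if r[categorical_col] == name else 0.0 for name in categories]
               for r in rows]

    # Output numbering: map each class label to an integer.
    classes = sorted({r[label_col] for r in rows})
    labels = [classes.index(r[label_col]) for r in rows]

    # Vector assembly: one flat feature vector per row.
    vectors = [n + o for n, o in zip(numeric, one_hot)]

    # Scaling: standardize every column of the assembled vectors.
    for j in range(len(vectors[0])):
        col = [v[j] for v in vectors]
        mu = sum(col) / len(col)
        sd = math.sqrt(sum((x - mu) ** 2 for x in col) / len(col)) or 1.0
        for v in vectors:
            v[j] = (v[j] - mu) / sd

    # Drop any row whose feature vector still contains NaN.
    kept = [(v, y) for v, y in zip(vectors, labels)
            if not any(math.isnan(x) for x in v)]
    return [v for v, _ in kept], [y for _, y in kept]
```

In PySpark, these stages would typically map onto feature transformers chained in a `Pipeline`.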

Five different ML models were selected and tested on the vectorized data: a Logistic Regression (LR) model, a Random Forest (RF) Decision Tree model, both a shallow and a deep Neural Network (NN), and an autogenerated model created using Edge Impulse. Both the Logistic Regression and Random Forest models underwent hyperparameter tuning, where the regularization parameter and max iterations were tuned for the LR model and the maximum tree depth and number of trees were tuned for the RF model. The neural networks were tuned and ended up with the deep one consisting of 4 layers of 128 neurons, with a learning rate of 0.005 decaying at a rate of 0.995 and 20 epochs. The shallow NN consisted of two layers of 8 neurons with a learning rate of 0.05 without decay for 10 epochs. Finally, Edge Impulse AI was used to upload the data and create one final model to compare to.

Algorithm	Testing Data Accuracy	Rank
Logistic Regression	83.11%	5
Random Forest Decision Tree	86.76%	2
Deep Neural Network	83.21%	4
Shallow Neural Network	83.31%	3
Edge Impulse	95.2%	1

Tesla Model 3 Controllers

Controlling a Tesla Model 3 in Webots using various control algorithms to navigate a given track

November 2023 - December 2023

Modern Control Theory Project

Skills applied:

Python
Webots simulator
Scipy Control Package
PID Control
Pole Placement
LQR
Kalman Filter
Adaptive Control

This was a four-part project for Modern Control Theory. The objective of each part was to complete the track given only a set of waypoints and the current state of the Tesla, with the available information varying from part to part.

The only inputs we had to the model were acceleration and steering. For each controller, the control was split into a lateral (steering) and a longitudinal (accelerator) controller.

Initially, the loop needed to be completed in under 400 seconds with an average error from the track under 5 meters and a maximum error under 10 meters. These requirements became stricter for each part of the project. The controller was developed in Python and tested using a Webots simulation of the Tesla.
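As an illustration of the lateral/longitudinal split, a minimal discrete PID controller might look like the sketch below. The class, the gains, and the error definitions in the usage note are hypothetical and are not the values used in the project.

```python
class PID:
    """Minimal discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control output for the current error sample."""
        self.integral += error * dt
        # No derivative term on the first call, since there is no history yet.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

One instance could then drive steering from a heading error toward the next waypoint, and a second instance could drive acceleration from a speed error, for example `steer = lateral.update(heading_error, dt)` and `accel = longitudinal.update(speed_error, dt)`.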

PID Results

Metric	Required	Result
Lap Time	400 s	94.3 s
Max Distance off Track	10.0 m	7.38 m
Average Distance off Track	5.0 m	0.73 m

Pole Placement Results

Metric	Required	Result
Lap Time	350 s	90.5 s
Max Distance off Track	9.0 m	8.93 m
Average Distance off Track	4.5 m	3.48 m

LQR Results

Metric	Required	Result
Lap Time	250 s	85.4 s
Max Distance off Track	7.0 m	6.79 m
Average Distance off Track	3.5 m	0.84 m

EKF Results

Metric	Required	Result
Lap Time	250 s	85.9 s
Max Distance off Track	7.0 m	6.77 m
Average Distance off Track	3.5 m	0.82 m

As you can see, the LQR and EKF SLAM algorithms had almost identical results, despite the EKF SLAM algorithm not having direct access to the vehicle's states. Both of these algorithms far outperformed the PID and Pole Placement algorithms.

Personal Projects

DIY Projects

Bitcoin Price Predictor

Bitcoin Stock Price Predictor

May 2023 - August 2023

Skills applied:

TensorFlow Keras
Machine Learning
Neural Networks
API research

Data was pulled from Binance using their API and split into 70% training, 20% validation, and 10% testing data. The data was windowed so each data point consisted of the Open, Close, Maximum, and Minimum prices along with the volume every 12 hours for the past 7 days. To decide which model was best for bitcoin price prediction, I created a few separate Sequential Keras models. A linear, 3 layer Dense, 1D Convolution, LSTM, and residual LSTM model were all trained for 50 epochs. The RMSE of each model was then compared for the training, validation, and testing data sets.
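The split and windowing described above can be sketched in plain Python. The function names are illustrative, and predicting the next Close price is an assumption, since the prediction target is not stated here.

```python
import math


def chronological_split(rows, train_pct=70, val_pct=20):
    """Split time-ordered rows into train, validation, and test sets."""
    n = len(rows)
    i = n * train_pct // 100
    j = n * (train_pct + val_pct) // 100
    return rows[:i], rows[i:j], rows[j:]


def make_windows(rows, window=14, target_index=1):
    """Pair each window of candles with the value that follows it.

    window=14 corresponds to 7 days of 12-hour candles. Each row is
    (open, close, high, low, volume), so target_index=1 selects Close
    (assumed target).
    """
    samples = []
    for start in range(len(rows) - window):
        inputs = rows[start:start + window]
        target = rows[start + window][target_index]
        samples.append((inputs, target))
    return samples


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

Splitting chronologically rather than at random keeps future prices out of the training set.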

[Figure: RMSE of the Baseline, Linear, Multi-Step Dense, 1D Conv, LSTM, and Residual LSTM models]

The LSTM model fit the data best, so in the future I will attempt to use this model and the bitcoin API to buy and sell a small sum of bitcoin to test this model's effectiveness in real life.
