Some of these are linked to my public repositories; others are on unindexed repositories (they haven't been publicly released for various reasons).
Abstract: In this paper, we present a fully autonomous system that segments primary head and neck tumors as well as lymph node tumors given only FDG-PET and CT scans without contrast enhancement. With only these two modalities, the typical Dice score for state-of-the-art models lies below 0.8, lower than it would be with additional modalities, due to the low resolution of PET scans and the noise in non-enhanced CT images. Thus, we seek to improve tumor segmentation accuracy while working within the limitation of having only these two modalities. We introduce the Transfiner, a novel octree-based refinement system that harnesses the fidelity of transformers while keeping computation and memory costs low for fast inference. The observation behind our method is that segmentation errors from a well-trained model almost always occur at the edges of a mask. The Transfiner takes base network feature maps in addition to the raw modalities as input and selects regions of interest from them. These are then processed with a transformer network and decoded with a CNN. We evaluated our framework on the first task of the Head and Neck Tumor Segmentation Challenge (HECKTOR), achieving a Dice Similarity Coefficient (DSC) of 0.76426 and ranking 6th.
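To make the idea concrete, here is a highly simplified conceptual sketch of the edge-focused refinement step; the `EdgeRefiner` module, its layer sizes, and the boundary-selection trick below are my illustrative assumptions for this page, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Conceptual sketch only: refine a coarse 3D mask by running a transformer
# over feature vectors sampled at the mask's boundary voxels.
class EdgeRefiner(nn.Module):
    def __init__(self, feat_dim=64, num_heads=4):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.decode = nn.Linear(feat_dim, 1)  # stand-in for the CNN decoder

    def forward(self, features, coarse_mask):
        # Select "regions of interest": voxels near the coarse mask boundary,
        # found here by comparing the binary mask to a dilated (max-pooled) copy.
        dilated = nn.functional.max_pool3d(
            coarse_mask, kernel_size=3, stride=1, padding=1
        )
        edges = (dilated != coarse_mask).squeeze(1)       # (B, D, H, W)
        tokens = features.permute(0, 2, 3, 4, 1)[edges]   # (N_edge, feat_dim)
        refined = self.transformer(tokens.unsqueeze(0))   # one token sequence
        return self.decode(refined).squeeze(-1)           # refined edge logits

# Toy shapes: 64-channel feature map and a random binary coarse mask.
feats = torch.randn(1, 64, 8, 16, 16)
mask = (torch.rand(1, 1, 8, 16, 16) > 0.5).float()
print(EdgeRefiner()(feats, mask).shape)
```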
Link to repository

I made this as an introductory project when I started doing research at the MAIA lab at UT Southwestern. I was really interested in the Transformer architecture, so I experimented with using it in the classic U-Net architecture for medical image segmentation. The data consists of 4D CT scans of beating hearts (three spatial dimensions, one temporal), and the network outputs segmentations of the heart's components. I replaced the convolutional encoder of a classic U-Net with a Transformer and saw improved results. One drawback of using Transformers was that, due to memory limitations, I could only process 2D slices in batches rather than run a 3D Transformer on the whole volume.
Link to Colab Notebook

In FTC robotics, the robot game consists of a 30-second autonomous period followed by two minutes of driver-operated gameplay. Traditionally, the autonomous period consists solely of preplanned paths through a predetermined environment. I wanted to push the bounds of what could be done with a standard FTC robot, so I created an autonomous program that could interact with dynamic environments.
I wanted to localize game elements accurately so the robot could navigate to them. After a lot of digging online, I found a textbook on modern computer vision and used it to implement keypoint-based homography in OpenCV. I also created a customized queueing algorithm that orders the detected game elements so the robot can reach and score all of them in optimal time. After a LOT of tuning, the robot scored as fast as human drivers. I used the RoadRunner path-planning library with a customized path follower in this project.
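The notebook has the full pipeline, but the core of keypoint homography in OpenCV looks roughly like this; the reference image and file paths below are illustrative assumptions, not my actual setup.

```python
import cv2
import numpy as np

# Assumed example inputs: a reference photo of a game element and a camera frame.
reference = cv2.imread("game_element_reference.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("camera_frame.png", cv2.IMREAD_GRAYSCALE)

# Detect and describe keypoints in both images.
orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_frame, des_frame = orb.detectAndCompute(frame, None)

# Brute-force Hamming matching suits ORB's binary descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_ref, des_frame), key=lambda m: m.distance)

src = np.float32([kp_ref[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_frame[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC rejects outlier matches while estimating the homography.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Project the reference corners into the camera frame to locate the element.
h, w = reference.shape
corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
print(cv2.perspectiveTransform(corners, H))
```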
Link to Bitbucket

This was our robot designed to capture plastic waste from the sea for the 2021 FIRST Global Challenge. We settled on a passive plastic-waste collection system during the brainstorming phase, but then I suggested an active robot that could avoid collecting detritus that was part of the natural environment and instead target non-biodegradable plastic waste.
I got to work designing the electronics and training the machine-learning model for the device. The model we used was based on an existing image model, but I devised a training method that distills a text-image model, optimizing for related text (minimizing the KL divergence between CLIP's logits and my model's).
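As a rough sketch of what that objective looks like (the networks themselves are omitted; the class count, prompts hinted at in the comments, and temperature are illustrative):

```python
import torch
import torch.nn.functional as F

# Sketch of the distillation objective only. `student_logits` would come from
# my image model and `teacher_logits` from CLIP's image-vs-prompt similarity
# scores (e.g., against "a photo of plastic waste" / "a photo of sea life").
def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # 'batchmean' matches the mathematical definition of KL divergence.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Toy check with random logits for a batch of 4 images over 2 classes.
student_logits = torch.randn(4, 2, requires_grad=True)
loss = distillation_loss(student_logits, torch.randn(4, 2))
loss.backward()  # in training, only the student's parameters get updated
```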
I set up a Raspberry Pi with the necessary configuration. After testing the computer-vision model, I realized the Pi was too slow, so I bought a Google Coral ML accelerator and quantized the model so it could run on the accelerator. Byron helped with navigation and networking. This project won the 2021 FGC top award.
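The quantization step looked roughly like the following sketch of post-training full-integer quantization, the usual route to the Coral Edge TPU; the tiny stand-in model and random calibration images are placeholders for the real detector and dataset.

```python
import numpy as np
import tensorflow as tf

# Placeholder model standing in for the trained detector.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # plastic vs. not plastic
])

def representative_dataset():
    # Calibration samples let the converter pick quantization ranges.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# The Edge TPU requires all ops and I/O tensors to be int8.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())
# Final step on the command line: edgetpu_compiler model_int8.tflite
```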
Link to Github

I came up with the idea of collective CO2 tracking after watching a YouTube video on how companies got away with having an enormous carbon footprint and how undeniable it was that climate change was tied to rising CO2 levels. I was given the opportunity to expand on this idea and develop it into a more fully-fledged project during the 2022 FIRST Global Challenge.
After discussion with my team, we realized that high-quality CO2 sensors were hard to come by, so I suggested combining a variety of cheaper sensors to estimate the output of a high-quality CO2 sensor. I spent a few weeks assembling the components (temperature, humidity, pressure, and CO2 sensors) onto a Raspberry Pi. My teammates collected high-quality reference CO2 data alongside readings from the low-cost sensor array. I created a fully custom LSTM model using TensorFlow and deployed it to the Raspberry Pi. Initially, the results were subpar, so I fine-tuned the model and added corrective measures. My teammates created the dashboard.
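A minimal sketch of what such a model can look like in TensorFlow/Keras; the window length, layer sizes, and random data here are illustrative, not the deployed configuration.

```python
import numpy as np
import tensorflow as tf

# Assumed setup: windows of 60 time steps across 4 cheap sensor channels
# (temperature, humidity, pressure, low-cost CO2), predicting the
# reference-grade CO2 reading at the end of each window.
WINDOW, FEATURES = 60, 4

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),  # estimated high-quality CO2 reading (ppm)
])
model.compile(optimizer="adam", loss="mse")

# Random data standing in for the collected sensor windows and labels.
x = np.random.rand(1000, WINDOW, FEATURES).astype("float32")
y = np.random.rand(1000).astype("float32")
model.fit(x, y, epochs=10, validation_split=0.2)
```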
Not authorized to release yet

I worked on this app, Antiddiction, as a project manager and backend developer: I came up with the idea, created the backend logic for processing risk factors, and built a model for predicting the risk of future relapse. I found a dataset from SAMHSA (the Substance Abuse and Mental Health Services Administration) for addiction-relapse prediction and tried many different methods for predicting relapse from many different factors: a standard feedforward neural network, an SVM, and gradient-boosted decision trees (XGBoost). XGBoost turned out to be the most accurate on test data despite taking the least time to train.
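A hedged sketch of the winning approach; random data stands in for the actual SAMHSA-derived risk factors and relapse labels, and the hyperparameters are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Synthetic placeholder data: 20 hypothetical risk factors, binary label.
rng = np.random.default_rng(0)
features = rng.random((1000, 20))
relapsed = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    features, relapsed, test_size=0.2, random_state=42
)

model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                      eval_metric="logloss")
model.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```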
Link to Colab Notebook

My friend Neil and I made this bot with the intention of counting how many times the phrase "your mom" was said on our server. Simple at first, the bot quickly grew to reflect our ambition and urge to prove our coding prowess, picking up NoSQL databases, advanced Python OOP, ML-based language modeling, big-data processing, cloud hosting, CI/CD, and more.
We both worked on the front-end side (the UI exposed to the end user on Discord). Neil managed the database side, and I worked on almost everything to do with the ML side of the project and big-data processing. I fine-tuned GPT-2 to create a compressed representation of conversations between members of our server.
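A rough sketch of that fine-tuning loop using the Hugging Face transformers library; the two-line chat corpus, formatting, and hyperparameters are illustrative only.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Hypothetical chat export formatted as "username: message" lines.
chat_lines = ["neil: did you push the fix?", "me: your mom pushed the fix"]
batch = tokenizer(chat_lines, return_tensors="pt", padding=True)
# Ignore padding positions when computing the language-modeling loss.
labels = batch["input_ids"].masked_fill(batch["attention_mask"] == 0, -100)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(3):  # a few passes over the (tiny) example corpus
    out = model(**batch, labels=labels)  # causal LM loss: predict next token
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```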
In the end, the bot could count mentions of any word, filter spam, and generate simulated conversations between members of our server.
This project showed us how far we could take a small idea. It significantly increased our confidence and skill in coding and was what got us started in hackathons.
Link to Github

I wanted to foray into reinforcement learning with this project after seeing the success of reinforcement learning in Go and video games. I first coded a reinforcement learning algorithm from scratch using guides from the internet, but it wasn't very effective at navigating to points. I then used the Stable Baselines implementation of Proximal Policy Optimization (a fork of OpenAI's Baselines) to train my agent, and the new attempt showed significant improvements. Unfortunately, the project fell through when I realized that the powerful hardware I needed to run my algorithm on the robot was not fully compliant with the official FIRST game rules. Nevertheless, this was still a monumental project for me because of the sheer complexity and the time I spent programming and debugging, as well as how much I learned about reinforcement learning and policy optimization.
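A minimal sketch of that second attempt, written against the modern stable-baselines3 API; CartPole stands in for my custom robot-navigation environment.

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Stand-in environment; the real project used a custom navigation env.
env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

# Roll out the trained policy for one episode.
obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
```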
Link to Colab Notebook

In my freshman year, I stumbled across 3Blue1Brown's video series on neural networks and was instantly drawn to machine learning. I made this project purely out of curiosity, and I couldn't have picked a better starter project. I made everything from scratch (other than the data-loading code) using only the standard library in C++.
This way, I learned the minutiae of machine learning with nothing hidden behind an abstraction layer. I learned and implemented backpropagation, activation functions, and learning-rate schedules. I struggled to train this basic MLP to convergence on a Surface Pro, but after spending a few weeks playing with hyperparameters and adding momentum to the optimizer, I trained a model to 93 percent accuracy after several hours of training.
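The C++ source isn't shown here, but the momentum update that made the difference amounts to the following NumPy sketch; it is illustrative, not a port of my code.

```python
import numpy as np

def sgd_momentum_step(weights, grads, velocity, lr=0.01, beta=0.9):
    """One SGD-with-momentum step: velocity accumulates a decaying
    sum of past gradients, smoothing out noisy updates."""
    velocity = beta * velocity - lr * grads
    return weights + velocity, velocity

# Toy usage: minimize f(w) = ||w||^2, whose gradient is 2w.
w = np.array([3.0, -2.0])
v = np.zeros_like(w)
for _ in range(100):
    w, v = sgd_momentum_step(w, 2 * w, v)
print(w)  # approaches the minimum at the origin
```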
This ground-up project was inefficient in its implementation but essential to my deep understanding of the complex models I work with these days.
Link to Github

This small project helps users construct paths for the autonomous period in FTC robotics. It's a straightforward drag-and-drop UI built with Scratch that interfaces with the RoadRunner path-planning API. I built it in Scratch, exported it to a standalone HTML5 webpage, and hundreds of teams now use it to create their autonomous paths. This might be the only project I've done where coding in Scratch actually saved time.
With this tool, any user can create a competitive autonomous path in one to two minutes after only half an hour of training, whereas directly coding a path used to take hours, followed by hours more of testing and debugging. This project taught me that a seemingly simple approach combined with the right tools can solve problems effectively.
Link to Page

This is a system that uses time-of-flight sensors and infrared lasers to localize our robot on the FTC playing field. During the 2021-2022 FTC robotics season, field obstacles made it impractical to have unpowered wheels pressed into the ground for localization, and the encoders on the drive wheels drifted over time as the wheels slipped on the field tiles. To overcome these challenges, I came up with the idea of combining distance sensors with the IMU (an angular-rotation sensor) to figure out where the robot is. The image here shows the basic case, but I had to account for many, many edge cases. I also had to learn the inner workings of a Kalman filter for sensor fusion. To make this work, I consulted engineers from DEKA Research, who guided me in implementing the algorithm (I used Apache Commons Math for the number crunching). In the end, once edge cases (such as both lasers hitting the same wall) were accounted for, it was a success.
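A conceptual Python sketch of the core fusion step; the robot code itself was Java with Apache Commons Math, and the geometry, noise levels, and numbers here are illustrative.

```python
import numpy as np

def wall_distance(d_measured, heading_rad):
    """Convert a slanted laser reading into perpendicular wall distance:
    with the IMU giving heading theta and the time-of-flight sensor giving
    range d to a known wall, the perpendicular distance is d * cos(theta)."""
    return d_measured * np.cos(heading_rad)

def kalman_update(x, p, z, r):
    """One scalar Kalman measurement update.
    x: state estimate, p: state variance, z: measurement, r: sensor variance.
    """
    k = p / (p + r)          # Kalman gain
    x = x + k * (z - x)      # correct the estimate toward the measurement
    p = (1 - k) * p          # shrink the uncertainty
    return x, p

# Toy usage: fuse noisy laser readings of a robot 1.5 m from the wall.
x, p = 1.0, 1.0              # initial guess and its variance
for z in np.random.normal(1.5, 0.05, size=20):
    x, p = kalman_update(x, p, z, r=0.05 ** 2)
print(x)                     # converges near 1.5
```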
Link to Bitbucket