Students' Final Projects for 2025 Fall UVa CS -ML-Undergraduate

Related: Students' projects codebase from this course's past offerings in 2020 and 2019

The course website: https://qiyanjun.github.io/2025Fall-UVA-CS-MachineLearningDeep/

Index of the students' team projects:

As a class, students built everything from AI systems for medical image analysis and financial fraud detection to local LLM-powered assistants, trash-sorting computer vision models, DJ song-mixing recommenders, chess rating predictors, and even outfit recommendation apps.

Healthcare & Medical AI

Index	Keywords	Video	Summary
t03	ML_gradCam		This project utilizes Grad-CAM to analyze and compare deep learning models like ResNet50, DenseNet121, and a Simple CNN for diagnosing lung diseases from chest X-rays, aiming to increase clinical trust by visualizing the specific features the models focus on.
t06	DiabetesForecasting	link	This project explores the use of machine learning models, specifically LSTM and Random Forest, to forecast diabetes progression by analyzing complex patient health datasets to improve predictive accuracy and clinical decision-making.
t13	CSMedicalImage		This project trains a ResNet-50 CNN on the RSNA chest X-ray dataset to classify pneumonia vs. normal lungs and uses Grad-CAM heatmaps to make predictions interpretable for clinical decision support.
t15	ML-skinI	link	This project proposes using EfficientNet-B0 to classify skin lesions from the HAM10000 dataset, achieving an accuracy of 88% with an 82.92% reduction in parameters compared to traditional ResNet50 models.
t26	MLBrain_tumor		This project trains a 2D U-Net on multimodal BraTS MRI scans to automatically segment glioblastoma subregions (necrotic core, edema, and enhancing tumor) at the pixel level with very high accuracy, reducing manual effort and bias in brain tumor delineation.

Finance & Risk Analytics

Index	Keywords	Video	Summary
t10	MarketMinds-Headlines-to-Returns	link	This project explores the use of FinBERT contextual embeddings compared to traditional market momentum indicators for predicting next-day DJIA movements, ultimately rejecting the hypothesis as complex NLP models underperformed simple technical baselines.
t11	FairCreditPredictionML		This project develops a production-ready, fairness-aware credit scoring system that uses CDI-based proxy groups, reweighing, group-specific thresholds, and SHAP/LIME explanations with full monitoring infrastructure to make more equitable and transparent credit card approval decisions.
t16	CreditCardFraud	link	This project builds and compares a class-imbalance-aware credit card fraud detector using logistic regression (fast baseline with balanced weights) versus a Keras neural network (nonlinear classifier), showing the neural network achieves a far better precision–recall tradeoff on the highly imbalanced European transactions dataset.
t20	FraudulentAccount	link	This project builds a fraud detection pipeline using a balanced random forest trained on a highly imbalanced 1M-row Base Application Fraud dataset, tuned via precision–recall tradeoffs and SHAP analysis to better flag fraudulent bank account applications while controlling review costs.
t24	StockPrice	link	This research project used the FinBERT model to analyze the sentiment of over 13,000 financial headlines and found a very weak correlation (less than 0.02) between news sentiment and daily stock price changes.
t27	ETF-risk	link	This project utilizes an interpretable Logistic Regression model and SHAP analysis to provide early-warning signals for dividend instability risk in ETFs, achieving roughly 74% accuracy in predicting whether dividends will fail to grow over a forward-looking 12-month period.

NLP, LLMs & Education Assistants

Index	Keywords	Video	Summary
t07	Local-RAG-Vector-Search-System	link	This project implements a fully local RAG system that indexes user documents with sentence-transformer embeddings and FAISS, then uses an on-device LLaMA model to provide privacy-preserving, semantically grounded document question answering with cited sources.
t12	ML_major-news		This project builds a Reddit-based pipeline that scores post-title sentiment with a tuned TF-IDF + Linear SVM, detects "major events" as weeks with unusually high engagement (robust z-scores over comments/upvotes), and compares pre/during/post sentiment shifts across subreddits to surface interpretable, event-centric insights.
t17	CSAIassistant	link	This project builds a UVA AI Course Assistant that unifies SIS, HoosList, RateMyProfessor, and TheCourseForum data in a vector-backed Gemini chatbot to provide students with personalized course recommendations, schedule planning, and advisor-style guidance with memory.
t18	MicroProgram		This project explores the use of large language models to automatically generate and debug micro-programs for hardware with restricted instruction sets and limited resources, demonstrating that GPT-5 can achieve 100% accuracy in correcting incomplete low-level control sequences when provided with a reference implementation.
t19	canvasGPT	link	CanvasGPT is an Electron desktop app that connects to a student's Canvas account to automatically discover and ingest course data (including linked external sites), unify and semantically index the otherwise unstructured content, and deliver intelligent retrieval plus proactive deadline/update alerts—optionally exposed through an MCP LLM interface for homework assistance.

Computer Vision & Image Processing (Non-Medical)

Index	Keywords	Video	Summary
t02	ML_YoloTrash	link1,link2	This project fine-tunes a YOLOv8-seg model on a synthetically-generated dataset derived from TrashNet to perform real-time instance segmentation of recyclable objects, specifically plastic, in complex real-world environments.
t05	Human-In-the-LoopRL-ImageSynth	link	This project builds a human-in-the-loop reinforcement learning system that learns an individual artist's aesthetic preferences from 1–5 ratings (via Q-learning, Deep Q-learning, and PPO) to steer image generation toward their personal style and increase artist autonomy.
t14	InventoryMonitor	link	This project builds a camera-based fridge monitoring system using a custom CNN and database-backed web app to track items and expiration dates, then generate recipe suggestions and nutrition macros to reduce household food waste.
t21	ML_Image_Colorization_Presentation	link	This project explores image colorization by using a GAN with a U-Net encoder-decoder architecture to infer realistic color channels from grayscale inputs, finding that while the model achieves low L1 loss, objective metrics like SSIM often misalign with the subjective visual quality of the results.
t25	DrunkDriver		This project builds a CNN-based drunk-driving detection pipeline that extracts and crops faces from sober/drunk videos into frames, trains a multi-layer convolutional classifier, and achieves ~95% accuracy as a fast, less-intrusive intoxication screening tool.
t30	Mushroom	link	This project explores the use of CNNs, specifically a baseline CNN and MobileNetV2, to classify mushroom images as edible or poisonous, ultimately achieving an overall test accuracy of 80.67% and a poisonous mushroom recall of approximately 84%.

Sports, Entertainment & Lifestyle

Index	Keywords	Video	Summary
t04	ML_soccerplayer	link	This project utilizes a Random Forest machine learning model trained on FBREF soccer statistics to identify suitable player replacements by analyzing key playing-style attributes and historical performance metrics.
t09	ChessElo	link	This project develops a machine learning regression model that utilizes Stockfish engine evaluations to analyze PGN files and predict chess players' Elo ratings with an average accuracy within 170 points of the actual result.
t22	SA-musicians	link1,link2,link3,link4,link5	This project builds a multi-dimensional artist profiling pipeline that combines transformer-based lyric sentiment, Sentence-BERT theme clustering, cross-platform public-perception sentiment with a RAG QA layer, and SARIMAX forecasting of Spotify popularity trends for six top music artists.
t28	DJ_Mixing_Recommendation_Final	link	This project builds a DJ song-transition recommender that uses Spotify audio features plus DJ mixing rules (±6 BPM beatmatching, Camelot key compatibility, and energy flow) and compares a rule-based filter, an audio-similarity baseline, and an XGBoost hybrid model to rank the top 10 most "mixable" next tracks for any given song.
t31	ML-outfit		This project builds Bundle Buddy, a Random Forest–based, feedback-driven system that uses weather API data and user-reported activity and comfort history to personalize daily outfit recommendations with about 87% accuracy.

Industrial Engineering & ML Theory

Index	Keywords	Video	Summary
t01	ML_Anomaly	link	This project explores and compares unsupervised and semi-supervised hybrid LSTM models, such as LSTM-OC-SVM and LSTM-DBSCAN, to detect energy consumption anomalies in buildings using data-driven analytical techniques.
t08	Autodifferentiation	link	This project explains and implements reverse-mode automatic differentiation by building a small autodiff library, demonstrating it on simple computational graphs, and benchmarking it on a neural network against industry tools.
t23	ML_EngineFailure	link	This project uses the NASA C-MAPSS turbofan dataset to train and compare Random Forest and SVM classifiers that flag aircraft engines within five cycles of failure from time-series sensor and operating-condition data with ~99% accuracy, enabling earlier and more interpretable maintenance decisions.
t29	When_Does_ML_Fail_Presentation		This project systematically stress-tests logistic regression, random forest, and MLP classifiers on the UCI Adult Income dataset under feature noise, label corruption, and distribution shift to reveal how common ML models fail and why accuracy alone can mask their brittleness.

Guide to students: How to PR?

If you haven't submitted your project code yet, please follow the instructions below to upload your work to the course repository.

Step 1: Set up your local branch

Go to the course repository and click Fork: https://github.com/Qdata4Capstone/uva-machine-learning-25f-projects
Go to your new forked repository and clone it to your local environment:
- git clone https://github.com/<your-username>/uva-machine-learning-25f-projects.git
Navigate into the cloned folder and add the original repository as an upstream remote:
- git remote add upstream https://github.com/Qdata4Capstone/uva-machine-learning-25f-projects.git
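
Once the upstream remote from Step 1 is in place, it can also be used to pull in work that other teams have merged since you forked. The following is a minimal sketch, not part of the official instructions: the helper name `sync_with_upstream` is hypothetical, and it assumes the course repository's default branch is `main` (the branch pushed in Step 3).

```shell
# Hypothetical helper (an assumption, not a course requirement): bring a local
# branch up to date with the course repository.
# Usage: sync_with_upstream [branch]   (branch defaults to main)
sync_with_upstream() {
  branch="${1:-main}"
  # List configured remotes; both origin (your fork) and upstream should appear.
  git remote -v || return 1
  # Download new commits from the course repository without touching local files.
  git fetch upstream || return 1
  # Switch to the local branch, then merge the upstream changes into it.
  git checkout "$branch" || return 1
  git merge "upstream/$branch"
}
```

Run `sync_with_upstream` from inside the cloned folder before you start adding files, so that your pull request is based on the latest state of the course repository.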

Step 2: Prepare your code

For each team, please create a folder named team-XX corresponding to your team ID (e.g., team-01, team-11).
Inside this folder, include the following:
- src/: A subfolder containing all source code.
- data/: A subfolder with the data required to reproduce results.
  - Note: If the data cannot be uploaded, include a markdown file describing how to collect it.
- requirements.txt: A file listing required packages. (Format reference)
- README.md: A markdown file describing the folder content. You can view an example here. Your README should include:
  - Project Title
  - Team ID and Members
  - Overview: A brief introduction to the project.
  - Usage: How to run the code to get core results.
  - (Optional) Setup: Instructions for environment setup (if non-trivial).
  - (Optional) Video: A link to your demo video with a brief description.
You are also welcome to include additional files or documentation in the folder or README.md if they help people better understand your project and code.
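
The layout described in Step 2 can be scaffolded from the command line. This is a sketch under stated assumptions: `team-XX` is the placeholder from the instructions and should be replaced with your own folder name, and the README section headings simply mirror the list above rather than any required template.

```shell
# Placeholder folder name from the instructions; replace with your team's folder.
TEAM_DIR="team-XX"

# Create the source and data subfolders (no error if they already exist).
mkdir -p "$TEAM_DIR/src" "$TEAM_DIR/data"

# Create requirements.txt if it is missing; touch leaves existing content intact.
touch "$TEAM_DIR/requirements.txt"

# Write a README skeleton only if no README.md exists yet.
[ -e "$TEAM_DIR/README.md" ] || cat > "$TEAM_DIR/README.md" <<'EOF'
# Project Title

## Team ID and Members

## Overview

## Usage

## Setup (optional)

## Video (optional)
EOF
```

Git does not track empty directories, so `src/` and `data/` only appear in your pull request once each contains at least one file, for example the markdown file describing how to collect the data.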

Step 3: Upload your code

Commit your changes (no requirements on the commit message)
- git add .
- git commit -m "upload project code by Team-XX"
Push the changes to your fork
- git push origin main
On GitHub, navigate to your fork and open a pull request via: Pull requests → New pull request
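
Before committing, it can help to confirm that the team folder contains the items listed in Step 2. The check below is an illustrative sketch: the function name `check_team_folder` is hypothetical, and it only tests that the two subfolders and two files exist, not that their contents are complete.

```shell
# Hypothetical helper: report which of the items listed in Step 2 are missing.
# Usage: check_team_folder team-XX
# Returns 0 when everything is present, 1 otherwise.
check_team_folder() {
  dir="$1"
  missing=0
  # Required subfolders.
  for d in src data; do
    if [ ! -d "$dir/$d" ]; then
      echo "missing folder: $dir/$d"
      missing=1
    fi
  done
  # Required files.
  for f in requirements.txt README.md; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing file: $dir/$f"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_team_folder team-XX && git add .` stages the changes only when nothing is reported missing.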

