Volleyball Rally Outcome Prediction

An LSTM that predicts who wins a volleyball rally from its play-by-play

Built with: PyTorch, Python, LSTM, Pandas, NumPy


For my APS360 (Applied Fundamentals of Deep Learning) final project, I trained a model to predict which team wins a volleyball rally from the sequence of actions within it: serve, reception, set, attack, and block.

Because a rally is inherently sequential (each contact constrains the next), I framed it as a sequence classification problem and built a two-layer LSTM with learned embeddings for each action type. I trained it on the VREN dataset of ~1,500 annotated rallies from professional and NCAA Division I games, with a feedforward network as a baseline to beat.
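
A minimal sketch of that kind of architecture, not the project's actual code: action tokens go through a learned embedding, then a two-layer LSTM, and the final hidden state of the top layer feeds a single logit. The vocabulary, the `RallyLSTM` and `encode` names, and every dimension and the dropout rate here are assumptions chosen for illustration; the sketch also ignores which team made each contact, which a real model would need to encode.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical vocabulary; index 0 is reserved for padding.
ACTIONS = ["<pad>", "serve", "reception", "set", "attack", "block"]
ACTION_TO_ID = {name: i for i, name in enumerate(ACTIONS)}


class RallyLSTM(nn.Module):
    """Embedding -> two-layer LSTM -> one logit per rally (sizes are assumed)."""

    def __init__(self, vocab_size=len(ACTIONS), embed_dim=16, hidden_dim=32, dropout=0.3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2,
                            batch_first=True, dropout=dropout)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, tokens, lengths):
        embedded = self.embed(tokens)  # (batch, time, embed_dim)
        # Packing makes the LSTM stop at each rally's true length, not at the padding.
        packed = pack_padded_sequence(embedded, lengths.cpu(),
                                      batch_first=True, enforce_sorted=False)
        _, (h_n, _) = self.lstm(packed)  # h_n: (num_layers, batch, hidden_dim)
        return self.head(h_n[-1]).squeeze(-1)  # top layer's last state -> (batch,)


def encode(rallies):
    """Turn lists of action names into a zero-padded id tensor plus true lengths."""
    lengths = torch.tensor([len(r) for r in rallies])
    tokens = torch.zeros(len(rallies), int(lengths.max()), dtype=torch.long)
    for i, rally in enumerate(rallies):
        tokens[i, : len(rally)] = torch.tensor([ACTION_TO_ID[a] for a in rally])
    return tokens, lengths


# Two made-up rallies of different lengths.
rallies = [
    ["serve", "reception", "set", "attack"],
    ["serve", "reception", "set", "attack", "block", "set", "attack"],
]
tokens, lengths = encode(rallies)

model = RallyLSTM()
model.eval()  # disables dropout for inference
with torch.no_grad():
    logits = model(tokens, lengths)
probs = torch.sigmoid(logits)  # untrained, so these probabilities are meaningless
```

Training a model like this would typically pair the raw logits with `nn.BCEWithLogitsLoss`. A feedforward baseline has no notion of order, so it would need a fixed-size input such as per-action counts instead.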

The results:

To check that the model had learned real volleyball patterns rather than memorizing the dataset, I hand-annotated 30 rallies from the 2025 NCAA Division I Men's Final (Long Beach State vs. UCLA), a match the model had never seen. It held up at 80% accuracy (24 of 30 rallies) and a 0.891 AUC, which I was really happy with given the different teams, the different annotation style, and the tiny sample.
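
For reference, here is a small sketch of how accuracy and AUC can be computed from predicted probabilities. The labels and scores below are made up for illustration and are not the project's data; the function names and the 0.5 threshold are assumptions. AUC is computed by its pairwise definition: the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counting half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative values only: 1 means the (hypothetical) reference team won the rally.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

print(accuracy(labels, scores))  # 0.75
print(roc_auc(labels, scores))   # 0.9375
```

Because AUC depends only on how the scores rank the examples, it can come out higher than accuracy at a fixed threshold, as it does in this toy example.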

This was a solo project, and it was a great deep dive into sequence modeling, regularization in small-data regimes, and how much careful data work matters compared to raw model size.