Current Projects

Check out what we're currently working on.

Kaggle March Machine Learning Mania 2018

The club will use NCAA Men's Basketball data from 1985-2017 to build and test models that predict each team's win probability in every possible head-to-head matchup of the 2018 NCAA Division I Men's Basketball Tournament. The predictions will be compared to the actual results of the 2018 tournament and graded by the average log loss across all games played. A win probability must be submitted for every possible matchup, even though only 63 games are actually played.
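As a rough illustration of the grading described above, the average log loss can be sketched as follows. The function name `average_log_loss` and the clipping constant `eps` are assumptions made for this example; they are not taken from the competition's scoring code.

```python
import math

def average_log_loss(predictions, outcomes, eps=1e-15):
    """Average log loss over a set of games.

    predictions: predicted probability that the first-listed team wins
    outcomes:    1 if that team actually won, 0 otherwise
    eps:         hypothetical clipping constant that keeps log() finite
    """
    total = 0.0
    for p, y in zip(predictions, outcomes):
        # Clip so that a prediction of exactly 0 or 1 cannot produce log(0).
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(predictions)
```

A lower score is better. Predicting 0.5 for every game scores ln 2 (about 0.693), and a confident prediction that turns out wrong is penalized heavily.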

In 2017, our two Sports Analytics club teams placed 67th and 356th out of 442 submissions.

NFL Player Draft/Success Correlation

This project will attempt to build a statistical metric for judging a player's "success," similar in goal to Sports Reference's Approximate Value statistic. We will then look at the correlation between draft position and success to answer the question "Are NFL players with a high draft position more successful, on average, than those drafted in later rounds?" and create a model that predicts player success from draft position. This model could also predict performance in a variety of individual player statistics.
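One simple way to measure the relationship described above is a Pearson correlation coefficient between draft position and the success metric. The sketch below is only an illustration; the function name `pearson_correlation` is an assumption, and the project may use a different measure, such as a rank correlation.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold draft positions (1 for the first pick) and `ys` the success metric for the same players. A value near -1 would mean that earlier picks tend to be more successful, and a value near 0 would mean that draft position tells us little.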

NFL Strategy of Playcalling Extra Point vs. Two Point Conversion

This project takes an in-depth look at the playcalling of two-point conversions and extra points. Since the 2015 NFL rule change, which moved the extra point back to the 15-yard line, teams have been going for two-point conversions more often. We explore the distributions and expected value of both post-touchdown options based on data gathered from NFL games.

Results Thus Far: After comparing results leaguewide, we concluded that both strategies yield about the same number of points on average, but their standard deviations differ enough that the extra point is the safer decision, despite its decreased accuracy in the wake of the 2015 rule change.

What's Next: We plan to break down both strategies for each NFL team and determine whether we can draw meaningful conclusions on a team-by-team basis.
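The comparison of expected value and standard deviation can be sketched by treating each attempt as a single success-or-failure trial. The function name `attempt_stats` is an assumption for this example, and the success rates in the usage lines are placeholder values for illustration only, not figures from our data.

```python
import math

def attempt_stats(points, success_rate):
    """Expected points and standard deviation of one post-touchdown attempt.

    The attempt is modeled as a Bernoulli trial worth `points` on success
    and 0 on failure (a simplifying assumption that ignores rare outcomes
    such as defensive returns).
    """
    expected = points * success_rate
    std_dev = points * math.sqrt(success_rate * (1 - success_rate))
    return expected, std_dev

# Placeholder success rates, for illustration only.
extra_point = attempt_stats(points=1, success_rate=0.94)
two_point = attempt_stats(points=2, success_rate=0.48)
print("Extra point (expected, std dev):", extra_point)
print("Two-point try (expected, std dev):", two_point)
```

Under this model, two options can have nearly identical expected points while the two-point try has a much larger standard deviation, which is the sense in which the extra point is the safer choice.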

Basketball Project

Details coming soon.

Soccer Project

Details coming soon.

Past Projects

Check out what we've worked on previously.

Statsketball 2017

Jason and Graham entered and placed 1st overall in the American Statistical Association Statsketball Challenge. Their presentation of results and methodology can be found here. A copy of a Wall Street Journal article talking about this project is here.

NBA Timeout Playcalling

This group is studying which NBA coaches and teams are the most effective coming out of a timeout and looking at the success rate of play(s) following a timeout call.

Comparing Men's Tennis Careers

This group has collected data on Novak Djokovic, Roger Federer, and Rafael Nadal. They are analyzing this data to compare the career trajectories of each player.

Short Pass vs. Run

This group has collected playcall data from 1995 to 2014. They will be looking at the increased use of the short pass and bubble screen in short-yardage situations. They will also be analyzing other rushing and passing trends over the same time period.

Survive and Advariance

A key component of winning bracket challenges is correctly identifying which underdog teams will eliminate nationally renowned programs. Our project takes advantage of the fact that there is enough data to simulate single games repeatedly rather than simply fit models, which allows us to better quantify the uncertainty of the outcome and identify potentially powerful underdogs. We synthesized data from multiple sources and updated our simulations on a game-by-game basis. Teams were then divided into three categories: those that are both powerful and resistant to variation; those that are powerful, yet susceptible to losing in single-game scenarios; and those that are underdogs, yet can rise to the occasion and win unexpectedly.
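The idea of repeatedly simulating a single game can be sketched as below. This is a minimal illustration that assumes each team's score follows a normal distribution with its own mean and standard deviation; that scoring model, the function name `simulate_win_probability`, and its parameters are assumptions for the example and not necessarily what the project used.

```python
import random

def simulate_win_probability(mean_a, sd_a, mean_b, sd_b,
                             n_games=10000, seed=0):
    """Estimate how often team A beats team B over many simulated games."""
    rng = random.Random(seed)  # fixed seed so results are reproducible
    wins_a = 0
    for _ in range(n_games):
        # Draw one score per team from its assumed normal distribution.
        score_a = rng.gauss(mean_a, sd_a)
        score_b = rng.gauss(mean_b, sd_b)
        if score_a > score_b:
            wins_a += 1
    return wins_a / n_games
```

With this setup, an underdog whose average score is lower but whose scoring varies more from game to game wins a single game more often than an equally rated underdog with consistent scoring. That is the kind of team the third category above is meant to capture.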


R Shiny apps are a great way to display interactive data analyses. Check out some of the ones that we've developed.

Fantasy Football Comparison

What happens when you change the scoring system?

Shot Chart Data Logger

Collect data to make your own shot chart!

March Madness Variables

Explore the correlation between variables for NCAA teams.