Predicting soccer goals in near real time using computer vision
In a soccer game, fans get excited seeing a player sprint down the sideline during a counterattack or when a team is controlling the ball in the 18-yard box because those actions could lead to goals. However, it is difficult for human eyes to fully capture such fast movements, let alone predict goals. With machine learning (ML), we can incorporate more fine-grained information at the pixel level to develop a solution that predicts goals with high confidence before they happen.
Sportradar, a leading real-time sports data provider that collects and analyzes sports data, and the Amazon ML Solutions Lab collaborated to develop a computer vision-based Soccer Goal Predictor to detect exciting moments that lead to goals, thereby increasing fan engagement and helping broadcasters provide viewers an enhanced experience. Most action recognition models are used to identify events when they occur, but Amazon ML Solutions Lab developed a novel computer vision-based Soccer Goal Predictor that can predict future soccer goals 2 seconds in advance of the event.
“We deliberately threw one of the hardest possible computer vision problems at the Amazon ML Solutions Lab team to see what the art of the possible could do, and I am very impressed with the results,” says Ben Burdsall, Group CTO at Sportradar. “The team built a video action recognition model to predict future soccer goals two seconds in advance using Amazon SageMaker and demonstrated its application for match intensity tracking. This has opened doors to many new business opportunities. The implementation costs and latency of this model on our production pipeline using AWS’s infrastructure also look very encouraging. After today, I have no more skepticism about the potential of computer vision in innovating our business.”
The team used Amazon SageMaker notebook instances to create a data processing pipeline that extracted training examples from raw videos and used transfer learning to fine-tune an Inflated 3D Networks (I3D) model. The results have inspired Sportradar’s data science and innovation teams to develop new statistics to embed into their broadcast videos to enhance fan engagement.
This post explains how we applied transfer learning using an I3D model towards goal prediction and used the inference to create an intensity index to quantify the likelihood of a team scoring goals. Additionally, we discuss how we constructed a momentum index to measure the change of velocity during attacks. (Attack is a soccer term used to describe the movement of the team in possession of the ball.) With the intensity index and the momentum index, we can detect whether there is an intense moment (a moment that leads to a goal) in near-real-time using live feeds, and build products to help broadcasters engage fans during broadcasts.
Processing the data and building the model
To capture these intense moments, we translated this objective into a binary classification problem: differentiating activities that lead to goals from those that do not. The samples in the positive class (goals class) are video clips that are 2 seconds away from the goals, and the ones from the negative class are clips in which players are engaged in activities that do not lead to goals (ballsafe class). We generated 1,550 clips from 398 professional soccer matches provided by Sportradar.
A lot of action can happen in a few seconds during soccer matches, so we used short video clips to train the model. For this use case, we extracted 5-second clips. A challenge with video processing is that reading multiple video streams and extracting clips sequentially can be very time-consuming, taking several hours to complete. To speed up the clip-extraction process, we created a data pipeline using multiprocessing in an Amazon SageMaker notebook using ml.c5.18xlarge instance with 72 CPUs to parallelize the I/O-heavy clip extraction process and reduced the clip-extraction time from 12 hours to under 15 minutes.
After data processing, we built a binary classification model using the I3D model from GluonCV’s model zoo. The I3D model uses 3D convolutions to learn spatiotemporal information directly from videos. Given that we did not have a large dataset, we employed transfer learning and fine-tuned the I3D model to get well-performant video models with our own data. For more information about fine-tuning and an I3D model using GluonCV, see Fine-tuning SOTA video models on your own dataset.
Using Amazon SageMaker notebook instances, we first loaded an I3D network pre-trained on the Kinetcis400 dataset into a Jupyter notebook. We then fine-tuned this network on the data from Sportradar to find the best set of parameters, especially those specific to action recognition models (e.g., number of frames, number of segments, stride for frame sampling).
We used recall as our primary metric for model evaluation because we wanted to capture near-100% goals (positive class). The following graphs depict the confusion matrix and the precision-recall curve. It can be seen that it is difficult for the model to differentiate between the two classes when we have near-100% recall. We re-calibrated the predicted probabilities to look at the model performance for achieving 80% and 90% recall for the positive class (sequence leads to a goal) respectively.
The following table shows the precision and recall of the negative class when we fix the recall of the positive class. We can see that our model can differentiate the two classes with the new settings. When we fix the recall of the positive class at 90%, we can capture 68% of the negative class samples, and the precision is 75%.
|At 80% Goal Recall
|At 90% Goal Recall
Intensity index and momentum index
After training and validation, we selected the model that gives the best recall on the validation set. We generated inferences over three full games of video using a moving window with the predicted probabilities acting as the intensity index. To measure the change of velocity during attacks, we also generated a momentum index for the current timestamp, using the slope of the linear regression line of predicted probabilities from four previous timestamps. Finally, we used min-max normalization to scale the index between -1 and 1. Therefore, the momentum index effectively measures how the predicted goal probabilities change in the recent few seconds.
The following image illustrates the model inference using a 5-second moving window on a 40-second clip. The areas that are marked red are moments when the predicted scores are signaling intense moments. The first two red bars depict a near-goal situation, a very intense moment in the game. Ultimately, the team scored a goal at the end of that clip during the third high intensity red bar.
The meter on the left side measures the momentum index from -1 to 1, and the match intensity line chart at the bottom is the goal predictions using our model. Because a lot of action can happen in 2 seconds, the model’s high goal probability predictions are still reasonable before the shots were missed.
Watch the full video:
Model performance in production
Sportradar is investing in computer vision both through internal research, development, and external partnerships. To facilitate the rapid transition of computer vision models from the lab to production and running computer vision models at scale, Sportradar has developed a near-real-time computer vision inference pipeline using AWS services. The pipeline helps ensure that the service level agreements and low latency requirements for near-real-time computer vision workloads are met in a cost-effective way by using Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Managed Streaming for Apache Kafka (Amazon MSK), and Amazon FSx for Lustre.
When deployed and tested in Sportradar’s computer vision inference pipeline, the latency for generating an inference from the goal prediction model featured in this article was around 200 milliseconds, with a total end-to-end processing latency of around 700 milliseconds.
Sportradar collaborated with the Amazon ML Solutions Lab to develop a novel computer vision-based Soccer Goal Predictor that predicts future goals with the precision of 75% while keeping the recall at 90%. We used transfer learning to fine-tune an I3D model in Amazon SageMaker to classify attacks that lead to goals as well as activities that don’t lead to goals, and used the model inference to create a momentum index to signal the intense moments in soccer. This model has been tested in Sportradar’s computer vision pipeline, and it can achieve close to real-time inference with low latency in sub-seconds. This approach can be applied to other sports in terms of key events prediction and game intensity measurement using computer vision infrastructure without relying on sensors for data collection.
If you’d like help accelerating the use of ML in your products and services, please contact the Amazon ML Solutions Lab program.
For more information about what the Amazon ML Solutions Lab is doing in the world of sports, see AWS Sports ML page.
About the Authors
Daliana Zhen Liu a Senior Data Scientist at the Amazon ML Solutions Lab. She builds AI/ML solutions to help customers across various industries to accelerate their business. Previously, she worked at Amazon’s A/B testing platform helping retail teams make better data-driven decisions.
Andrej Bratko leads a team of data scientists and engineers at Sportradar working on machine learning solutions in areas such as sports data collection, sportsbook risk management, odds trading and sports integrity. His team is also responsible for Sportradar’s big data and analytics infrastructure supporting machine learning model development and deployment. He holds a PhD in machine learning from the University of Ljubljana, Slovenia.
Jure Preve is a Data Scientist at Sportradar working mostly on risk management in sports betting. Besides his primary focus, he has a wide interest in different applications of machine learning and enjoys working on state-of-the-art solutions to solve complex business problems.
Luka Pataky leads the Innovation Team at Sportradar – pioneering new technologies and products. One of the key innovation projects is computer vision, where he leads a team of data and computer vision engineers working on solutions to collect sports data and gather deeper insight into games. His team also works on other projects related to applying emerging technologies to products and processes, as well as projects that look at reinventing the way of how fans engage with sports and data.
Mehdi Noori is a Data Scientist at the Amazon ML Solutions Lab, where he works with customers across various verticals, and helps them to accelerate their cloud migration journey, and to solve their ML problems using state-of-the-art solutions and technologies.
Suchitra Sathyanarayana is a manager at Amazon ML Solutions Lab, where she helps AWS customers across different industry verticals accelerate their AI and cloud adoption. She holds a PhD in Computer Vision from Nanyang Technological University, Singapore.
Uros Lipovsek is machine learning engineer with experience in ML, computer vision, data engineering and devops. He is architecting computer vision pipeline at Sportradar and holds B.S. in Economics with focus on Econometrics from University of Ljubljana.