Ever wondered how Las Vegas and online sportsbooks come up with their betting lines for each game? Or how sports analytics websites figure out their win-total forecasts? Or even how professional sports teams know which players will most help them win?
A lot of different factors play into each inquiry. Many of them are basic. Sportsbooks, analysts, and teams use surface-level numbers—public data we all have access to.
Most of them, though, take it one step further by turning that information into predictive models. This is how they come up with betting lines: money lines, spreads, the over/under, win totals, and even odds on which teams or, in individual sports, which players have the best shot at winning the title.
Pretty much all of these predictive models are proprietary, which means that we don’t have unfettered access to how they are built or implemented. Vegas isn’t about to reveal the most profitable part of its enterprise.
Still, we have a general idea of what goes into computer-generated sports predictions.
For the most part, these models combine AI and machine learning with public football datasets to deliver their predictions. The data used includes historical results of games, results from more recent games, specific player performances, opposing-player performances, injury information, advanced team data (think: stats about the entire team rather than the player), location data (home/away – remember there is a home team advantage to factor into things too!) and other relevant factors along these lines.
How much a computer weights each variable in the model depends on the algorithm that’s been programmed.
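To make the idea of weighting concrete, here is a minimal sketch in Python. It assumes a simple logistic model, and the feature names and weight values are made up purely for illustration. Real sportsbook models are proprietary and almost certainly far more involved than this.

```python
import math

# Hypothetical weights: illustrative values only, not taken from any real model.
WEIGHTS = {
    "recent_form_diff": 0.8,  # recent results, home team minus away team
    "historical_diff": 0.3,   # long-run results, home team minus away team
    "injury_diff": -0.5,      # key players missing, home team minus away team
    "home_advantage": 0.4,    # set to 1.0 when scoring the home team
}

def home_win_probability(features, weights=WEIGHTS):
    """Combine weighted features into a probability using the logistic function."""
    # Features that are not supplied count as 0.0 (no edge either way).
    score = sum(weights[name] * features.get(name, 0.0) for name in weights)
    return 1.0 / (1.0 + math.exp(-score))

# Two otherwise evenly matched teams: only home advantage separates them.
evenly_matched = {"home_advantage": 1.0}
print(round(home_win_probability(evenly_matched), 3))
```

Change the numbers in WEIGHTS and the same inputs produce a different probability, which is one way to picture why two models fed the same data can disagree.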
“Different stats mean different things to different algorithms and model developers. That’s why so many computer models vary in predictions from one another. They’re rarely the same or even remotely identical, due to following different logic” – Daniel Gaillard of football predictions website Forecastr.com.
This is all complicated enough, but it gets trickier. The goal of these computer models is to use machine learning to predict outcomes. No, this does not mean the computer is alive. Instead, machine learning refers to a computer recognizing trends and patterns in all the data fed into its interface so that it can adjust its calculations and predictions on the fly.
Whenever you see a team’s game odds line shift or a championship future move, machine learning is at play. It also factors into the initial predictions.
Prediction services are simply computers relying on historical data they have stored, from decades and centuries past, to predict what will happen tomorrow based on trends and patterns they’ve observed in those historical results.
Now, let’s take a deeper look into how these algorithms actually work.
Although these algorithms have been around for quite some time, they are still in fairly early development. Even so, they are already making a difference in how data is gathered, used, and translated into odds.
With all that said, their effectiveness is a double-edged sword. These algorithms can be quite accurate overall and still be quite wrong about an individual game.
This all has to do with the betting side of predicting results. The algorithm will give one team a higher probability of winning, and most games do go the favorite’s way, but that’s not necessarily how a given game plays out on the day.
Sometimes these mathematical algorithms are simply wrong. A team can be much stronger than its opponent going into a particular game, only to lose spectacularly in a magnificent underdog win.
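A quick simulation shows why a favorite losing does not by itself mean the numbers were bad. This is only a sketch: the 70% win probability is an assumed figure, and each game is treated as an independent coin flip, which real seasons are not.

```python
import random

def upset_rate(favorite_win_prob, n_games=10_000, seed=0):
    """Simulate n_games and return the share in which the favorite loses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    upsets = sum(1 for _ in range(n_games) if rng.random() >= favorite_win_prob)
    return upsets / n_games

# Assumed example: the model gives the favorite a 70% chance in every game.
print(upset_rate(0.70))
```

Even when the 70% figure is exactly right, the underdog still wins roughly three games in ten, so a single upset tells us very little about whether the model was wrong.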
The major flaw of these algorithms, then, is that they cannot predict the human factor that goes into every game.
For example, two teams could be neck and neck in the fight for their league title. One of them loses its next match, giving the other a significant chance of overtaking it. If the other team wins its next match, it gains a significant mental advantage as well as a points advantage.
So, when the next game week comes, the team that lost the previous match could also lose the next one because of that mental block. Mathematically, though, it will still be the favorite to win. And this is where the algorithm gets it wrong: it cannot predict the human factor.
As we said, this is only one flaw of these algorithms. So what are the others?
Well, we also said that these algorithms take into account various other factors such as home advantage, but they don’t take the entire picture into account.
For example, if an NFL player has suffered an injury and has since recovered, computers and humans process that information very differently.
Let’s say this player is a star for his team and has suffered an injury that kept him out of the game for a whole month. You might think it’s great that he’s coming back after a month, but in reality it takes time for a player to get back up to full speed.
So, the algorithm will calculate that the player is back, but it won’t take into account the full rehabilitation process. The player might be fit, but maybe he can’t play the full game. A human observer would still consider the team handicapped, but the computer doesn’t.
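The gap between those two views can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the four-game ramp, and the 60% starting level are assumptions chosen for illustration, not figures from any real model or medical source.

```python
def binary_availability(games_since_return):
    """The naive view: the player is either out (0.0) or fully back (1.0)."""
    return 0.0 if games_since_return < 0 else 1.0

def ramped_availability(games_since_return, ramp_games=4, floor=0.6):
    """A gradual view: the player returns at `floor` and builds up to 1.0.

    games_since_return is 0 for the first game back and negative while injured.
    ramp_games and floor are made-up illustrative values.
    """
    if games_since_return < 0:
        return 0.0
    progress = min(games_since_return, ramp_games) / ramp_games
    return floor + (1.0 - floor) * progress

# Compare the two views over the first games after the player returns.
for game in range(5):
    print(game, binary_availability(game), round(ramped_availability(game), 2))
```

In the first game back, the naive version already counts the star as a full contributor, while the ramped version treats him as well short of his best. That difference is roughly what a human observer is accounting for.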
So, How Are Algorithms Being Applied?
Okay, so onto the important stuff: how are they actually being applied? Well, European football (soccer) is widely considered to be the most easily predictable sport out there if we take the odds into account.
Each sports algorithm maker programs the AI to “watch” and absorb information from a large number of past games. The AI then has a general idea of how each team performs, how each player performs, and how each team performs against its competitors.
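One simple, publicly known way to learn team strength from past results is an Elo-style rating, sketched below. To be clear, this is a stand-in technique for illustration and not a description of what any sportsbook actually runs. The team names, the results, the starting rating of 1500, and the K-factor of 20 are all made up.

```python
def expected_score(rating_a, rating_b):
    """Elo expected score for side A against side B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_ratings(ratings, home, away, home_result, k=20):
    """Adjust both ratings in place after one game.

    home_result is 1.0 for a home win, 0.5 for a draw, 0.0 for a home loss.
    """
    delta = k * (home_result - expected_score(ratings[home], ratings[away]))
    ratings[home] += delta
    ratings[away] -= delta

# Made-up past results: (home team, away team, home result).
past_games = [
    ("Team A", "Team B", 1.0),
    ("Team B", "Team C", 0.5),
    ("Team C", "Team A", 0.0),
]

ratings = {"Team A": 1500.0, "Team B": 1500.0, "Team C": 1500.0}
for home, away, result in past_games:
    update_ratings(ratings, home, away, result)

print({team: round(rating, 1) for team, rating in ratings.items()})
print(round(expected_score(ratings["Team A"], ratings["Team C"]), 3))
```

Each result nudges the ratings a little, so after enough games the numbers summarize how each team has generally performed. That is the same basic idea as the AI absorbing past games, only on a far smaller scale.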
As you can imagine, this is incredibly complicated stuff that’s hard to even comprehend. But, as we said, it’s all done through AI and machine learning. A computer is far faster at calculation than the human brain, and it is exceptional at analyzing tremendous amounts of data and determining the odds for each sports team.