Brownian Notion: February 2019

Saturday, 9 February 2019

Called it right!

Liverpool 3, Bournemouth 0! Well done boys!

It's a little bit unfair when forecasters take credit for "getting it right" when they make a probability-based forecast. I'm not sure all my odds were right, but the maximum likelihood option did come in!

Liverpool v Bournemouth Today: Analytics and Form Heatmaps

I have been running some analysis on today's games. Here are some numbers for one of the crucial ones - Liverpool v Bournemouth. I've developed a "heatmap" to show the form of both teams!

Here is a heatmap for Liverpool. Attacking form increases left to right, while defensive form increases up the page. Believe me, this shows phenomenal form!

Compare this with Bournemouth. They are clearly less good!!

So, here is my odds estimate! Rows give Liverpool's goals, columns give Bournemouth's.

      x:0      x:1      x:2      x:3      x:4      x:5      x:6      x:7      x:8      x:9
0:y   57/1     90/1     367/1    1723/1   8332/1   49999/1  ---      ---      ---      ---
1:y   15/1     28/1     90/1     426/1    2272/1   16666/1  49999/1  ---      ---      ---
2:y   9/1      16/1     58/1     234/1    1999/1   49999/1  ---      ---      ---      ---
3:y   8/1      13/1     45/1     229/1    1922/1   16666/1  49999/1  ---      ---      ---
4:y   9/1      15/1     58/1     242/1    2499/1   16666/1  ---      ---      ---      ---
5:y   13/1     22/1     78/1     367/1    2173/1   16666/1  ---      ---      ---      ---
6:y   22/1     39/1     123/1    609/1    4999/1   24999/1  ---      ---      ---      ---
7:y   47/1     75/1     235/1    1281/1   8332/1   ---      ---      ---      ---      ---
8:y   105/1    179/1    514/1    7142/1   24999/1  ---      ---      ---      ---      ---
9:y   164/1    275/1    745/1    4999/1   49999/1  49999/1  ---      ---      ---      ---

summary: Home win, 1/8; Away win, 29/1; Score draw 17/1; No score draw, 57/1.

Looks like the most likely result (by a whisker) is 3 - nil to Liverpool.
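As a quick sanity check on summary odds like these, fractional odds of a/b against correspond to an implied probability of b/(a+b). The sketch below converts the four summary odds quoted above and adds them up. The helper name `implied_probability` is made up for illustration. Since the four outcomes cover every possible result, the total should land close to 1, and any small gap is presumably down to rounding in the quoted odds.

```python
from fractions import Fraction


def implied_probability(odds: str) -> Fraction:
    """Convert fractional odds 'a/b' (a to b against) into a probability."""
    against, stake = (int(part) for part in odds.split("/"))
    return Fraction(stake, against + stake)


# Summary odds quoted above for Liverpool v Bournemouth.
summary = {
    "Home win": "1/8",
    "Away win": "29/1",
    "Score draw": "17/1",
    "No score draw": "57/1",
}

probabilities = {name: implied_probability(o) for name, o in summary.items()}
total = sum(probabilities.values())

for name, p in probabilities.items():
    print(f"{name}: {float(p):.3f}")
print(f"Total: {float(total):.3f}")
```

The four probabilities add up to about 0.995, so the summary odds are at least internally consistent.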

A reminder, this is a new method and still needs to be validated, so use with caution.

Wednesday, 6 February 2019

Everton v Man City Tonight!

So, here is my first odds forecast, for tonight's prem game...

Here is a grid of the odds for all score combinations up to 9 apiece (rows are Everton's goals, columns are Man City's)...

      x:0     x:1     x:2     x:3     x:4     x:5     x:6     x:7     x:8     x:9
0:y   19/1    8/1     10/1    13/1    32/1    74/1    207/1   768/1   2499/1  9999/1
1:y   17/1    9/1     9/1     15/1    28/1    70/1    226/1   832/1   1999/1  9999/1
2:y   42/1    19/1    19/1    31/1    68/1    146/1   285/1   1666/1  -       9999/1
3:y   151/1   59/1    61/1    92/1    166/1   369/1   1110/1  -       9999/1  -
4:y   344/1   262/1   178/1   434/1   999/1   1999/1  3332/1  4999/1  -       -
5:y   2499/1  1110/1  908/1   1666/1  3332/1  4999/1  -       -       -       -
6:y   4999/1  4999/1  -       9999/1  -       -       -       -       -       -
7:y   9999/1  -       -       -       -       -       -       -       -       -
8:y   -       -       -       -       -       -       -       -       -       -
9:y   -       -       -       -       -       -       -       -       -       -

And here are some summary odds:

Home win, 44/10; Away win, 1/2; Score draw, 5/1; No score draw, 19/1.

These are based on 10,000 modeled games and 5000 particles per team. Interested to know if there are other odds people would be interested in.
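For anyone curious how a grid of odds can fall out of a pile of simulated games, here is a minimal sketch. It assumes the odds for a scoreline are simply (games without that score) divided by (games with that score), rounded to the nearest whole number. That convention is a guess rather than the actual code behind the grid, though it is consistent with it: a scoreline seen once in 10,000 games comes out at 9999/1, and one that never occurs gets a dash. The function names and the tiny sample are made up for illustration.

```python
from collections import Counter


def odds_against(count: int, n_games: int) -> str:
    """Express a scoreline seen `count` times in `n_games` as rounded 'x/1' odds."""
    if count == 0:
        return "-"  # never observed, so no odds can be quoted
    return f"{round((n_games - count) / count)}/1"


def odds_grid(scorelines, max_goals=9):
    """Build a (home goals) x (away goals) grid of odds from simulated scorelines."""
    counts = Counter(scorelines)
    n_games = len(scorelines)
    return [
        [odds_against(counts[(home, away)], n_games) for away in range(max_goals + 1)]
        for home in range(max_goals + 1)
    ]


# Tiny invented sample of (home, away) results, just to show the shapes involved.
sample = [(0, 1)] * 5 + [(0, 0)] * 2 + [(1, 1)] * 2 + [(2, 0)]
grid = odds_grid(sample)
print("0-1:", grid[0][1])
print("3-3:", grid[3][3])
```

With only 10,000 games, the long-odds corners of the grid rest on one or two simulated results each, so they are much noisier than the short-odds cells.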

I compared these with Bet365's odds, which are generally fairly similar. My numbers seem to like the idea of a home win slightly more than Bet365's do. 1-0 to Everton looks like a value bet (although the odds are long).
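To spell out what "value bet" means in numbers: a bet has value when the bookmaker's odds are longer than the model's, so the expected profit is positive if the model's probability is right. The sketch below uses the grid's 17/1 for Everton 1, Man City 0, and compares it with a purely hypothetical bookmaker price of 20/1 (not the actual Bet365 quote, which isn't recorded here). The function names are made up for illustration.

```python
from fractions import Fraction


def parse_odds(odds: str) -> Fraction:
    """Turn fractional odds 'a/b' into profit per unit staked if the bet wins."""
    against, stake = odds.split("/")
    return Fraction(int(against), int(stake))


def expected_profit(model_odds: str, bookie_odds: str) -> Fraction:
    """Expected profit per unit staked, trusting the model's probability."""
    p_win = 1 / (1 + parse_odds(model_odds))
    return p_win * parse_odds(bookie_odds) - (1 - p_win)


model = "17/1"   # the grid's odds for Everton 1, Man City 0
bookie = "20/1"  # hypothetical bookmaker price, for illustration only
edge = expected_profit(model, bookie)
print(f"Expected profit per unit staked: {float(edge):+.3f}")
```

If the bookmaker's price equals the model's, the expected profit is exactly zero. The catch is that the edge is only as good as the model's probability.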

Health warning: I am still validating this model, although I believe the approach is generally solid!

***Update*** The game finished 0-2 (a win for Man City). It stood at 0-1 until injury time, which would have tallied with my most likely result.

Sunday, 3 February 2019

Back to Bayes-ics

As explained in the last post, to do our football analytics we need some input parameters describing how "good" the two teams facing each other in a match are likely to be on the day. There are two alternative approaches to doing this. One is based on classical statistics. To follow this approach, you look back over a load of matches and work out an average scoring rate and an average rate of conceding. You can also estimate, on average, how much better the team performs at home. This approach has some weaknesses, though. A team can get better or worse over the season, so an average is no good at telling you how good the team will be today. It also requires quite a lot of data (a lot of matches) and assumes that every team's form is stable over time. In other words, it makes some assumptions that are not true. Which is never good.
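To make the classical approach concrete, here is a minimal sketch using invented results for a hypothetical team (these are not real fixtures). It averages goals scored and conceded over the whole sample, and takes the ratio of home to away scoring as a crude measure of home advantage. That ratio is just one possible definition, picked here for simplicity.

```python
from statistics import mean

# Invented results for a hypothetical team: (venue, goals_for, goals_against).
results = [
    ("home", 2, 0),
    ("away", 1, 1),
    ("home", 3, 1),
    ("away", 0, 2),
    ("home", 1, 0),
    ("away", 2, 2),
]


def average_rates(matches):
    """Return (goals scored per game, goals conceded per game)."""
    scored = mean(goals_for for _, goals_for, _ in matches)
    conceded = mean(goals_against for _, _, goals_against in matches)
    return scored, conceded


home = [m for m in results if m[0] == "home"]
away = [m for m in results if m[0] == "away"]

scored, conceded = average_rates(results)
home_scored, _ = average_rates(home)
away_scored, _ = average_rates(away)

# Crude home advantage: ratio of home scoring rate to away scoring rate.
home_advantage = home_scored / away_scored

print(f"scored/game={scored:.2f}, conceded/game={conceded:.2f}, "
      f"home advantage x{home_advantage:.2f}")
```

Every match carries the same weight here no matter how long ago it was played, which is exactly the stable-form assumption criticised above.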
A much better approach is to use Bayesian statistics. Thomas Bayes was a statistician with a keen interest in games of chance. Hence his work is very relevant to all sorts of gambling! The formulas he gave us are all about inferring the underlying truth from a series of observations. Each observation modifies our belief in a given hypothesis. To cut a long story short, Bayesian inference crops up everywhere in modern analytics.
The particular method I am deploying for football match analysis is the particle filter, a modern development based entirely on Bayesian inference. You can find a pretty good intro to particle filters in this slide deck. Note the reference to football results analysis on slide 24... Using a PF for football analysis is a nice party trick that often crops up in tutorial material, although I do it in a slightly more sophisticated way than the standard approach.
Applying a particle filter to the English Premier League works like this:

Each team is represented by a large number of "particles", each of which is a guess at the "model" - i.e. the qualities of the team (its attacking strength, defensive strength etc.)

Between fixtures, we "advance" these models, saying in effect: last week the team was like this, so how might it have moved on this week?
After a fixture, we "filter" the particles, preferentially keeping those that best explain the result. Incidentally, this is where Bayes comes in. His theorem says that instead of asking the hard question, "how good is my particle (model), given the result", we can ask "how likely is my result, given my model". This turns out to be an easier question and one we can answer. Importantly we consider not just the result but the capabilities of the teams involved. Hence all the analysis is interconnected.
Now, when two teams face off, we have a set of guesses about the teams' capabilities at the present time that is based on all previous results, especially the last result. We can model the game considering the full range of guesses and get the best possible odds prediction, given the evidence.
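The steps above can be sketched in a few dozen lines. This is a bare-bones illustration, not the filter behind the forecasts on this blog. It assumes each particle is just an (attack, defence) pair, that form drifts as a Gaussian random walk, that a team's expected goals are its attack times the opponent's defence multiplier, and that goals follow a binomial distribution with one scoring chance per minute. All function names, parameter values and the three results fed in are made up.

```python
import math
import random

N_PARTICLES = 500  # far fewer than 5000 per team, to keep the demo fast
TRIALS = 90        # assumption: one scoring chance per minute of play


def new_particles(n=N_PARTICLES):
    """Each particle is one guess at a team: (attack, defence) multipliers.
    A defence multiplier below 1 means the team concedes less than average."""
    return [(random.uniform(0.5, 2.5), random.uniform(0.5, 1.5)) for _ in range(n)]


def advance(particles, drift=0.05):
    """Between fixtures, let every guess wander a little (random walk)."""
    return [(max(0.05, a + random.gauss(0, drift)),
             max(0.05, d + random.gauss(0, drift))) for a, d in particles]


def goal_likelihood(goals, rate):
    """Binomial probability of `goals` from TRIALS chances averaging `rate` goals."""
    p = min(rate / TRIALS, 1.0)
    return math.comb(TRIALS, goals) * p**goals * (1 - p)**(TRIALS - goals)


def result_likelihood(home, away, home_goals, away_goals):
    """How likely is the result, given one guess at each team?"""
    home_rate = home[0] * away[1]  # home attack against away defence
    away_rate = away[0] * home[1]  # away attack against home defence
    return goal_likelihood(home_goals, home_rate) * goal_likelihood(away_goals, away_rate)


def filter_fixture(home_particles, away_particles, home_goals, away_goals):
    """Pair up guesses at random, weight each pair by how well it explains
    the result, then resample so the better pairs survive more often."""
    shuffled_away = random.sample(away_particles, len(away_particles))
    pairs = list(zip(home_particles, shuffled_away))
    weights = [result_likelihood(h, a, home_goals, away_goals) for h, a in pairs]
    kept = random.choices(pairs, weights=weights, k=len(pairs))
    return [h for h, _ in kept], [a for _, a in kept]


random.seed(1)
home, away = new_particles(), new_particles()
for home_goals, away_goals in [(3, 0), (2, 0), (4, 1)]:  # invented results
    home, away = advance(home), advance(away)
    home, away = filter_fixture(home, away, home_goals, away_goals)

mean_home_attack = sum(a for a, _ in home) / len(home)
mean_away_attack = sum(a for a, _ in away) / len(away)
print(f"home attack ~ {mean_home_attack:.2f}, away attack ~ {mean_away_attack:.2f}")
```

Because each surviving home particle is judged against an away particle, a result moves the estimates for both teams at once, which is the interconnection mentioned above. Pairing particles at random is the simplest way to get that effect; averaging each particle's likelihood over many opponent particles would be a less noisy alternative.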

In a nutshell, that's it. My plan now is to publish some predictions before the weekend fixtures and try to ascertain if we can beat the bookies! That's my goal. Bookies are there to be beaten after all.

Saturday, 2 February 2019

Monte Carlo Football Analytics - Project Monaco

I'm going to christen this effort Project Monaco. It's about Monte Carlo and football, so it's a no-brainer, right?

I'll explain the basics of the analysis...

Consider two football teams, facing each other. The result is based on a few things. How good is the home team at attacking? Conversely, how good is the away team at defending? These two things will help determine an average rate of goal scoring for the home team. There is another factor in there - the home team advantage. Some teams perform better at home than away. Others do not, and some teams even do a little better away from home (home fans can be off-putting if they're not getting behind the team). On the flip side, how good is the away side at attacking, and how well can the home team defend?
In my method, I put these numbers into a pot, and work out an average expected rate of goal scoring for each team, in the context of them playing each other. I then create a computer model of the fixture, and run about ten thousand trial games, recording the result of each. What I get is a comprehensive odds forecast, covering every score permutation.
The computer model uses a classic statistical method to model the results of each trial game: the binomial distribution. It's hard to argue with the basics of this. The one hard part we are left with is establishing the input data for the match, regarding the strengths and weaknesses of each team. The truth is, we can't be certain about them, and that leads to some complications. What we need to do is consider a range of possibilities regarding this input data, which makes things a little more complicated. I will explain how we work out these inputs based on the league results in the next post!
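Here is a minimal sketch of the trial-game idea, under some loud assumptions: the two expected goal rates are simply given (the numbers below are hypothetical), and each team's goals are drawn from a binomial distribution by treating every minute of the match as one independent chance to score. The function names are made up, and the real model's details may well differ.

```python
import random
from collections import Counter

TRIALS = 90  # assumption: treat each minute as one chance to score


def simulate_goals(expected_goals: float) -> int:
    """Draw one team's goals from a binomial with mean `expected_goals`."""
    p = expected_goals / TRIALS
    return sum(random.random() < p for _ in range(TRIALS))


def simulate_fixture(home_rate: float, away_rate: float, n_games: int = 10_000) -> Counter:
    """Play `n_games` trial games and count how often each scoreline occurs."""
    return Counter(
        (simulate_goals(home_rate), simulate_goals(away_rate)) for _ in range(n_games)
    )


random.seed(42)
# Hypothetical expected goal rates for the home and away team.
scores = simulate_fixture(home_rate=2.4, away_rate=0.6)

n = sum(scores.values())
home_win = sum(c for (h, a), c in scores.items() if h > a) / n
away_win = sum(c for (h, a), c in scores.items() if h < a) / n
draw = sum(c for (h, a), c in scores.items() if h == a) / n

print("Most common scorelines:", scores.most_common(3))
print(f"home win {home_win:.3f}, draw {draw:.3f}, away win {away_win:.3f}")
```

The counter of scorelines is all that's needed to quote odds for any outcome: a single score, a win, a draw, or anything else that can be read off the final result.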

Context Switch - Monte Carlo Football Analytics

It's been a year since my last post on Monte Carlo modelling of horse races. The reason for the gap is that I realised, through quite a bit of experimentation, that my methods for predicting the "true odds" of a horse race really weren't fit for purpose. A lot of the time, my odds predictions were quite similar to those from the bookies. But the cases where I predicted a horse should have much shorter odds (i.e. the "value bets") did not win as often as I expected. Essentially there was very little profit to be had. None, in fact, beyond random outbreaks of good luck.
I concluded that, for horse racing, there was a lot of information out there that helps understand how well a horse is likely to perform. Maybe a lot of it is on "back channels" known only to the inner racing community. But, I concluded, at a minimum, you really needed to look back at a horse's history and see how it fared against each horse it raced, and to know how good each of these other horses was. That, I concluded, was too much like hard work (at least for now). Too much bespoke web scraping to be written for one thing, and life is too short. What I did do is dream up an analysis method that could genuinely work. It's based on established methods, although it has elements that I don't believe have been tried before. But I decided it would be much easier to operate this on a more restricted field of runners. Like a football league, where a small set of clubs face each other in a very well defined, exhaustive set of fixtures. The Premier League will be my case study!
The good news is, having tried this, I KNOW I have a method that is at least pretty cool. As I planned to do with the horse racing, I will provide more details about the method, and some of my odds predictions for upcoming games. More posts to follow!
