The Liverpool Thread

Scotty98TR · Feb 21, 2014

Athe~ said:
He gave the player a suspended yellow.

Surely you wouldn't actually show the yellow card to him if you were 'suspending it' (Not entirely sure you can actually do that), and just tell him that one more foul and he is off.

Athe~ · Feb 21, 2014

Scotty98TR said:
Surely you wouldn't actually show the yellow card to him if you were 'suspending it' (Not entirely sure you can actually do that), and just tell him that one more foul and he is off.

Surely you didn't think I was being serious?

Scotty98TR · Feb 21, 2014

Athe~ said:
Surely you didn't think I was being serious?

There is no such thing as sarcasm on the internet. I did realise after I had written that last comment that you were probably joking, but it was too late.

Now I look like a ***.

GodCubed · Feb 23, 2014

Re the stats Howard Webb debate earlier. It's been covered in rather ball-shriveling detail by a bloke on reddit. I grabbed the now badly-formatted text and chucked it in the spoiler below if you don't want to read it on there, if you want to read it at all.

**Wall of text below**

**TL;DR - The stats, for the most part, fail to support the claim that Howard Webb bottles big decisions.**

**Background:** A Telegraph article, "The stats that prove Howard Webb bottles big decisions", claimed that Howard Webb bottles big decisions and that statistics proved this. I wanted to look at whether the statistics actually support the claims made in the article.

**Setup:** I'll start by explaining the methodology for examining the claim that Howard Webb gives more home penalties than the average PL referee. The same methodology was used to examine the other claims in the article. The results and verdict are summarized below, along with a number of caveats.

* Howard Webb gave 23 penalties, of which 15 were to the home team (65%).
* All other PL referees gave 337 penalties, of which 203 were to the home team (60%).

Does Howard Webb have a home bias in giving penalties compared to the average referee?

**Methodology:** Consider the issue of giving penalties to home/away teams. Suppose referees handle that in the following way. They each have a coin that has a certain chance p (between 0 and 100%) of coming up heads, so that if you flipped the coin infinitely many times, the fraction of tosses that came up heads would be very close to p. Every time the referee has to give a penalty, he tosses the coin. If it comes up heads, he gives the penalty to the home team; otherwise he gives it to the away team.

To check if Webb is different from other referees (or from pre-World-Cup Webb himself), we assume that all other PL referees are given coins with the same head probability p. So the question is now the same as asking whether Webb's coin has a different head probability than everyone else's. We don't see the head probability for anyone's coins, but we do see the results of their coin flips and can estimate p from it. In our current example, we see that Webb's coin produced 15 heads (home penalties) when he flipped it 23 times (total penalties). For the other PL referees, when the coin was flipped 337 times, it came up heads 203 times. Since we've assumed all other referees are identical, we can assume that a single referee gave 337 penalties of which 203 were given to the home team. If we imagine a string of H's (heads) and T's (tails) that is 337 characters long, it contained 203 H's (about 60%). This gives us an estimate of the head probability p for the average PL referee to be 60% (p=0.6).

But if we examined a small set of coin flips from this long string, we have no guarantees that it will contain 60% H's. In the extreme case, if we only examined 1 character, it will be either an H (100% H's) or a T (0% H's). Then our estimate of p would be either 1 or 0. In Webb's string of coin tosses, our estimate of p is 65% (15/23), which seems larger than the average PL referee's 60%. However, this was estimated from only 23 coin flips and could therefore be different from the correct value.
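To put a rough number on how noisy an estimate from 23 flips is, here is a minimal sketch using the counts quoted above. It assumes the simple coin model from the post and uses the textbook standard error of a proportion; the variable names are illustrative only.

```python
import math

# Counts quoted in the post.
other_home, other_total = 203, 337   # all other PL referees
webb_home, webb_total = 15, 23       # Howard Webb

p_avg = other_home / other_total     # about 0.60
p_webb = webb_home / webb_total      # about 0.65

# Standard error of a proportion estimated from n flips of a coin with
# head probability p: sqrt(p * (1 - p) / n).
se_23 = math.sqrt(p_avg * (1 - p_avg) / webb_total)

gap = p_webb - p_avg
print(f"gap = {gap:.3f}, standard error over 23 flips = {se_23:.3f}")
```

Under these assumptions the gap between Webb and the average referee is smaller than one standard error, which is why the 65% figure on its own says very little.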

To avoid this problem, we approach the question differently. Suppose that Webb was just an average PL referee. Then his coin would be identical to everyone else's, and you could think that if Webb were allowed to make 337 penalty calls, 203 penalties would be given to the home team. If we had 337 penalty calls Webb had made as an average referee, we could choose a random subset of 23 calls and see how many of them were given to the home team (heads). From our 60% estimate of p, we expect the number of heads to be close to 14 (23*0.6=13.8). If we did this many times, we could plot a histogram of the number of heads we get in 23 tosses and we'd see some variability relative to 14. The histogram (linked on imgur in the original post) shows exactly that with the tossing repeated 100,000 times. In fact, we see that a large share of the time (about 40%), the number of heads is 15 or more (which we think might be a high number of home penalties to give out of 23). So, about 40% of the time, if you asked the average PL referee to make 23 penalty calls, he would make 15 or more home penalty calls. This suggests that Howard Webb's home penalty calls don't indicate any systematic bias towards home teams, just some sampling noise.
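The resampling described above can be sketched in a few lines. This is a rough illustration rather than the original author's code: the function name, the seed, and the smaller default run below are my own choices, and it draws 23 calls without replacement from the pool of 337.

```python
import random

def simulate_tail(trials=100_000, seed=0):
    """Share of resamples in which 23 calls contain 15 or more home penalties."""
    rng = random.Random(seed)
    # Pool of 337 calls by an "average" referee: 203 home (1), 134 away (0).
    calls = [1] * 203 + [0] * 134
    hits = 0
    for _ in range(trials):
        heads = sum(rng.sample(calls, 23))  # random subset of 23 calls
        if heads >= 15:
            hits += 1
    return hits / trials

# Fewer trials than the post's 100,000, just to keep the run short.
estimate = simulate_tail(trials=20_000)
print(f"share of resamples with 15+ home penalties: {estimate:.3f}")
```

The estimate should land in the same region as the roughly 40% quoted in the post, with some run-to-run wobble depending on the seed and the number of trials.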

**Theoretical basis:** The framework used above is called hypothesis testing in statistics. It assumes that we have two hypotheses, a null or default hypothesis and an alternative hypothesis. We wish to reject (or fail to reject) the null hypothesis using observed data. Here, the "null" or default hypothesis is that Howard Webb's coin is the same as the average PL referee's coin. The "alternative" hypothesis that we want to test is that Howard Webb's coin comes down heads more often than the average PL ref's does. The test statistic (or quantity we observe) is the number of heads and tails Howard Webb's and the average PL ref's coin tosses produce.

To decide whether the null hypothesis can be rejected, hypothesis testing proceeds as follows:

* Assume that the null hypothesis is true (i.e., Howard Webb's coin is the same as the avg. referee's, so that if Webb tossed the coin 337 times, it would come up heads 203 times).
* Calculate the chance that data produced under that assumption looks at least as extreme as what was actually observed (i.e., a set of 23 tosses chosen out of the 337 produces 15 or more heads, so that the data indicates an equal or larger home bias).
* If this chance is less than 5%, you claim that the null hypothesis can be rejected. If it is larger than 5%, you fail to reject the null hypothesis. You can make the threshold smaller than 5% if you want to reduce the possibility of rejecting the null hypothesis just by chance.
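The three steps above can be sketched as an exact tail calculation. I am assuming a binomial model here (independent flips with p = 203/337); the post does not say which exact calculation produced its figures, and drawing a subset of the 337 without replacement would strictly be hypergeometric, so the value may differ slightly from the 0.3973578 reported below. The helper name `upper_tail` is hypothetical.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

ALPHA = 0.05                 # rejection threshold from the post
p_null = 203 / 337           # step 1: assume Webb's coin is the average coin

# Step 2: chance of seeing 15 or more home penalties out of 23 under the null.
p_value = upper_tail(15, 23, p_null)

# Step 3: compare with the threshold.
decision = "reject" if p_value < ALPHA else "fail to reject"
print(f"p-value = {p_value:.4f} -> {decision} the null hypothesis")
```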

**Results:**

1. Of 23 penalties Webb gave 15 were to the home team - IS THAT TOO MANY?

Probability that the avg PL referee gives >= 15 home penalties out of 23 = 0.3973578

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS THE AVG PL REF*

2. Of 24 penalties Webb gave 4 were given in the last 15 min - IS THAT TOO FEW?

Probability that the avg PL referee gives <= 4 late penalties out of 24 = 0.2872285

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS THE AVG PL REF*

3. Of 11 penalties Webb gave 1 was crucial - IS THAT TOO FEW?

Probability that the avg referee gives <= 1 crucial penalties out of 11 = 0.002772367

**REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS THE AVG PL REF WRT CRUCIAL PENALTIES**

4. Of 23 penalties Webb gave 15 were to the home team - IS THAT TOO MANY?

Probability that pre-WC Webb gives >= 15 home penalties out of 23 = 0.280075

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS PRE-WC-WEBB*

5. Of 24 penalties Webb gave 4 were given in the last 15 min - IS THAT TOO FEW?

Probability that pre-WC Webb gives <= 4 late penalties out of 24 = 0.3232952

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS PRE-WC-WEBB*

6. Of 11 penalties Webb gave 1 was crucial - IS THAT TOO FEW?

Probability that pre-WC Webb gives <= 1 crucial penalties out of 11 = 0.001276239

**REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS PRE-WC-WEBB WRT CRUCIAL PENALTIES**

7. Of 22 red cards post-WC Webb gave 8 were given to the home team - IS THAT TOO FEW?

Probability that pre-WC Webb gives <= 8 home red cards out of 22 = 0.5278146

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS PRE-WC-WEBB*

Probability that the avg PL referee gives <= 8 home red cards out of 22 = 0.4431766

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS THE AVG PL REF*

8. Of 22 red cards post-WC Webb gave 1 was given with a penalty - IS THAT TOO FEW?

Probability that pre-WC Webb gives <= 1 red cards with penalty out of 22 = 0.1309911

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS PRE-WC-WEBB*

Probability that the avg PL referee gives <= 1 red cards with penalty out of 22 = 0.2079409

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS THE AVG PL REF*

9. Of 65 red cards Webb gave 0 were given early - IS THAT TOO FEW?

Probability that the avg PL referee gives <= 0 red cards early out of 65 = 0.05881382

*FAIL TO REJECT HYPOTHESIS THAT POST-WC-WEBB IS THE SAME AS THE AVG PL REF*

**Verdict:** Of the 11 statistical tests, only 2 reject the null hypothesis that post-WC-Webb is similar to the average PL ref or pre-WC-Webb. Both of them are related to crucial penalties, which are penalties in the last half hour of a match that had the potential to change the result of a match. So there is probably some truth to the claim that Howard Webb avoids making game-changing penalty decisions late in the game, but little to no evidence (based on this data) for any of the other claims.
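As a quick sanity check on the tally, the decision rule can be applied to the eleven probabilities reported above. The short labels are mine; the numbers are copied from the results list.

```python
ALPHA = 0.05  # threshold used in the post

# (label, reported probability) for each of the 11 tests listed above.
reported = [
    ("1 home penalties vs avg ref", 0.3973578),
    ("2 late penalties vs avg ref", 0.2872285),
    ("3 crucial penalties vs avg ref", 0.002772367),
    ("4 home penalties vs pre-WC Webb", 0.280075),
    ("5 late penalties vs pre-WC Webb", 0.3232952),
    ("6 crucial penalties vs pre-WC Webb", 0.001276239),
    ("7 home red cards vs pre-WC Webb", 0.5278146),
    ("7 home red cards vs avg ref", 0.4431766),
    ("8 red cards with penalty vs pre-WC Webb", 0.1309911),
    ("8 red cards with penalty vs avg ref", 0.2079409),
    ("9 early red cards vs avg ref", 0.05881382),
]

rejected = [label for label, p in reported if p < ALPHA]
print(f"{len(rejected)} of {len(reported)} tests reject the null:")
for label in rejected:
    print("  " + label)
```

Note that test 9 sits only just above the 5% line, so it is the one result that would flip with a slightly looser threshold.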

**Caveats:** A number of caveats apply here because I can only access the data in the article and not the underlying source.

* In reality, all other PL refs are not likely to be identical. Some referees probably have p larger than 60% and some smaller.
* The data may contain some outlier refs. The outlier sets might be different for different questions.
* The coin flipping model makes assumptions about the variance of the underlying probability distribution. Real data often has higher variance. In a more general sense, I have not tested how appropriate this model is for these scenarios.
* Hypothesis testing only allows us to reject or fail to reject the null hypothesis. With a 5% threshold, even if Howard Webb was indeed an average PL ref, there is up to a 5% chance that I will reject that hypothesis by looking at his decisions. I have not shown what chance I have (ideally would like it to be 100%) of rejecting the hypothesis of similarity if Howard Webb is in fact different from the average PL ref.
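The last caveat is about what statisticians call power. A minimal sketch of how one might estimate it for the home-penalty test, again assuming a binomial model: the alternative head probability `p_alt = 0.75` is a purely hypothetical value I picked for illustration, not something from the article or the post.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

ALPHA = 0.05
n = 23                # Webb's penalties
p_null = 203 / 337    # average referee's estimated home share
p_alt = 0.75          # HYPOTHETICAL true home share for a biased Webb

# Smallest number of home penalties that would lead to rejection at 5%.
critical = next(k for k in range(n + 1) if upper_tail(k, n, p_null) < ALPHA)

size = upper_tail(critical, n, p_null)   # chance of a false rejection
power = upper_tail(critical, n, p_alt)   # chance of catching the bias

print(f"reject if home penalties >= {critical}")
print(f"false rejection rate = {size:.3f}, power at p = {p_alt}: {power:.3f}")
```

With only 23 penalties, the power under this hypothetical alternative comes out well short of 100%, which is the point of the caveat: failing to reject does not show that Webb is unbiased.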

Long story short, the statistics are nowhere near as watertight as they were presented. I think the analysis in that post is slightly flawed too, but far less so than the Telegraph's original article.

TJD07 · Feb 23, 2014

GodCubed said:
Re the stats Howard Webb debate earlier. It's been covered in rather ball-shriveling detail by a bloke on reddit. I grabbed the now badly-formatted text and chucked it in the spoiler below if you don't want to read it on there, if you want to read it at all.

That has to be the most painfully detailed and long-winded way of saying his sample size is too small to compare accurately to the average among all refs but is within a reasonable margin.

iamauser · Feb 23, 2014

What a quality ball from Sterling. He is getting better and better with every game

Tezzz · Feb 23, 2014

Well then.

TJD07 · Feb 23, 2014

Allen made a massive difference when he came on.

Henderson <3

GodCubed · Feb 23, 2014

TJD07 said:
That has to be the most painfully detailed and long-winded way of saying his sample size is too small to compare accurately to the average among all refs but is within a reasonable margin.

Yeah, pretty much! Still, he showed his working, so extra credits to him.

TJD07 said:
Allen made a massive difference when he came on.

Henderson <3

Henderson's been quality this season, but today was a whole new level.

Athe~ · Feb 24, 2014

I remember Henderson when he first joined and he was a massive disappointment, but he's really stepped up his game.

Tezzz · Feb 24, 2014

Yeah, Henderson was shocking when he joined, similar to Lucas. Both have become brilliant players. Shame about Leiva's injuries lately though. Already decided that I'm getting Henderson on the back of my shirt next season.

Mike. · Feb 24, 2014

Tezzz said:
Yeah, Henderson was shocking when he joined, similar to Lucas. Both have become brilliant players. Shame about Leiva's injuries lately though. Already decided that I'm getting Henderson on the back of my shirt next season.

Henderson has come a long long way. I thought Liverpool needed to back him when he arrived, but I honestly didn't think he'd be this good. Must be on the plane to Brazil, ditto Sterling.

On the other hand, your defending was awful. Midfield-wise, this is why Gerrard cannot be the "true holder". Harsh on Henderson, but Carrick would have to start for England (because I assume RH will play Gerrard every time).

TJD07 · Feb 24, 2014

Currently scored the most headed goals in the league, with the least amount of crosses in the league <)

Tezzz · Feb 24, 2014

DS MONTAGE - Video Dailymotion

GodCubed · Feb 24, 2014

Mike. said:
Henderson has come a long long way. I thought Liverpool needed to back him when he arrived, but I honestly didn't think he'd be this good. Must be on the plane to Brazil, ditto Sterling.

On the other hand, your defending was awful. Midfield-wise, this is why Gerrard cannot be the "true holder". Harsh on Henderson, but Carrick would have to start for England (because I assume RH will play Gerrard every time).

On the other hand, Gerrard's adapted wonderfully. He's not a true holder, but he's definitely adapted to the role enough to be a 'half-holder' or a distributor alongside someone who actually is defensive minded like Carrick. He's surprised me with how good and sensible he's been.

Mike. · Feb 24, 2014

GodCubed said:
On the other hand, Gerrard's adapted wonderfully. He's not a true holder, but he's definitely adapted to the role enough to be a 'half-holder' or a distributor alongside someone who actually is defensive minded like Carrick. He's surprised me with how good and sensible he's been.

Oh absolutely, he's been class. A Carrick - Gerrard - Henderson 4-3-3 could be in order against Italy.

Ride The Walrus · Feb 24, 2014

Johnson was a bit better than how he was before his injury, so that's promising. Him and Flanno, Sakho are definite starters right now. I'd imagine Skrtel is too, as much as I don't want him to be

TheFalse9 · Feb 25, 2014

Ride The Walrus said:
Johnson was a bit better than how he was before his injury, so that's promising. Him and Flanno, Sakho are definite starters right now. I'd imagine Skrtel is too, as much as I don't want him to be

I'd personally prefer Agger to Skrtel, but then again, I'm not Brendan Rodgers.

TJD07 · Feb 25, 2014

TheFalse9 said:
I'd personally prefer Agger to Skrtel, but then again, I'm not Brendan Rodgers.

Agger tends to get bullied easily by big forwards. I'd take Sakho over Agger, which is why I'd expect to see one of Agger/Skrtel leave.

TheFalse9 · Feb 25, 2014

TJD07 said:
Agger tends to get bullied easily by big forwards. I'd take Sakho over Agger, which is why I'd expect to see one of Agger/Skrtel leave.

Agger - Sakho could be a good partnership, though, since it would mean we have a passer and a big guy at centre-back. Sakho could cover the big forwards, Agger can sweep up behind him and distribute the ball once we're in possession.
