xPoints

Calculate the expected points yourself with the worldfootballR package
Published

October 18, 2022

INTRO

Now that expected goals get more common and accepted, more people are realising that it is mostly for rating a teams performance over more matches. Because some matches will skew the xG of a team, we can use expected points to see how well they perform against there underlying numbers. To read more about expected points an how we calculate it, you can read this article from me. In short: we look at the size of the chances to see how much points a team deserved in a match instead of only looking at the total xG.

LOADING THE PACKAGES

Make sure you have the newest version of worldfootballR.

# devtools::install_github("JaseZiv/worldfootballR")
library(tidyverse)
library(worldfootballR)

GETTING THE MATCH DATA

As we are simulating every shot in every match multiple times, we need to load all the shots from the competition we want to use.

#get match id's
league_matches <- fotmob_get_league_matches(
  country =     "ENG",
  league_name = "Championship"
) 

#filter out matches in the future
league_matches <- league_matches %>%
  filter(league_matches$status$reason$short == "FT") 

#get matches for those id's
match_details <- fotmob_get_match_details(league_matches$id)
# unnest the shots only needed in older versions of worldfootballR so commented it out) 
# and add team name of team shooting and team conceding (
shots_temp <- match_details %>%
  #unnest(shots) %>%
  mutate(team_name = case_when(team_id == home_team_id ~ home_team,
                              team_id == away_team_id ~ away_team),
         opponent = case_when(team_id == home_team_id ~ away_team,
                              team_id == away_team_id ~ home_team))

If you want all the leagues and their id’s/names/countries, read the csv from github

leagues <- read.csv("https://raw.githubusercontent.com/JaseZiv/worldfootballR_data/master/raw-data/fotmob-leagues/all_leagues.csv")
head(leagues)
  ccode       country   id                 name
1   INT International   42     Champions League
2   INT International   73        Europa League
3   INT International  525 AFC Champions League
4   INT International 9469              AFC Cup
5   INT International 9841              AFC U19
6   INT International 9265       AFF Suzuki Cup
                                    page_url
1      /leagues/42/overview/champions-league
2         /leagues/73/overview/europa-league
3 /leagues/525/overview/afc-champions-league
4             /leagues/9469/overview/afc-cup
5             /leagues/9841/overview/afc-u19
6      /leagues/9265/overview/aff-suzuki-cup

View the data frame and search in it!

TABLE WITH XG PER TEAM

First we make a table with the xG for and against of every team. This is also possible through worldfootballR, but the function doesn’t work at the moment.

Function that currently doesn’t work
ere_team_xg_2022 <- fotmob_get_season_stats(
  league_id = 48, #48 is the code for the Championship
  season_name = "2022/2023",
  stat_name = "Expected goals",
  team_or_player = "team"
)
xG_table <- shots_temp %>%
  mutate(expected_goals = replace_na(expected_goals,0)) %>% #replace the NA for own goals to 0
  group_by(team_name) %>%
  summarise(xGF = sum(expected_goals)) %>% #xG for
  left_join(shots_temp %>% #join with xG against
  mutate(expected_goals = replace_na(expected_goals,0)) %>%
  group_by(opponent) %>%
  summarise(xGA = sum(expected_goals)),by=c("team_name" = "opponent"))
head(xG_table)
# A tibble: 6 × 3
  team_name          xGF   xGA
  <chr>            <dbl> <dbl>
1 Birmingham City   15.2  17.6
2 Blackburn Rovers  14.4  18.2
3 Blackpool         16.4  21.9
4 Bristol City      18.6  21.8
5 Burnley           17.5  14.0
6 Cardiff City      13.7  13.9

Nice, so we have the xG of the teams. Note that you can also filter shots_temp to exclude penalties. I keep them in as I want to make a table with the complete xG and xPoints picture.

Now for the hard part, calculating the xPoints per match. To calculate it, we will build a function to simulate every shot in the match. We do that ten thousand times and count the times every team wins/draws/loses. To get some more info, read the calculation in my old article about it.

CALCULATE XPOINTS

So first we make the function to simulate all the shots. It is an old function of mine that contains other functions as well. The easiest is to copy all the functions and just paste them in a new rscript.

the functions
# main function
calculateChance<-function(team1,team2,p){

  home = 0
  away = 0
  draw = 0
  homeP = 0
  awayP = 0
  drawP = 0
  
  for(i in 1:p){
    matchWinner <- calculateWinner(team1,team2)
    
    if(matchWinner == "home"){
      home <- home+1
      homeP <- homeP+3
    }else if(matchWinner == "away"){
      
      away <- away+1
      awayP <- awayP+3
    }else{
      draw <- draw +1
      awayP <- awayP+1
      homeP <- homeP+1
    }
  }
  
  home = paste0(home/(p/100),"%")
  away = paste0(away/(p/100),"%")
  draw = paste0(draw/(p/100),"%")
  homeP = homeP/p
  awayP = awayP/p
  
  chances <- paste0("Home win: ",home,"% | Draw: ",draw,"% | Away win: ",away,"%")
  game <- data.frame(home,draw,away,homeP,awayP)
  return(game)
}

# function that returns if a shot becomes a goal and counts the goals
testShots<-function(XG){
  Goals = 0
  XG[is.na(XG)] <- 0
  for(i in 1:length(XG)){
    if(runif(1, 0.0, 1.0)<=XG[i]){
      
      Goals <- Goals + 1
    }else{
      
    }
  }
  
  return(Goals)
}  

# function that calculates the winner by comparing the number of goals of the two teams
calculateWinner <- function(home,away){
  HomeGoals = 0
  AwayGoals = 0
  
  HomeGoals <- testShots(home)
  AwayGoals <- testShots(away)
  
  #diffTemp <- (HomeGoals - AwayGoals)
  
  #diff <- append(diff,diffTemp)
  if(HomeGoals > AwayGoals){
    
    return("home")
  }else if(AwayGoals > HomeGoals){
    
    return("away")
  }else{
    
    return("draw")
  }
}

To use this function on the data frame of all the shots we are going to wrap it in a function so we can use it easier. The 10000 is the times we simulate every match. You can put it higher, but you’ll have to wait longer for the results.

plot_func <- function(df){
  calculateChance(pull(df %>% filter(team_id == home_team_id),expected_goals),
                                      pull(df %>% filter(team_id == away_team_id),expected_goals),
                                      10000)
}

And than just use it.

df <- shots_temp %>%
  group_by(match_id) %>%
  nest() %>%
  mutate(result = map(data, plot_func)) %>%
  ungroup() %>%
  unnest(result)
head(df)
# A tibble: 6 × 7
  match_id data               home   draw   away   homeP awayP
     <int> <list>             <chr>  <chr>  <chr>  <dbl> <dbl>
1  3915324 <tibble [18 × 43]> 3.07%  32.8%  64.13% 0.420 2.25 
2  3915311 <tibble [13 × 43]> 23.4%  65.91% 10.69% 1.36  0.980
3  3915310 <tibble [22 × 43]> 18.08% 29.35% 52.57% 0.836 1.87 
4  3915302 <tibble [9 × 43]>  5.82%  68.97% 25.21% 0.864 1.45 
5  3915309 <tibble [24 × 43]> 52.59% 28.04% 19.37% 1.86  0.862
6  3915308 <tibble [26 × 43]> 55.47% 26.45% 18.08% 1.93  0.807

Let’s join this with the data frame which contains the matches and teams

total_df <- df %>%
  select(match_id,homeP,awayP) %>%
  left_join(match_details %>% 
              group_by(match_id,home_team,away_team) %>% 
              nest()) 
Joining, by = "match_id"
# if you unnested the match_details earlier, you can just join by 'match_id'

# and sum all the xPoints per team
xpoints <- total_df %>%
  group_by(home_team) %>%
  summarise(pointsH = sum(homeP)) %>%
  left_join(total_df %>%
              group_by(away_team) %>%
              summarise(pointsA = sum(awayP)),by =c("home_team"="away_team")) %>%
  mutate(xPoints = pointsH + pointsA)
head(xpoints)
# A tibble: 6 × 4
  home_team        pointsH pointsA xPoints
  <chr>              <dbl>   <dbl>   <dbl>
1 Birmingham City     9.68    8.79    18.5
2 Blackburn Rovers   10.7     7.55    18.3
3 Blackpool           7.60    8.13    15.7
4 Bristol City       10.6     7.73    18.3
5 Burnley            11.7    10.2     21.9
6 Cardiff City        9.83    7.83    17.7

GET CURRENT LEAGUE TABLE

To make our own table complete, we need just one more data frame: the actual league table. I will show you the code of the function, but this one doesn’t work for me. If you use this function, be aware the every column name has the prefix ‘table_’ and that those do not appear in my code!

worldfootballR FotMob function to get table
table <- fotmob_get_league_tables(
  country =     "ENG",
  league_name = "Championship"
)

I dove in the code to find the place where the table should be and found it, so you can copy the code beneath.

safely_from_json <- purrr::safely(jsonlite::fromJSON, otherwise = NULL, quiet = TRUE)
jsonn <- safely_from_json("https://www.fotmob.com/api/leagues?id=48") 
table <- data.frame(jsonn$result$table$data$table$all)

Note the id=48 part at the end of the url. Change that to the desired competition code that you can find in the csv mentioned earlier.

JOIN TABLES

Now we “just” have to join these three data frames together. We select the id column as well, as it makes it easy to add the club logo.

xptable <- table %>%
  left_join(xpoints, by=c("name" = "home_team")) %>%
  separate(scoresStr, c("GF", "GA"),"-") %>%
  mutate(GF = as.numeric(GF),
         GA = as.numeric(GA),
         GD = GF - GA) %>%
  
  select(idx,id, name, played, wins,draws,losses,GF,GA,GD,pts,xPoints) %>%
  arrange(-xPoints) %>%
  mutate(xRank = c(1:length(table$name))) %>%
  left_join(xG_table, by = c("name" = "team_name")) %>%
  mutate(xGD = xGF - xGA)

Let’s arrange the table in the desired output.

xptable <- xptable %>%
  select(idx,id,name,played,wins,draws,losses,GF,GA,GD,pts,xGF,xGA,xGD,xPoints,xRank) %>%
  `colnames<-`(c("RANK","id", "TEAM", "P","W","D","L","GF","GA","GD","PTS","xGF","xGA","xGD","xPTS","xRANK"))

The club logo’s can be found at the following url: https://images.fotmob.com/image_resources/logo/teamlogo/{id}.png. So we can change the column that has the id’s so that it will contain the url to the logo. This will make it super easy to add the logo to the table.

xptable <- xptable %>%
  mutate(id = glue::glue("https://images.fotmob.com/image_resources/logo/teamlogo/{id}.png"))

The best way about this is that now the whole code is fool proof for every competition thtat FotMob has xG data for. Just change the name and competition code in a few functions and you’’ll get the table WITH the correct logo’s.

MAKING THE TABLE

I use reactable for the table on my site, as it is interactive. For this tutorial I’m using gt and gtExtras as those packages work great for static tables on websites. Just use the package you like the most. Here you can find the documentation on {gt} and a lot of other packages to make tables. Just pick the one you like the most. gt let’s you use HTML and I copied some code from the creator.

library(gt)

add_rank_color <- function(col1,col2){
  add_color <- if (col1 < col2) {
    "background:#61B861;"
  } else if (col1>col2) {
    "background:#FC785F;"
  } else if (col1 == col2) {
    "background:#FDD297;"
  }
  div_out <- htmltools::div(
    style = paste(
      "width: 20px;
  height: 20px;
  border: 1px solid rgba(0, 0, 0, 0.03);
  border-radius: 50%;
text-align: center;
  align-item: right;
   margin-left: 15px;
#  color: #000;
  font-size: 13px;
      font-weight:bold;",
      add_color
    ),col1
  )
  
  as.character(div_out) %>% 
    gt::html()
}

As I keep it simple/lazy, I just want to copy my xRank table but with a gt instead of reactable. The code is a bit messy as I normally keep my gt tables simpler.

So the above function colours the ‘RANK’ and ‘xRANK’ column according to which of the two is better/worse. This is so that I can use it in the code below that creates the table. The function has some css in there as well.

library(gtExtras)
xptable %>%
  mutate(
    RANK_temp = RANK,
    RANK = map2(RANK, xRANK, add_rank_color),
    xRANK = map2(xRANK, RANK_temp, add_rank_color)
  ) %>%
  select(-RANK_temp) %>%
  gt() %>%
  gt_img_rows(columns = id, img_source = "web", height = 17) %>%
  cols_label(
    id = " "
  ) %>%
  fmt_number(
    columns = c(xGF,xGA,xGD,xPTS),
    decimals = 1
  ) %>%
  cols_align(
    align = "center",
    columns = c(P:xRANK)
  ) %>%
  tab_style(
    style = list(
      
      cell_borders(
        sides = "left",
        color = "black",
        weight = px(3)
      )
      
    ),
    locations = list(
      cells_body(
        columns = c(P,xGF)
      )
    )
  ) %>%
  tab_spanner(
    label = "LEAGUE TABLE",
    columns = c(
      RANK:PTS
    )
  ) %>%
  tab_spanner(
    label = "EXPECTED TABLE",
    columns = c(
      xGF:xRANK
    )
  )  %>%
  cols_width(
    c(xGF:xPTS) ~ px(60),
    #c(RANK,xRANK) ~ px(30),
   TEAM ~ 150,
    everything() ~ px(50)
  ) %>% tab_style(
    locations = cells_column_labels(columns = everything()),
    style     = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold")
    )
  ) %>% tab_style(
    locations = list(
      cells_body(
        columns = c(PTS,xPTS)
      )
    ),
    style = list(
      cell_text(weight = "bold")
    )
  ) %>%
  opt_table_font(font = "Roboto Mono") %>%
  tab_options(
    row.striping.background_color = "#F6F8FA",
    row.striping.include_table_body = TRUE,
    data_row.padding = px(2),
    table.border.top.style = "hidden",
    #table.border.bottom.style = "hidden",
    table.font.size = "12px"
  ) %>%
  tab_header(md("**LEAGUE TABLE BASED ON EXPECTED POINTS**")) %>%
   tab_source_note(
    source_note = "xPoints calculated by simulating every shot in a match"
  )%>%
   tab_source_note(
    source_note = "Data from Opta via FotMob"
  )
LEAGUE TABLE BASED ON EXPECTED POINTS
LEAGUE TABLE EXPECTED TABLE
RANK TEAM P W D L GF GA GD PTS xGF xGA xGD xPTS xRANK
3
Norwich City 14 7 3 4 21 15 6 24 20.4 14.7 5.7 23.6
1
20
West Bromwich Albion 14 2 8 4 19 18 1 14 19.8 14.4 5.4 23.2
2
22
Middlesbrough 14 3 4 7 15 20 -5 13 18.2 13.2 5.0 23.0
3
2
Sheffield United 14 7 4 3 24 13 11 25 21.4 13.2 8.1 23.0
4
7
Luton Town 14 5 6 3 19 14 5 21 17.2 12.4 4.8 22.6
5
1
Burnley 14 6 7 1 24 12 12 25 17.5 14.0 3.5 21.9
6
10
Watford 14 5 5 4 19 17 2 20 17.1 14.5 2.6 21.0
7
15
Wigan Athletic 14 5 4 5 15 18 -3 19 16.7 14.7 2.0 20.3
8
8
Swansea City 14 6 3 5 17 20 -3 21 17.0 15.6 1.4 20.2
9
11
Millwall 14 6 2 6 17 18 -1 20 17.8 17.2 0.6 19.9
10
6
Reading 14 7 1 6 15 20 -5 22 15.2 14.6 0.6 19.9
11
4
Queens Park Rangers 14 7 3 4 20 16 4 24 16.3 15.8 0.4 19.5
12
9
Sunderland 14 5 5 4 19 14 5 20 15.0 15.3 −0.3 18.8
13
12
Birmingham City 14 5 4 5 14 12 2 19 15.2 17.6 −2.4 18.5
14
17
Bristol City 15 5 3 7 23 24 -1 18 18.6 21.8 −3.3 18.3
15
5
Blackburn Rovers 15 8 0 7 18 17 1 24 14.4 18.2 −3.9 18.3
16
14
Preston North End 15 4 7 4 8 10 -2 19 14.6 17.2 −2.5 17.9
17
18
Cardiff City 14 5 3 6 12 14 -2 18 13.7 13.9 −0.2 17.7
18
13
Stoke City 14 5 4 5 17 17 0 19 16.1 19.0 −2.9 17.5
19
19
Blackpool 14 4 4 6 16 20 -4 16 16.4 21.9 −5.5 15.7
20
16
Rotherham United 13 4 6 3 15 13 2 18 14.1 17.3 −3.2 15.4
21
24
Coventry City 11 2 4 5 9 14 -5 10 12.9 14.8 −1.9 13.5
22
23
Huddersfield Town 13 3 2 8 16 21 -5 11 14.1 19.4 −5.3 13.4
23
21
Hull City 14 4 2 8 13 28 -15 14 13.8 22.6 −8.9 12.5
24
xPoints calculated by simulating every shot in a match
Data from Opta via FotMob

SOME NOTES

As FotMob only has the minute for every shot, it is impossible to take rebounds into account. So if a team has two consecutive shots the xG is just summed up instead of first being factored with each other. This will skew some results, but I think it’s a small error and every team will profit/be disadvantaged by it in the long run.

Another point is that gamestate will be a big factor in these calculations. If you’re playing to just score and sit back, the opponent will gather more xG in most cases till they score the equalizer. After that it’s probably fair game again, but the results are skewed a little. If the equalizer comes earlier, I think the current gamestate is just as fair but the xG and therefore xPoints are much lower.


Support my work with a coffee

Share