Get your open-access event data from StatsBomb
This is the first post in a series. In this tutorial we will learn how to download open-access event data from StatsBomb using the Python package statsbombpy. Install statsbombpy with pip by running the following command:
pip install statsbombpy
StatsBomb's open data can be accessed without any authentication, but you are advised to read the Terms & Conditions section on their documentation page before using it.
We will now go step by step through extracting the relevant data. First, we need to import the statsbombpy package.
from statsbombpy import sb
We then import the numpy and pandas packages, which help us manipulate our datasets and carry out tasks such as data cleaning and data extraction.
import numpy as np
import pandas as pd
To load the competitions dataset, type the following:
comp = sb.competitions()
## credentials were not supplied. open data access only
The dataset comp can now be printed by typing the following (to_markdown relies on the optional tabulate package, so install it with pip if the call fails):
print(comp.to_markdown())
## | | competition_id | season_id | country_name | competition_name | competition_gender | season_name | match_updated | match_available |
## |---:|-----------------:|------------:|:-------------------------|:------------------------|:---------------------|:--------------|:---------------------------|:---------------------------|
## | 0 | 16 | 4 | Europe | Champions League | male | 2018/2019 | 2021-04-19T17:36:05.724116 | 2021-04-19T17:36:05.724116 |
## | 1 | 16 | 1 | Europe | Champions League | male | 2017/2018 | 2021-01-23T21:55:30.425330 | 2021-01-23T21:55:30.425330 |
## | 2 | 16 | 2 | Europe | Champions League | male | 2016/2017 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
## | 3 | 16 | 27 | Europe | Champions League | male | 2015/2016 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
## | 4 | 16 | 26 | Europe | Champions League | male | 2014/2015 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
## | 5 | 16 | 25 | Europe | Champions League | male | 2013/2014 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
## | 6 | 16 | 24 | Europe | Champions League | male | 2012/2013 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
## | 7 | 16 | 23 | Europe | Champions League | male | 2011/2012 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
## | 8 | 16 | 22 | Europe | Champions League | male | 2010/2011 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 9 | 16 | 21 | Europe | Champions League | male | 2009/2010 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 10 | 16 | 41 | Europe | Champions League | male | 2008/2009 | 2020-08-30T10:18:39.435424 | 2020-08-30T10:18:39.435424 |
## | 11 | 16 | 39 | Europe | Champions League | male | 2006/2007 | 2021-03-31T04:18:30.437060 | 2021-03-31T04:18:30.437060 |
## | 12 | 16 | 37 | Europe | Champions League | male | 2004/2005 | 2021-04-01T06:18:57.459032 | 2021-04-01T06:18:57.459032 |
## | 13 | 16 | 44 | Europe | Champions League | male | 2003/2004 | 2021-04-01T00:34:59.472485 | 2021-04-01T00:34:59.472485 |
## | 14 | 16 | 76 | Europe | Champions League | male | 1999/2000 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 15 | 37 | 42 | England | FA Women's Super League | female | 2019/2020 | 2021-04-28T19:48:01.172671 | 2021-04-28T19:48:01.172671 |
## | 16 | 37 | 4 | England | FA Women's Super League | female | 2018/2019 | 2021-04-28T19:48:01.166958 | 2021-04-28T19:48:01.166958 |
## | 17 | 43 | 3 | International | FIFA World Cup | male | 2018 | 2020-10-25T14:03:50.263266 | 2020-10-25T14:03:50.263266 |
## | 18 | 11 | 42 | Spain | La Liga | male | 2019/2020 | 2020-12-18T12:10:38.985394 | 2020-12-18T12:10:38.985394 |
## | 19 | 11 | 4 | Spain | La Liga | male | 2018/2019 | 2021-04-20T03:24:51.029365 | 2021-04-20T03:24:51.029365 |
## | 20 | 11 | 1 | Spain | La Liga | male | 2017/2018 | 2021-04-19T17:36:05.805404 | 2021-04-19T17:36:05.805404 |
## | 21 | 11 | 2 | Spain | La Liga | male | 2016/2017 | 2021-02-02T23:24:58.985975 | 2021-02-02T23:24:58.985975 |
## | 22 | 11 | 27 | Spain | La Liga | male | 2015/2016 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 23 | 11 | 26 | Spain | La Liga | male | 2014/2015 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 24 | 11 | 25 | Spain | La Liga | male | 2013/2014 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 25 | 11 | 24 | Spain | La Liga | male | 2012/2013 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 26 | 11 | 23 | Spain | La Liga | male | 2011/2012 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 27 | 11 | 22 | Spain | La Liga | male | 2010/2011 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 28 | 11 | 21 | Spain | La Liga | male | 2009/2010 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 29 | 11 | 41 | Spain | La Liga | male | 2008/2009 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 30 | 11 | 40 | Spain | La Liga | male | 2007/2008 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 31 | 11 | 39 | Spain | La Liga | male | 2006/2007 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 32 | 11 | 38 | Spain | La Liga | male | 2005/2006 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 33 | 11 | 37 | Spain | La Liga | male | 2004/2005 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 34 | 49 | 3 | United States of America | NWSL | female | 2018 | 2020-07-29T05:00 | 2020-07-29T05:00 |
## | 35 | 2 | 44 | England | Premier League | male | 2003/2004 | 2020-08-31T20:40:28.969635 | 2020-08-31T20:40:28.969635 |
## | 36 | 72 | 30 | International | Women's World Cup | female | 2019 | 2020-07-29T05:00 | 2020-07-29T05:00 |
We can list the column names of comp to understand the dataset better and pick out the relevant information. Type the following:
print(comp.columns)
## Index(['competition_id', 'season_id', 'country_name', 'competition_name',
## 'competition_gender', 'season_name', 'match_updated',
## 'match_available'],
## dtype='object')
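Instead of scanning the printed table by eye, the IDs of a season can also be looked up by filtering on competition_name and season_name. The sketch below is only an illustration: it rebuilds three rows of comp by hand (values copied from the output above) so that it runs offline, and the helper find_ids is a hypothetical name written for this post, not part of statsbombpy. With a live connection you would pass the real comp instead of comp_sample.

```python
import pandas as pd

# Three rows copied from the printed competitions table, so the example
# runs without a network connection. With live data, pass comp instead.
comp_sample = pd.DataFrame(
    {
        "competition_id": [16, 11, 11],
        "season_id": [4, 42, 4],
        "country_name": ["Europe", "Spain", "Spain"],
        "competition_name": ["Champions League", "La Liga", "La Liga"],
        "season_name": ["2018/2019", "2019/2020", "2018/2019"],
    }
)


def find_ids(competitions, competition_name, season_name):
    """Return (competition_id, season_id) for one competition and season.

    Hypothetical helper for this post; it is not a statsbombpy function.
    """
    mask = (competitions["competition_name"] == competition_name) & (
        competitions["season_name"] == season_name
    )
    rows = competitions.loc[mask]
    if len(rows) != 1:
        raise ValueError(f"expected exactly one row, found {len(rows)}")
    row = rows.iloc[0]
    return int(row["competition_id"]), int(row["season_id"])


competition_id, season_id = find_ids(comp_sample, "La Liga", "2019/2020")
print(competition_id, season_id)
```

Raising an error when the filter does not match exactly one row is a deliberate choice here: a misspelled season name then fails loudly instead of silently returning the wrong IDs.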
Let us make sense of a particular row of the comp dataset. For example, in row number 18 the competition_id is 11, the season_id is 42, the country_name is Spain, the competition_name is La Liga, the season_name is 2019/2020, and so on. Suppose we want to analyze a game from the 2019/20 La Liga season. Row number 18 of comp describes exactly that season, so we note its competition_id and season_id, which are 11 and 42 respectively. Now we extract the matches dataset by typing the following:
mat = sb.matches(competition_id=11, season_id=42)
## credentials were not supplied. open data access only
print(mat.to_markdown())
## | | match_id | match_date | kick_off | competition | season | home_team | away_team | home_score | away_score | match_status | match_status_360 | last_updated | last_updated_360 | match_week | competition_stage | stadium | referee | data_version | shot_fidelity_version | xy_fidelity_version |
## |---:|-----------:|:-------------|:-------------|:----------------|:----------|:-----------------|:-----------------|-------------:|-------------:|:---------------|:-------------------|:---------------------------|:-------------------|-------------:|:--------------------|:--------------------------------|:------------------|:---------------|------------------------:|----------------------:|
## | 0 | 303421 | 2020-07-19 | 17:00:00.000 | Spain - La Liga | 2019/2020 | Deportivo Alavés | Barcelona | 0 | 5 | available | unscheduled | 2020-07-29T05:00 | | 38 | Regular Season | Estadio de Mendizorroza | J. Martínez | 1.1.0 | 2 | 2 |
## | 1 | 303493 | 2020-06-23 | 22:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Athletic Bilbao | 1 | 0 | available | unscheduled | 2020-07-29T05:00 | | 31 | Regular Season | Camp Nou | Jesús Gil | 1.1.0 | 2 | 2 |
## | 2 | 303516 | 2020-06-27 | 17:00:00.000 | Spain - La Liga | 2019/2020 | Celta Vigo | Barcelona | 2 | 2 | available | unscheduled | 2020-07-29T05:00 | | 32 | Regular Season | Abanca-Balaídos | G. Cuadra | 1.1.0 | 2 | 2 |
## | 3 | 303680 | 2020-07-11 | 19:30:00.000 | Spain - La Liga | 2019/2020 | Real Valladolid | Barcelona | 0 | 1 | available | unscheduled | 2020-12-18T12:10:38.985394 | | 36 | Regular Season | Estadio Municipal José Zorrilla | Antonio Mateu | 1.1.0 | 2 | 2 |
## | 4 | 303532 | 2020-06-16 | 22:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Leganés | 2 | 0 | available | unscheduled | 2020-07-29T05:00 | | 29 | Regular Season | Camp Nou | J. Martínez | 1.1.0 | 2 | 2 |
## | 5 | 303400 | 2020-01-25 | 16:00:00.000 | Spain - La Liga | 2019/2020 | Valencia | Barcelona | 2 | 0 | available | unscheduled | 2020-07-29T05:00 | | 21 | Regular Season | Estadio de Mestalla | Jesús Gil | 1.1.0 | 2 | 2 |
## | 6 | 303634 | 2020-07-16 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Osasuna | 1 | 2 | available | unscheduled | 2020-09-18T13:16:12.825671 | | 37 | Regular Season | Camp Nou | J. Sánchez | 1.1.0 | 2 | 2 |
## | 7 | 303479 | 2020-03-07 | 18:30:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Real Sociedad | 1 | 0 | available | unscheduled | 2020-07-29T05:00 | | 27 | Regular Season | Camp Nou | J. Martínez | 1.1.0 | 2 | 2 |
## | 8 | 303615 | 2020-07-08 | 22:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Espanyol | 1 | 0 | available | unscheduled | 2020-09-11T23:12:41.238499 | | 35 | Regular Season | Camp Nou | J. Munuera | 1.1.0 | 2 | 2 |
## | 9 | 303696 | 2020-06-30 | 22:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Atlético Madrid | 2 | 2 | available | unscheduled | 2020-07-29T05:00 | | 33 | Regular Season | Camp Nou | nan | 1.1.0 | 2 | 2 |
## | 10 | 303664 | 2019-12-14 | 16:00:00.000 | Spain - La Liga | 2019/2020 | Real Sociedad | Barcelona | 2 | 2 | available | unscheduled | 2020-07-29T05:00 | | 17 | Regular Season | Reale Arena | J. Alberola Rojas | 1.1.0 | 2 | 2 |
## | 11 | 303596 | 2019-12-18 | 20:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Real Madrid | 0 | 0 | available | unscheduled | 2020-07-29T05:00 | | 10 | Regular Season | Camp Nou | A. Hernández | 1.1.0 | 2 | 2 |
## | 12 | 303487 | 2019-11-09 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Celta Vigo | 4 | 1 | available | unscheduled | 2020-07-29T05:00 | | 13 | Regular Season | Camp Nou | G. Cuadra | 1.1.0 | 2 | 2 |
## | 13 | 303600 | 2019-10-29 | 21:15:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Real Valladolid | 5 | 1 | available | unscheduled | 2020-07-29T05:00 | | 11 | Regular Season | Camp Nou | J. Alberola Rojas | 1.1.0 | 2 | 2 |
## | 14 | 303548 | 2020-06-13 | 22:00:00.000 | Spain - La Liga | 2019/2020 | Mallorca | Barcelona | 0 | 4 | available | unscheduled | 2020-07-29T05:00 | | 28 | Regular Season | Iberostar Estadi | C. Del Cerro | 1.1.0 | 2 | 2 |
## | 15 | 303473 | 2019-10-06 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Sevilla | 4 | 0 | available | unscheduled | 2020-07-29T05:00 | | 8 | Regular Season | Camp Nou | Antonio Mateu | 1.1.0 | 2 | 2 |
## | 16 | 303610 | 2020-01-19 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Granada | 1 | 0 | available | unscheduled | 2020-07-29T05:00 | | 20 | Regular Season | Camp Nou | V. Pizarro | 1.1.0 | 2 | 2 |
## | 17 | 303652 | 2020-01-04 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Espanyol | Barcelona | 2 | 2 | available | unscheduled | 2020-07-29T05:00 | | 19 | Regular Season | RCDE Stadium | C. Del Cerro | 1.1.0 | 2 | 2 |
## | 18 | 303430 | 2019-09-24 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Villarreal | 2 | 1 | available | unscheduled | 2020-07-29T05:00 | | 6 | Regular Season | Camp Nou | R. De Burgos | 1.1.0 | 2 | 2 |
## | 19 | 303674 | 2020-06-19 | 22:00:00.000 | Spain - La Liga | 2019/2020 | Sevilla | Barcelona | 0 | 0 | available | unscheduled | 2020-07-29T05:00 | | 30 | Regular Season | Estadio Ramón Sánchez Pizjuán | José González | 1.1.0 | 2 | 2 |
## | 20 | 303470 | 2020-03-01 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Real Madrid | Barcelona | 2 | 0 | available | unscheduled | 2020-07-29T05:00 | | 26 | Regular Season | Estadio Santiago Bernabéu | Antonio Mateu | 1.1.0 | 2 | 2 |
## | 21 | 303700 | 2019-10-19 | 13:00:00.000 | Spain - La Liga | 2019/2020 | Eibar | Barcelona | 0 | 3 | available | unscheduled | 2020-07-29T05:00 | | 9 | Regular Season | Estadio Municipal de Ipurúa | M. Melero | 1.1.0 | 2 | 2 |
## | 22 | 303707 | 2020-02-09 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Real Betis | Barcelona | 2 | 3 | available | unscheduled | 2020-07-29T05:00 | | 23 | Regular Season | Estadio Benito Villamarín | J. Sánchez | 1.1.0 | 2 | 2 |
## | 23 | 303666 | 2019-09-21 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Granada | Barcelona | 2 | 0 | available | unscheduled | 2020-07-29T05:00 | | 5 | Regular Season | Estadio Nuevo Los Cármenes | G. Cuadra | 1.1.0 | 2 | 2 |
## | 24 | 303725 | 2020-07-05 | 22:00:00.000 | Spain - La Liga | 2019/2020 | Villarreal | Barcelona | 1 | 4 | available | unscheduled | 2020-07-29T05:00 | | 34 | Regular Season | Estadio de la Cerámica | C. Del Cerro | 1.1.0 | 2 | 2 |
## | 25 | 303504 | 2019-11-02 | 16:00:00.000 | Spain - La Liga | 2019/2020 | Levante | Barcelona | 3 | 1 | available | unscheduled | 2020-07-29T05:00 | | 12 | Regular Season | Estadio Ciudad de Valencia | A. Hernández | 1.1.0 | 2 | 2 |
## | 26 | 303715 | 2019-11-23 | 13:00:00.000 | Spain - La Liga | 2019/2020 | Leganés | Barcelona | 1 | 2 | available | unscheduled | 2020-07-29T05:00 | | 14 | Regular Season | Estadio Municipal de Butarque | S. Jaime | 1.1.0 | 2 | 2 |
## | 27 | 303377 | 2020-02-15 | 16:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Getafe | 2 | 1 | available | unscheduled | 2020-07-29T05:00 | | 24 | Regular Season | Camp Nou | G. Cuadra | 1.1.0 | 2 | 2 |
## | 28 | 303524 | 2019-12-01 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Atlético Madrid | Barcelona | 0 | 1 | available | unscheduled | 2020-07-29T05:00 | | 15 | Regular Season | Estadio Wanda Metropolitano | Antonio Mateu | 1.1.0 | 2 | 2 |
## | 29 | 303451 | 2019-12-07 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Mallorca | 5 | 2 | available | unscheduled | 2020-07-29T05:00 | | 16 | Regular Season | Camp Nou | J. Munuera | 1.1.0 | 2 | 2 |
## | 30 | 303517 | 2019-12-21 | 16:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Deportivo Alavés | 4 | 1 | available | unscheduled | 2020-07-29T05:00 | | 18 | Regular Season | Camp Nou | M. Melero | 1.1.0 | 2 | 2 |
## | 31 | 303682 | 2020-02-02 | 21:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Levante | 2 | 1 | available | unscheduled | 2020-07-29T05:00 | | 22 | Regular Season | Camp Nou | A. Cordero | 1.1.0 | 2 | 2 |
## | 32 | 303731 | 2020-02-22 | 16:00:00.000 | Spain - La Liga | 2019/2020 | Barcelona | Eibar | 5 | 0 | available | unscheduled | 2020-07-29T05:00 | | 25 | Regular Season | Camp Nou | C. Soto | 1.1.0 | 2 | 2 |
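The printed matches table is not in chronological order (row 0 is match week 38, row 18 is match week 6), so it can help to sort it before reading. The following is a minimal sketch that assumes match_date arrives as text in YYYY-MM-DD form, as it appears in the output above. It uses four rows copied by hand from that output so that it runs offline; mat_sample and mat_sorted are names chosen for this example only.

```python
import pandas as pd

# Four rows copied from the printed matches table (rows 0, 18, 23 and 20).
mat_sample = pd.DataFrame(
    {
        "match_id": [303421, 303430, 303666, 303470],
        "match_date": ["2020-07-19", "2019-09-24", "2019-09-21", "2020-03-01"],
        "home_team": ["Deportivo Alavés", "Barcelona", "Granada", "Real Madrid"],
        "away_team": ["Barcelona", "Villarreal", "Barcelona", "Barcelona"],
        "home_score": [0, 2, 2, 2],
        "away_score": [5, 1, 0, 0],
        "match_week": [38, 6, 5, 26],
    }
)

# Convert the date text to real datetimes, then sort from earliest to latest.
mat_sorted = (
    mat_sample.assign(match_date=pd.to_datetime(mat_sample["match_date"]))
    .sort_values("match_date")
    .reset_index(drop=True)
)

print(mat_sorted[["match_date", "home_team", "away_team", "home_score", "away_score"]])
```

With the real data, the same chain can be applied to mat directly.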
Once we have the matches dataset mat, we can look at its column names to see what information it holds.
print(mat.columns)
## Index(['match_id', 'match_date', 'kick_off', 'competition', 'season',
## 'home_team', 'away_team', 'home_score', 'away_score', 'match_status',
## 'match_status_360', 'last_updated', 'last_updated_360', 'match_week',
## 'competition_stage', 'stadium', 'referee', 'data_version',
## 'shot_fidelity_version', 'xy_fidelity_version'],
## dtype='object')
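The home_team and away_team columns make it possible to look up a fixture's match_id by name rather than by reading the table. Here is a small sketch under the same assumptions as before: three rows are copied by hand from the printed matches output so the code runs offline, and find_match_ids is a hypothetical helper, not a statsbombpy function.

```python
import pandas as pd

# Three rows copied from the printed matches table (rows 11, 20 and 7).
mat_sample = pd.DataFrame(
    {
        "match_id": [303596, 303470, 303479],
        "match_date": ["2019-12-18", "2020-03-01", "2020-03-07"],
        "home_team": ["Barcelona", "Real Madrid", "Barcelona"],
        "away_team": ["Real Madrid", "Barcelona", "Real Sociedad"],
        "home_score": [0, 2, 1],
        "away_score": [0, 0, 0],
    }
)


def find_match_ids(matches, home_team, away_team):
    """Return the match_id values for a given home/away pairing.

    Hypothetical helper for this post. It returns a list because a
    pairing could in principle appear more than once in a dataset.
    """
    mask = (matches["home_team"] == home_team) & (matches["away_team"] == away_team)
    return matches.loc[mask, "match_id"].tolist()


home_ids = find_match_ids(mat_sample, "Real Madrid", "Barcelona")
print(home_ids)
```

Swapping the two team names returns the reverse fixture, so home and away legs stay distinct.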
Evidently, the mat dataset gives us the match ids, the match dates, the kick-off times, the home and away teams, the score of each match, the name of the referee who officiated it, and so on. Here match_id is the unique id that lets us pull the event data for a particular match from the 2019/20 La Liga season. Let us get the event data for one match. First, a confession: I am a hardcore Real Madrid fan and an avid follower of European football. Looking through the mat dataset, row number 20, with match_id = 303470, catches my attention the most. It was a Real Madrid vs. Barcelona match that took place at the Estadio Santiago Bernabéu before the pandemic, and it ended 2-0 in Real Madrid’s favor 👀 👀 👀 👀. A great feat, to be honest! Let us obtain the event data for this match.
events = sb.events(match_id=303470)
## credentials were not supplied. open data access only
eh = events.head()  # keep only the first five rows
print(eh.to_markdown())
## | | bad_behaviour_card | ball_receipt_outcome | ball_recovery_recovery_failure | block_deflection | block_save_block | carry_end_location | clearance_aerial_won | clearance_body_part | clearance_head | clearance_left_foot | clearance_right_foot | counterpress | dribble_no_touch | dribble_outcome | dribble_overrun | duel_outcome | duel_type | duration | foul_committed_advantage | foul_committed_card | foul_committed_offensive | foul_committed_type | foul_won_advantage | foul_won_defensive | goalkeeper_body_part | goalkeeper_end_location | goalkeeper_outcome | goalkeeper_position | goalkeeper_success_in_play | goalkeeper_technique | goalkeeper_type | id | index | interception_outcome | location | match_id | minute | miscontrol_aerial_won | off_camera | out | pass_aerial_won | pass_angle | pass_assisted_shot_id | pass_body_part | pass_cross | pass_cut_back | pass_deflected | pass_end_location | pass_goal_assist | pass_height | pass_inswinging | pass_length | pass_miscommunication | pass_no_touch | pass_outcome | pass_outswinging | pass_recipient | pass_shot_assist | pass_switch | pass_technique | pass_through_ball | pass_type | period | play_pattern | player | position | possession | possession_team | related_events | second | shot_body_part | shot_deflected | shot_end_location | shot_first_time | shot_freeze_frame | shot_key_pass_id | shot_one_on_one | shot_outcome | shot_statsbomb_xg | shot_technique | shot_type | substitution_outcome | substitution_replacement | tactics | team | timestamp | type | under_pressure |
## |---:|---------------------:|-----------------------:|---------------------------------:|-------------------:|-------------------:|---------------------:|-----------------------:|----------------------:|-----------------:|----------------------:|-----------------------:|---------------:|-------------------:|------------------:|------------------:|---------------:|------------:|-----------:|---------------------------:|----------------------:|---------------------------:|----------------------:|---------------------:|---------------------:|-----------------------:|--------------------------:|---------------------:|----------------------:|-----------------------------:|-----------------------:|------------------:|:-------------------------------------|--------:|-----------------------:|-----------:|-----------:|---------:|------------------------:|-------------:|------:|------------------:|-------------:|------------------------:|-----------------:|-------------:|----------------:|-----------------:|--------------------:|-------------------:|--------------:|------------------:|--------------:|------------------------:|----------------:|---------------:|-------------------:|-----------------:|-------------------:|--------------:|-----------------:|--------------------:|------------:|---------:|:---------------|---------:|-----------:|-------------:|:------------------|:-----------------------------------------|---------:|-----------------:|-----------------:|--------------------:|------------------:|--------------------:|-------------------:|------------------:|---------------:|--------------------:|-----------------:|------------:|-----------------------:|---------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------|:-------------|:------------|-----------------:|
## | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 3eba28e7-64c2-4e95-ac89-5d9948154c1d | 1 | nan | nan | 303470 | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 1 | Regular Play | nan | nan | 1 | Real Madrid | nan | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | {'formation': 4141, 'lineup': [{'player': {'id': 3509, 'name': 'Thibaut Courtois'}, 'position': {'id': 1, 'name': 'Goalkeeper'}, 'jersey_number': 13}, {'player': {'id': 5721, 'name': 'Daniel Carvajal Ramos'}, 'position': {'id': 2, 'name': 'Right Back'}, 'jersey_number': 2}, {'player': {'id': 5485, 'name': 'Raphaël Varane'}, 'position': {'id': 3, 'name': 'Right Center Back'}, 'jersey_number': 5}, {'player': {'id': 5201, 'name': 'Sergio Ramos García'}, 'position': {'id': 5, 'name': 'Left Center Back'}, 'jersey_number': 4}, {'player': {'id': 5552, 'name': 'Marcelo Vieira da Silva Júnior'}, 'position': {'id': 6, 'name': 'Left Back'}, 'jersey_number': 12}, {'player': {'id': 5539, 'name': 'Carlos Henrique Casimiro'}, 'position': {'id': 10, 'name': 'Center Defensive Midfield'}, 'jersey_number': 14}, {'player': {'id': 6773, 'name': 'Federico Santiago Valverde Dipetta'}, 'position': {'id': 12, 'name': 'Right Midfield'}, 'jersey_number': 15}, {'player': {'id': 4926, 'name': 'Francisco Román Alarcón Suárez'}, 'position': {'id': 13, 'name': 'Right Center Midfield'}, 'jersey_number': 22}, {'player': {'id': 5574, 'name': 'Toni Kroos'}, 'position': {'id': 15, 'name': 'Left Center Midfield'}, 'jersey_number': 8}, {'player': {'id': 18395, 'name': 'Vinícius José Paixão de Oliveira Júnior'}, 'position': {'id': 16, 'name': 'Left Midfield'}, 'jersey_number': 25}, {'player': {'id': 19677, 'name': 'Karim Benzema'}, 'position': {'id': 23, 'name': 
'Center Forward'}, 'jersey_number': 9}]} | Real Madrid | 00:00:00.000 | Starting XI | nan |
## | 1 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | e9a4b6da-b58f-4c56-a0e3-65bd23a17473 | 2 | nan | nan | 303470 | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 1 | Regular Play | nan | nan | 1 | Real Madrid | nan | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | {'formation': 442, 'lineup': [{'player': {'id': 20055, 'name': 'Marc-André ter Stegen'}, 'position': {'id': 1, 'name': 'Goalkeeper'}, 'jersey_number': 1}, {'player': {'id': 6374, 'name': 'Nélson Cabral Semedo'}, 'position': {'id': 2, 'name': 'Right Back'}, 'jersey_number': 2}, {'player': {'id': 5213, 'name': 'Gerard Piqué Bernabéu'}, 'position': {'id': 3, 'name': 'Right Center Back'}, 'jersey_number': 3}, {'player': {'id': 5492, 'name': 'Samuel Yves Umtiti'}, 'position': {'id': 5, 'name': 'Left Center Back'}, 'jersey_number': 23}, {'player': {'id': 5211, 'name': 'Jordi Alba Ramos'}, 'position': {'id': 6, 'name': 'Left Back'}, 'jersey_number': 18}, {'player': {'id': 11392, 'name': 'Arthur Henrique Ramos de Oliveira Melo'}, 'position': {'id': 9, 'name': 'Right Defensive Midfield'}, 'jersey_number': 8}, {'player': {'id': 5203, 'name': 'Sergio Busquets i Burgos'}, 'position': {'id': 11, 'name': 'Left Defensive Midfield'}, 'jersey_number': 5}, {'player': {'id': 8206, 'name': 'Arturo Erasmo Vidal Pardo'}, 'position': {'id': 12, 'name': 'Right Midfield'}, 'jersey_number': 22}, {'player': {'id': 8118, 'name': 'Frenkie de Jong'}, 'position': {'id': 16, 'name': 'Left Midfield'}, 'jersey_number': 21}, {'player': {'id': 5503, 'name': 'Lionel Andrés Messi Cuccittini'}, 'position': {'id': 22, 'name': 'Right Center Forward'}, 'jersey_number': 10}, {'player': {'id': 5487, 'name': 'Antoine Griezmann'}, 'position': {'id': 24, 'name': 'Left 
Center Forward'}, 'jersey_number': 17}]} | Barcelona | 00:00:00.000 | Starting XI | nan |
## | 2 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 0b6f982e-e92c-4732-bc96-b28176e28b28 | 3 | nan | nan | 303470 | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 1 | Regular Play | nan | nan | 1 | Real Madrid | ['2e806582-2185-47ab-a971-b2898b602ea7'] | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | Barcelona | 00:00:00.000 | Half Start | nan |
## | 3 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 2e806582-2185-47ab-a971-b2898b602ea7 | 4 | nan | nan | 303470 | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 1 | Regular Play | nan | nan | 1 | Real Madrid | ['0b6f982e-e92c-4732-bc96-b28176e28b28'] | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | Real Madrid | 00:00:00.000 | Half Start | nan |
## | 4 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 91ccd2c5-6471-4fa1-aab6-b8fbe18d57af | 2410 | nan | nan | 303470 | 45 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | 2 | From Free Kick | nan | nan | 121 | Barcelona | ['95da779c-a2df-4b0c-a486-661a30ccce63'] | 0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | Barcelona | 00:00:00.000 | Half Start | nan |
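The tactics column of the two Starting XI rows holds a nested dictionary with the formation and the lineup, which is hard to read in the printed table. A minimal sketch of flattening it with plain Python follows. The dictionary below is a shortened copy of the Real Madrid value shown above (only three of the eleven players are kept), and flatten_lineup is a hypothetical helper written for this post.

```python
# Shortened copy of the tactics value from the first Starting XI row;
# only three of the eleven lineup entries are kept for brevity.
tactics = {
    "formation": 4141,
    "lineup": [
        {
            "player": {"id": 3509, "name": "Thibaut Courtois"},
            "position": {"id": 1, "name": "Goalkeeper"},
            "jersey_number": 13,
        },
        {
            "player": {"id": 5574, "name": "Toni Kroos"},
            "position": {"id": 15, "name": "Left Center Midfield"},
            "jersey_number": 8,
        },
        {
            "player": {"id": 19677, "name": "Karim Benzema"},
            "position": {"id": 23, "name": "Center Forward"},
            "jersey_number": 9,
        },
    ],
}


def flatten_lineup(tactics):
    """Turn the nested lineup into a list of flat dicts (hypothetical helper)."""
    return [
        {
            "player_id": entry["player"]["id"],
            "player": entry["player"]["name"],
            "position": entry["position"]["name"],
            "jersey_number": entry["jersey_number"],
        }
        for entry in tactics["lineup"]
    ]


lineup = flatten_lineup(tactics)
for row in lineup:
    print(row["jersey_number"], row["player"], row["position"])
```

The resulting list of flat dictionaries can be passed straight to pd.DataFrame(lineup) if a table is more convenient.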
We now have all the events from the Real Madrid vs. Barcelona match. There are 88 columns in events. You can print the column names to get a clearer overview of what kinds of events to expect from the match.
print(events.columns)
## Index(['bad_behaviour_card', 'ball_receipt_outcome',
## 'ball_recovery_recovery_failure', 'block_deflection',
## 'block_save_block', 'carry_end_location', 'clearance_aerial_won',
## 'clearance_body_part', 'clearance_head', 'clearance_left_foot',
## 'clearance_right_foot', 'counterpress', 'dribble_no_touch',
## 'dribble_outcome', 'dribble_overrun', 'duel_outcome', 'duel_type',
## 'duration', 'foul_committed_advantage', 'foul_committed_card',
## 'foul_committed_offensive', 'foul_committed_type', 'foul_won_advantage',
## 'foul_won_defensive', 'goalkeeper_body_part', 'goalkeeper_end_location',
## 'goalkeeper_outcome', 'goalkeeper_position',
## 'goalkeeper_success_in_play', 'goalkeeper_technique', 'goalkeeper_type',
## 'id', 'index', 'interception_outcome', 'location', 'match_id', 'minute',
## 'miscontrol_aerial_won', 'off_camera', 'out', 'pass_aerial_won',
## 'pass_angle', 'pass_assisted_shot_id', 'pass_body_part', 'pass_cross',
## 'pass_cut_back', 'pass_deflected', 'pass_end_location',
## 'pass_goal_assist', 'pass_height', 'pass_inswinging', 'pass_length',
## 'pass_miscommunication', 'pass_no_touch', 'pass_outcome',
## 'pass_outswinging', 'pass_recipient', 'pass_shot_assist', 'pass_switch',
## 'pass_technique', 'pass_through_ball', 'pass_type', 'period',
## 'play_pattern', 'player', 'position', 'possession', 'possession_team',
## 'related_events', 'second', 'shot_body_part', 'shot_deflected',
## 'shot_end_location', 'shot_first_time', 'shot_freeze_frame',
## 'shot_key_pass_id', 'shot_one_on_one', 'shot_outcome',
## 'shot_statsbomb_xg', 'shot_technique', 'shot_type',
## 'substitution_outcome', 'substitution_replacement', 'tactics', 'team',
## 'timestamp', 'type', 'under_pressure'],
## dtype='object')
This completes our post on how to get access to open event data for a particular football match. To go further, we need to filter out only those events that we want to analyze and draw conclusions from. These ideas will make more sense once we dive deeper into analyses in future posts. In the next post we will learn how to visualize a football pitch using the mplsoccer package.
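As a small taste of the filtering mentioned above, the sketch below selects events by their type and counts how many events of each type there are. It is an illustration under stated assumptions: events_sample is built by hand from five columns of the five rows printed by events.head() earlier, so it only contains Starting XI and Half Start events. The full events table should contain many more types (the pass_ and shot_ columns suggest passes and shots, for instance). The rows are also sorted by the index column, because the head output jumps from index 4 to 2410, which suggests the table is not in chronological order.

```python
import pandas as pd

# Five columns of the five rows shown by events.head() above.
events_sample = pd.DataFrame(
    {
        "index": [1, 2, 3, 4, 2410],
        "period": [1, 1, 1, 1, 2],
        "minute": [0, 0, 0, 0, 45],
        "team": ["Real Madrid", "Barcelona", "Barcelona", "Real Madrid", "Barcelona"],
        "type": ["Starting XI", "Starting XI", "Half Start", "Half Start", "Half Start"],
    }
)

# Keep only one kind of event, ordered by the index column.
half_starts = events_sample[events_sample["type"] == "Half Start"].sort_values("index")

# Count how many events of each type the sample contains.
counts = events_sample["type"].value_counts()

print(half_starts)
print(counts)
```

With the real data, replacing events_sample with events and "Half Start" with another value from the type column gives the subset for that kind of event.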