Application of ARIMA and artificial neural networks in the statistical analysis and prediction of football match results in the 2021-2022 English Premier League
DOI:
https://doi.org/10.14488/1676-1901.v25i3.4965
Keywords:
Soccer results prediction, ARIMA, Artificial Neural Networks, English Premier League
Abstract
Statistical data and tools for predicting and analyzing results are now a routine part of sport. In football, such tools are treated as a source of competitive advantage for both clubs and bettors. To obtain an efficient prediction model for football matches, this study sought to predict the results of the 2021-2022 English Premier League season. Data from 70% of the matches played in that championship were collected and used to apply and compare the Autoregressive Integrated Moving Average (ARIMA) model and Artificial Neural Networks (ANNs). The ARIMA method predicted a complete set of match statistics, including the number of goals scored by each team in each match; it identified the winner correctly in 54.39% of matches and the exact score in 14.04%. The ANN model, implemented in Python, predicted only the winner of each match and achieved a higher accuracy of 72.81%, which was visualized with a confusion matrix.
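As a rough illustration of how an accuracy figure can be read off a confusion matrix, the sketch below builds a three-class matrix (home win, draw, away win) from toy labels and divides the diagonal by the total. The class names, the helper functions, and the toy data are all hypothetical and are not taken from the study. The final lines rest on a further assumption that the abstract does not state: that the remaining 30% of a 380-match season (114 matches) served as the test set. Under that assumption, the reported percentages would correspond to 83, 62, and 16 matches.

```python
from collections import Counter

# Hypothetical class names; the study's own encoding is not given in the abstract.
LABELS = ["home", "draw", "away"]


def confusion_matrix(actual, predicted, labels=LABELS):
    """Rows are actual outcomes, columns are predicted outcomes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of matches on the diagonal (predicted outcome == actual outcome)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy data for illustration only, not results from the paper.
actual = ["home", "home", "draw", "away", "away", "home"]
predicted = ["home", "draw", "draw", "away", "home", "home"]

matrix = confusion_matrix(actual, predicted)
toy_accuracy = accuracy(matrix)

# Assumption: a 380-match season with the last 30% held out gives 114 test matches.
test_matches = 380 * 30 // 100
ann_pct = round(100 * 83 / test_matches, 2)          # winner, ANN
arima_pct = round(100 * 62 / test_matches, 2)        # winner, ARIMA
arima_score_pct = round(100 * 16 / test_matches, 2)  # exact score, ARIMA
```

Reading the matrix row by row also shows which outcome a model tends to confuse with which, something a single accuracy number hides.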
License
Copyright (c) 2025 Revista Produção Online

This work is licensed under a Creative Commons Attribution 4.0 International License.
The journal reserves the right to make spelling and grammatical changes in order to maintain standard language, while respecting the authors' style.
The published work is the responsibility of the author(s); the Revista Produção Online is responsible only for evaluating the paper. The Revista Produção Online is not responsible for any violations of Law No. 9.610/1998, the Copyright Act.
The journal allows authors to retain the copyright of accepted articles without restrictions.
