A Quantum Prologue
A quantum revolution in knowledge is already a progressive reality, making evident the prospects for quantum computing and ushering in a new era of quantum information science. On the one hand, this reality is promising for the future of data-driven modelling and simulations. On the other hand, outcomes that we have always assumed to be deterministic turn out to be shrouded in the mystery of probabilities. We are not talking here about the scientific proof by the 2022 Nobel laureates in physics, which debunked Einstein’s doubt about quantum entanglement, but about football at its climax on the global stage – the FIFA World Cup. Many people refer to football as a game of chance. Findings from mathematical models of soccer, however, prove that we can rely on prediction models to narrow uncertainties in the final outcome, if only we can determine the right variables and parameterise them precisely.
Support for Germany at the World Cup Tournament
It is now evident that Germany gets many well-wishers from Africa at the World Cup tournament, thanks to the long-term footprint of DAAD programmes, and not least the Centres of African Excellence. Taita Taveta University staff and students were among the well-wishers, the university being the host to the Kenyan-German Centre of Excellence for Mining, Environmental Engineering and Resource Management (CEMEREM). There is a compelling case for using sports modelling to popularise STEM subjects at school and to influence policy development to favour national teams to excel at international tournaments. Read on to know how and why.
“DEUTSCHLAND against COSTA RICA. Even the most quixotic simulation still gives Germany 1% advantage over Costa Rica. Unglaubliche Weltanschauung!”
Such were the social media and blog posts on Facebook, Twitter, and IBD World Cup Edition, drawn from a less public version of the World Cup prediction model that a Taita Taveta University (TTU) lecturer conceptualised and calibrated in 2018. The model was then applied to predict France’s 2018 World Cup victory against Croatia, giving France a mean marginal advantage of 1.9% over Croatia. The 2022 World Cup was another opportunity to entertain fans mathematically with the model’s predictions, and fans demanded them even while TTU staff and students were still held up in the busy end-of-semester examinations. We had to bow to the pressure. Students also had to create time to watch some matches against the predictions. The prediction journey began with Saudi Arabia against Poland in the group stages, hence 24 matches in that category. In total, the model predicted the outcome of 40 matches at the 2022 World Cup tournament. The predictions performed remarkably well, and the model’s unique value proposition set it apart from the more publicised prediction models attributed to FIFA, Oxford University, several US universities, and Gondwana University, among others. As such, Kenya was not left behind in the parade of World Cup modelling ingenuity.
Soccer and World Cup Modelling
When it comes to World Cup prediction models, more in the limelight in 2022 has been the prediction model attributed to an Oxford University mathematician. On the fun side, Paul the Octopus also became a World Cup prediction celebrity in 2010, though not based on any scientific grounds. Prof. David Sumpter has, however, done a commendable job in mathematical modelling of soccer, which has culminated in his postgraduate programme named Soccermatics.
Several attempts at World Cup modelling were made in 2022. A key example is the Oxford University model, which used regressions to predict a win for Brazil. Dan Williams of Cambridge Intelligence used graph theory, with algorithms based on the international connections of the team players, to predict a win for France. Astrological predictions also featured, giving Brazil or Germany a chance to win the 2022 World Cup (Gondwana University website). Data scientists also used Poisson distribution models in Python, one example being Frank Andrade’s prediction of a win for Brazil. Data scientists Ronnie Das and Wasim Ahmed published an article in The Conversation titled “World Cup 2022: crunching 150 years of big data to predict the winner”, in which they predicted a win for France at 55.3% against Argentina (44.7%). These models tend to predict the winners but hardly relate the percentage differences to the degree of closeness in the competition or to the goal differences to expect.
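To illustrate the Poisson approach mentioned above, here is a minimal Python sketch. It is not Frank Andrade’s code or any published model: the function names are hypothetical, the expected-goal rates (1.6 and 1.1) are made-up numbers, and the two teams’ goal counts are assumed to be independent.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when goals follow Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def match_probabilities(lam_a: float, lam_b: float, max_goals: int = 10):
    """Return (P(A wins), P(draw), P(B wins)).

    Sums over all scorelines with at most max_goals per team, assuming
    the two goal counts are independent (an illustrative assumption).
    """
    win_a = draw = win_b = 0.0
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, lam_a) * poisson_pmf(goals_b, lam_b)
            if goals_a > goals_b:
                win_a += p
            elif goals_a == goals_b:
                draw += p
            else:
                win_b += p
    return win_a, draw, win_b


# Made-up expected-goal rates, for illustration only.
p_a, p_draw, p_b = match_probabilities(1.6, 1.1)
print(f"A wins: {p_a:.3f}, draw: {p_draw:.3f}, B wins: {p_b:.3f}")
```

A model of this kind yields win, draw, and loss probabilities, but by itself it says little about how close the contest is expected to be, which is the gap noted above.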
Unique Features of the AWCuPreM Model
Named AWCuPreM, this mathematical prediction model uses mean marginal percentages based on nine variables to predict the winner for each match: climate, resistive nucleus, tactical inventiveness, honed skillset, serendipity stroke, score drive, mentality premium, team coherence, and tenacity gradient. The nine variables are quantifiable derivatives of a systems-thinking approach, which works from the whole/big picture to parts/granular levels, a sharp contrast to the common regression models that depend heavily on historical precedent and related AI-based simulations. Systems thinking empowers a modeller to apply quantitative thinking – the ability to quantify abstract aspects by mathematically modelling relationships between causes and effects instead of pursuing the utopian goal of “measurement thinking”, which seeks in vain for the elusive “perfectly” measured data. Yes, we can use a scale to quantify levels such as reputation, passion, enthusiasm, or trust, among others, as Prof. Jay Forrester confirmed in his lectures as the pioneer of System Dynamics.
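The article does not publish the AWCuPreM formula, so the following Python sketch only illustrates what a “mean marginal percentage” over the nine variables could look like. Everything beyond the variable names is an assumption: the 0 to 10 scoring scale, the per-variable margin 100 × (a − b) / (a + b), the simple average, and the example scores are all hypothetical.

```python
# The nine variable names come from the article; the rest is assumed.
VARIABLES = (
    "climate",
    "resistive nucleus",
    "tactical inventiveness",
    "honed skillset",
    "serendipity stroke",
    "score drive",
    "mentality premium",
    "team coherence",
    "tenacity gradient",
)


def mean_marginal_percentage(team_a: dict, team_b: dict) -> float:
    """Average percentage margin of team A over team B across all variables.

    Each team maps a variable name to a score on an assumed 0-10 scale.
    A positive result favours team A, a negative one favours team B.
    """
    margins = []
    for name in VARIABLES:
        a, b = team_a[name], team_b[name]
        total = a + b
        # If both scores are zero, treat the variable as perfectly balanced.
        margins.append(0.0 if total == 0 else 100.0 * (a - b) / total)
    return sum(margins) / len(margins)


# Hypothetical scores: team A edges team B on two variables only.
team_a = {name: 7.0 for name in VARIABLES}
team_b = dict(team_a, **{"score drive": 6.5, "team coherence": 6.8})
print(f"Mean marginal advantage: {mean_marginal_percentage(team_a, team_b):.2f}%")
```

Under these assumptions, two teams with identical scores get a margin of exactly zero, and swapping the teams flips the sign of the margin.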
The main working knowledge AWCuPreM applies draws on probability and statistics with mathematical goal expectations (xG), permutations, set theory, geometry and spatial thinking, numerical analysis, weighted multicriteria decision analysis, projectile motion, and multiple reflection matrices, among others – complemented by common sense. As a mentorship opportunity, young Kenyan graduates are selected to participate in data collection to inform current strengths and weaknesses in the team compositions, but with some reflections on evidence-based historical precedent. Advanced modelling expertise and at times some expert elicitation surveys then help in parameterising the model for each match.
Though the AWCuPreM model appreciates historical precedent, it does not allow history to enslave its approach but instead masters a selective application of history to the relevant cases and variables only. For example, it is appreciated that Pele, Maradona, Ronaldo, Podolski, Gotze, Klose, and so on were great players, but does that translate to practical advantages for the current team compositions? Again, ball possession is not an assurance of efficiency in terms of shot conversion into goals, yet goals are the accounting units that matter as whole numbers irrespective of the glamour of the incoming vector the ball traces into the net. Choosing crucial variables, therefore, deserves top priority.
In 2022, AWCuPreM was recalibrated against 24 group-stage matches and enhanced to predict goal differences for three scenarios during each match: business as usual (BAU), surprising scenario (SS), and augmented favour scenario (AFS). The model has established the following mathematical inequalities to predict a draw or a range of goal differences for the entire match, however long it takes – up to penalty shootouts.
AWCuPreM presents a unique value proposition to soccer lovers, which sets it apart from other prediction models as follows:
- It gives three simulations for each match to contain the predictions within a narrow bandwidth of expected scenarios.
- It gives quantifiable margins whose magnitudes reflect the degree of closeness of the fight between the teams, hence preparing fans to watch with expectation.
- It uses established mathematical inequalities to predict a draw or goal difference within the normal match time, and the eventual goal difference if the game goes all the way into extra time and/or penalty shootouts.
- It has taken care of lucky chances under the variable named serendipity, which has been determined to be applicable to the World Cup 2022 in the ratio 7:3 for the battling teams.
By working with well-defined variables characterising football dynamics in the pitch as a probability space accommodating random variables, the model is an improvement from the common “black-box” regression models to a “grey-box” model, complete with a prediction of goal difference within a compact bandwidth of three scenarios for any match.
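The interplay between the three scenarios and the 7:3 serendipity ratio can be sketched in Python as follows. This is a hedged illustration, not the model’s actual inequalities, which are not reproduced here. It assumes that serendipity is balanced in the BAU scenario, tilts 7:3 towards the opponent in SS and 7:3 towards the favoured team in AFS, and that the nine variables are weighted equally; the function names are hypothetical.

```python
SERENDIPITY_RATIO = (7, 3)  # ratio stated in the article
N_VARIABLES = 9             # number of model variables stated in the article


def serendipity_margin(favoured_share: float, other_share: float) -> float:
    """Percentage margin implied by a luck split such as 7:3 (assumed formula)."""
    return 100.0 * (favoured_share - other_share) / (favoured_share + other_share)


def scenario_margins(margin_other_eight: float) -> dict:
    """Mean marginal percentage under the BAU, SS and AFS scenarios.

    margin_other_eight is the favoured team's average margin (in %) over
    the eight non-serendipity variables. Equal weighting is an assumption.
    """
    high, low = SERENDIPITY_RATIO
    swing = serendipity_margin(high, low)  # 40.0 for a 7:3 split

    def blend(serendipity: float) -> float:
        return ((N_VARIABLES - 1) * margin_other_eight + serendipity) / N_VARIABLES

    return {
        "BAU": blend(0.0),     # luck evenly shared (assumption)
        "SS": blend(-swing),   # luck tilts 7:3 towards the opponent
        "AFS": blend(swing),   # luck tilts 7:3 towards the favoured team
    }


for name, margin in scenario_margins(0.5).items():
    print(f"{name}: {margin:+.2f}%")
```

With a thin base margin, the sign of the result depends almost entirely on which side serendipity favours, which is why a narrow band of three scenarios is informative for close matches.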
Realised Prediction Success Rate
Overall, the success rate of AWCuPreM was 70% for all the 24 group matches the model was applied to, 80% for 20 group matches after excluding four group matches whose results were total surprises to anyone (unglaublich as Germans would put it, or lich according to a Kenyan dialect), 100% for the knockout stages, 90% for the quarterfinals, 100% for the semifinals, and 100% for the final match of 18th December 2022.
Twitter spaces on the match predictions were organised by the model developer and moderated by a journalist from Kenya’s Nation Media Group, then co-moderated by TTU students and alumni as well as alumni of other Kenyan universities. One interesting Twitter space session co-moderated by Imelda Nasubo, an alumna of TTU’s School of Mines and Engineering, declared that one would be safer from a heart attack by banking hopes on Argentina instead of France. Because France enjoyed massive support in Kenya, many people waited to prove the model prediction wrong.
Using Argentina’s mean marginal percentage against France and a multiple reflection matrix, AWCuPreM predicted that Argentina was better placed to win the 2022 World Cup, albeit on a slippery and razor-thin margin of 0.5%. That margin meant a draw during the normal match time and a goal difference of 2-3 for Argentina in the AFS scenario, and it happened as shown in the graph!
Here is an excerpt from AWCuPreM as posted on 13th December 2022:
Mathematician’s model suggesting Argentina is hanging on a slippery and razor-thin 0.5% mean marginal advantage over France in the World Cup Final. Based on the model’s goal conversion factor, this isn’t enough for a goal difference and points to a draw within normal match time. If serendipity swaps in favour of France in the established model ratio of 7:3, penalties included, then France can be sure of a goal difference of 2 and at most 3 over Argentina. Finally, the worst scenario that will stun France is where serendipity favours Argentina, hence swapping that 2-3 goal difference in favour of Argentina.
The multiple reflection matrix below was applied to the four teams declared for the semifinals, treating them as near-equals (equiprobable chances) because of the sieving process that narrows the margins towards the final stage. Again, Argentina convincingly emerged the best placed team to win with positions 1, 4, or 1 in the possible combinations against the three model scenarios with their respective mean marginal percentages.
Lessons and Policy Recommendations
Innovative and creative application of models can help turn sports and similar extracurricular activities into opportunities for popularising STEM subjects. AWCuPreM has excited a great passion for mathematics and sciences among young Kenyan students and will continue to find key applications during STEM-oriented youth mentorship sessions. The World Cup is strategic in influencing young learners to develop a deep passion for mathematics and applied sciences, but scholars must rise to the challenge of developing models that can demonstrate the triumph of scientific thinking at narrowing uncertainties and debunking guesswork.
With advances in big data, 5G, AI, IoT, and so on, scholars need to come together and develop more accurate and visually engaging models for the 2026 World Cup and beyond. Science has huge opportunities in preparing teams to practice with precision and play to win football matches. To researchers on a mission, the randomness that has seen many people refer to soccer as a game of chance can only provide rich ground for applying modelling and probability to narrowing uncertainties towards laser-like predictions.
At the policy level, the model advances the need for parametric policymaking in matters of sports – parametric platforms for decision-making as opposed to the tradition of relying on broad qualitative expressions of need. It is now evident that African nations such as Kenya that are keen on featuring their teams at the World Cup by 2030 must ensure serious scientific representation on the advisory boards of their Ministries in charge of sports. For Germany, it is high time determined efforts were made to restore the nostalgic glory, so as not to disappoint the growing body of supporters under the global DAAD footprint at the 2026 World Cup.
Further reading links: https://bit.ly/3uPDleJ
Quite a refreshing read! I like the fact that this highlights the fact that the science we do has relevance in day-to-day lives. Keep up the good work!
Thank you, Alice! Let’s proceed with the advocacy.