How we measure “stylistic fit” without overfitting nostalgia
Match clusters players by behaviour, not biography. The math behind why “a left-back like Marcelo” is a useful comparison and how we keep it honest.
By Match Engine team
“We need a left-back like Marcelo.” It’s a useful sentence and a dangerous one. Useful because it instantly conveys a profile every football brain in the room can picture. Dangerous because it can mean five different things to five different listeners — and worse, it can mean nothing precise to a search engine.
Match, our recommendation engine, lives at the boundary between those two truths. It has to translate fuzzy operator language into rigorous, comparable, defensible rankings — without flattening what makes a player distinct. Here is how we do it without overfitting nostalgia.
The three-layer model
Stylistic fit at Scout Atlas is not a single similarity score. It is a stack of three independent layers, each computed nightly, each explained in plain English alongside the result.
Layer 1 — Behavioural fingerprints
For every player with at least 900 league minutes in the last two seasons, we compute a 200-dimensional behavioural vector. Not raw stats. Behavioural derivatives: progressive carry distance per touch, defensive zone activity adjusted for opposition strength, scanning frequency before progressive passes, post-loss recovery distance.
These are the features that survive normalisation across leagues. Seventy minutes in a League of Ireland match isn’t the same canvas as seventy in the Premier League — so we normalise opportunities, not outcomes. The fingerprint compares behaviour at parity.
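The opportunity-normalisation idea fits in a few lines. A minimal sketch with invented numbers and names, not Scout Atlas code:

```python
# Minimal sketch of per-opportunity normalisation (illustrative data, not production code).
# Instead of dividing a raw total by minutes played, each behavioural feature
# is expressed as a rate over the opportunities the player actually had.

def per_opportunity(total: float, opportunities: int) -> float:
    """Rate per opportunity; returns 0.0 when the player had no opportunities."""
    return total / opportunities if opportunities else 0.0

# Two hypothetical left-backs: one in a possession-heavy side, one starved of the ball.
player_a = {"progressive_carry_m": 420.0, "touches": 600}  # busy side
player_b = {"progressive_carry_m": 210.0, "touches": 300}  # half the touches

rate_a = per_opportunity(player_a["progressive_carry_m"], player_a["touches"])
rate_b = per_opportunity(player_b["progressive_carry_m"], player_b["touches"])

# Raw totals would make player_a look twice as progressive;
# per-touch rates show identical behaviour given the chances each had.
assert rate_a == rate_b == 0.7
```

The same pattern applies to every fingerprint dimension: the denominator is always a count of chances, never a count of minutes.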
Layer 2 — Role context
“A left-back like Marcelo” isn’t just a behavioural shape. It’s a behavioural shape in a system. We tag every match in our corpus with the player’s implied role (inverted full-back, classical full-back, wing-back in a back five, hybrid wide centre-back) using a graph-based formation classifier. Stylistic similarity is then computed conditional on role — so a Bayern hybrid is compared against other hybrids, not against an Atalanta wing-back.
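Role conditioning is easy to sketch: compute similarity only inside the query player's role cohort. The vectors and role tags below are invented for illustration:

```python
# Hedged sketch of role-conditional similarity (hypothetical players and vectors).
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

players = {
    "marcelo":   {"role": "hybrid_fullback", "vec": [0.9, 0.8, 0.3]},
    "candidate": {"role": "hybrid_fullback", "vec": [0.85, 0.75, 0.35]},
    "wingback":  {"role": "wingback_back5",  "vec": [0.9, 0.8, 0.3]},  # same shape, wrong role
}

def stylistic_peers(query, corpus):
    """Rank peers by cosine similarity, conditional on sharing the query's role tag."""
    q = corpus[query]
    pool = [(name, cosine(q["vec"], p["vec"]))
            for name, p in corpus.items()
            if name != query and p["role"] == q["role"]]
    return sorted(pool, key=lambda t: -t[1])

peers = stylistic_peers("marcelo", players)
# The wing-back is excluded despite an identical vector: similarity is role-conditional.
```

The point of the filter is visible in the toy data: an identical behavioural vector in the wrong role never enters the comparison pool.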
Layer 3 — Decision signature
The third layer is the most experimental and the one we’re most excited about. We train a sequence model on labelled decision points — receive-under-pressure, defensive-press-trigger, transition-spring — and produce a probability distribution over decision classes for each player. The decision signature captures what a player tends to do when given a choice. Two players with identical behavioural fingerprints can have completely different decision signatures, and the signature usually predicts how the player adapts to a new system.
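Because each signature is a probability distribution over decision classes, two signatures can be compared with a standard divergence. One reasonable choice is Jensen-Shannon divergence; the class names and distributions below are invented, and the production distance may differ:

```python
# Hedged sketch: comparing decision signatures as probability distributions.
import math

CLASSES = ["carry", "short_pass", "switch", "clearance"]  # illustrative decision classes

def kl(p, q):
    """Kullback-Leibler divergence in bits (terms with p_i = 0 contribute nothing)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two players with similar fingerprints but different tendencies on the ball:
risk_taker   = [0.45, 0.30, 0.20, 0.05]  # carries and switches under pressure
safety_first = [0.10, 0.55, 0.05, 0.30]  # recycles or clears

d = js_divergence(risk_taker, safety_first)
assert 0.0 < d <= 1.0  # clearly non-zero: identical fingerprints, different signatures
```

A symmetric, bounded divergence matters here: the distance from A to B should equal the distance from B to A, and scores should be comparable across pairs.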
Three things we explicitly don’t do
Every recommendation engine is shaped by what it refuses to do. Match has three firm refusals.
- We don’t train on private member-club data without consent. The fingerprints come from public open-data corpora and licensed event data. Member clubs’ private notes, GPS, and shortlists are theirs — they enrich a club’s personal model, not the cross-club one.
- We don’t hide the leagues a brief covered. If a brief filtered to the top 5, we say so on every result. If a player wasn’t included, we tell you why (insufficient minutes, league not yet ingested).
- We don’t pretend a 60%-confidence ranking is a 95. When the ensemble disagrees — XGBoost likes a player, CatBoost is unsure — we flag the variance directly. Low confidence is itself a signal worth surfacing.
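The variance flag in the last refusal can be sketched directly. Model names, scores, and the spread threshold below are all illustrative:

```python
# Hedged sketch: surfacing ensemble disagreement instead of hiding it.
import statistics

def confidence_flag(scores, max_spread=10.0):
    """Report the ensemble mean alongside its spread; flag when members disagree."""
    values = list(scores.values())
    spread = statistics.pstdev(values)  # population standard deviation across members
    return {
        "mean": round(statistics.mean(values), 1),
        "spread": round(spread, 1),
        "low_confidence": spread > max_spread,
    }

# One model likes the player, another is unsure: the variance itself is surfaced.
result = confidence_flag({"xgboost": 82.0, "catboost": 55.0, "lgbm": 70.0})
assert result["low_confidence"] is True
```

When the members agree, the same function returns a tight spread and no flag; the output shape is identical either way, so the UI can always show both numbers.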
How we keep the comparisons honest
Two safeguards run alongside every Match score.
The first is survivor bias correction. Football media gravitates to winners. Behavioural similarity to a famous player can be a dangerous proxy — it’s a great filter for picking up on retrospective genius and a poor one for predicting future fit. We explicitly rebalance training cohorts to include the “noisy middle”: players who looked like a star and didn’t become one.
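One simple way to do that rebalancing is inverse-frequency weighting, so each outcome class carries equal total weight in training. A sketch with invented cohort sizes and labels:

```python
# Hedged sketch of cohort rebalancing via inverse-frequency weights
# (labels and counts are illustrative, not the production training set).
from collections import Counter

# A winner-skewed cohort: famous successes dominate the raw data.
cohort = (["star"] * 80) + (["noisy_middle"] * 15) + (["faded"] * 5)

def inverse_frequency_weights(labels):
    """Per-label sample weights so each class contributes equal total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {label: total / (n_classes * count) for label, count in counts.items()}

weights = inverse_frequency_weights(cohort)
# The under-represented "noisy middle" is upweighted relative to the stars,
# so the model also learns from players who looked the part and faded.
assert weights["noisy_middle"] > weights["star"]
```

Any weighted learner can consume these as sample weights; the effect is that "looked like a star" stops being a near-guarantee of the positive label.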
The second is cohort calibration. We test the model not on the Premier League golden child, but on the Allsvenskan winger nobody had heard of in 2021 who is now a Bundesliga regular. If the model couldn’t have surfaced him with high confidence in 2021, we go back to the drawing board. Most “similarity” engines celebrate the players they predicted; we measure ours by the players they missed.
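The calibration test amounts to a time-sliced backtest: freeze the data at a cutoff and ask whether the model would have surfaced the player then. A toy sketch; the identifier, features, threshold, and stand-in model are all hypothetical:

```python
# Hedged sketch of a time-sliced backtest: score a later-confirmed player
# using only data available at the cutoff (all names and values hypothetical).

def backtest(model_score, case):
    """Pass only if the frozen-in-time model would have surfaced the player confidently."""
    score = model_score(case["features_at_cutoff"])
    return score >= case["required_confidence"]

# A breakout the model must have been able to find *before* the breakout:
case = {
    "player": "allsvenskan_winger_2021",    # hypothetical identifier
    "features_at_cutoff": [0.7, 0.9, 0.6],  # fingerprint as of the 2021 cutoff
    "required_confidence": 0.75,
}

toy_model = lambda feats: sum(feats) / len(feats)  # stand-in for the real ensemble
passed = backtest(toy_model, case)
# passed is False here: this model misses the breakout, so it goes
# back to the drawing board rather than into production.
```

The discipline is in the direction of the test: the model is scored on the players it should have found, not on the ones it happened to find.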
What you actually see in the product
When you open a player in Scout Atlas, “Stylistic peers” shows the top six players across our corpus by combined fingerprint + role + decision similarity, with a feature-attribution breakdown for each pair: where the similarity is concentrated, where it diverges. You see the comparison and the limits of the comparison.
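A minimal sketch of how the three layers might combine into one score with a per-layer attribution breakdown. The weights are illustrative, not the production mix:

```python
# Hedged sketch: combining fingerprint, role, and decision similarity into one
# score, with an attribution breakdown (weights are illustrative).

WEIGHTS = {"fingerprint": 0.5, "role": 0.2, "decision": 0.3}

def combined_similarity(layer_scores):
    """Weighted sum of per-layer similarities, plus each layer's share of the total."""
    contributions = {k: WEIGHTS[k] * layer_scores[k] for k in WEIGHTS}
    total = sum(contributions.values())
    # Attribution: where the similarity is concentrated, and where it diverges.
    attribution = {k: round(v / total, 2) for k, v in contributions.items()}
    return {"score": round(total, 3), "attribution": attribution}

pair = combined_similarity({"fingerprint": 0.9, "role": 1.0, "decision": 0.4})
# High overall similarity driven by fingerprint and role; the decision
# signature is the layer where this pair diverges.
```

The attribution dictionary is what makes the comparison auditable: a scout sees not just "these two are alike" but which layer is doing the work.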
“A left-back like Marcelo” becomes useful again — but you no longer have to take it on faith. The math is on the page.
Keep reading
The transfer window is broken — and the tools made it worse
Why a market with €7B annual flow still runs on Excel, WhatsApp, and gut feel. And what changes when the data layer catches up.
Agents as network, not noise
Why filtering agents out is the lazy answer — and what changes when you verify the integrity ones and price out the unverified middle.
If this resonated, the next move is a conversation.
We onboard pilot members on rolling invitation. Send us your hardest question — we’ll send back the live answer.