ScoutAtlas
Closed beta
Engine 02 · Vision

A 90-minute match,
scouted in three.

Vision is a computer-vision pipeline that watches a match the way a great scout would — except in three minutes, in the same format every time, on every player on the pitch.

In one sentence

Drop in a match link.
Get a structured scouting report — and the highlight reel — in less time than it takes to make a coffee.

Most clubs spend 4–6 hours per match per scout on video review. Vision does the first 80% of that work — tracking, event detection, role-adjusted clipping — so your scout starts with the questions, not the prep.

The output is deliberately standardized. Every Vision report follows the same structure: identification, role, in-possession, out-of-possession, set-pieces, psychometrics inferred from body language, and a flagged-moments index. Your reports become comparable. Finally.
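To illustrate what a fixed-structure report enables, here is a minimal sketch of such a schema. The field names mirror the sections listed above but are hypothetical, not Vision's actual format:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a fixed-structure scouting report; section
# names follow the template described above, not Vision's real schema.
@dataclass
class ScoutReport:
    player: str
    role: str                      # role scouted for, e.g. "6" or "10"
    in_possession: dict = field(default_factory=dict)
    out_of_possession: dict = field(default_factory=dict)
    set_pieces: dict = field(default_factory=dict)
    psychometrics: dict = field(default_factory=dict)    # inferred from body language
    flagged_moments: list = field(default_factory=list)  # timestamped index

    def sections(self):
        # Every report exposes the same sections in the same order,
        # so two reports are directly comparable field by field.
        return ["in_possession", "out_of_possession",
                "set_pieces", "psychometrics", "flagged_moments"]
```

Because the section order never changes, comparing twelve reports is a column-by-column read rather than twelve different documents.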

Vision · sample output
Match: Vitória SC vs FC Porto · 18 Apr · 90'
● Tracking · 22 players · 71'14

events.detected
71'02  progressive_pass       #10 → #9
71'08  turnover               #10 / press
71'14  shot_attempt · GOAL    #10 / RIGHT_FOOT
74'21  high_press_recovery    #10 / 6.2 m
83'39  second_assist          #10 → #7 → #9
Capabilities

What Vision sees that the human eye misses.

Player tracking

Every touch, every off-ball run, every defensive recovery — across all 22 players, full match.

Pose estimation

First-touch quality, body-shape on receiving, scanning frequency before touches.

Event detection

Progressive passes, line-breaking carries, press triggers, second balls — labelled and timestamped.
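As an illustration of how an event label like "progressive pass" can be derived from tracking data, here is a common rule-of-thumb definition: a pass that moves the ball meaningfully closer to the opponent's goal. The threshold and pitch model are illustrative assumptions, not Vision's actual rules:

```python
# Hedged sketch: one way to label a progressive pass from coordinates.
PITCH_LENGTH = 105.0  # metres, attacking left -> right
PITCH_WIDTH = 68.0

def distance_to_goal(x: float, y: float) -> float:
    # Straight-line distance to the centre of the opponent goal.
    return ((PITCH_LENGTH - x) ** 2 + (PITCH_WIDTH / 2 - y) ** 2) ** 0.5

def is_progressive_pass(start, end, min_gain=10.0):
    # A pass counts as progressive if it moves the ball at least
    # `min_gain` metres closer to the opponent goal.
    return distance_to_goal(*start) - distance_to_goal(*end) >= min_gain
```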

Auto-clipped reels

Per-player highlight + lowlight reel, generated from the events the player actually drove.
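The mechanics of auto-clipping are simple to sketch: take the timestamps of a player's events and cut a window around each, merging windows that would overlap into one clip. Padding values here are illustrative assumptions:

```python
# Illustrative sketch: turn a player's event timestamps (in seconds)
# into clip windows, merging events that would produce overlapping cuts.
def clip_windows(timestamps, pre=5.0, post=5.0):
    windows = []
    for t in sorted(timestamps):
        start, end = max(0.0, t - pre), t + post
        if windows and start <= windows[-1][1]:
            # Overlaps the previous clip: extend it instead of cutting twice.
            windows[-1] = (windows[-1][0], end)
        else:
            windows.append((start, end))
    return windows
```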

Role context

Reports adjust expectations to the role you’re scouting for — a #6 isn’t graded against a #10.
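Role adjustment can be pictured as re-weighting the same underlying metrics per target role. The weights below are invented for illustration and are not Vision's actual grading model:

```python
# Hypothetical role weights: identical raw metrics, graded differently
# depending on the role being scouted (values are illustrative only).
ROLE_WEIGHTS = {
    "6":  {"progressive_pass": 0.5, "press_recovery": 0.4, "shot": 0.1},
    "10": {"progressive_pass": 0.4, "press_recovery": 0.1, "shot": 0.5},
}

def role_adjusted_score(metrics: dict, role: str) -> float:
    # Weighted sum of normalised per-90 metrics for the target role.
    weights = ROLE_WEIGHTS[role]
    return sum(weights[k] * metrics.get(k, 0.0) for k in weights)
```

The same ball-winning midfielder scores well as a #6 and poorly as a #10, which is the point: expectations move with the role.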

Cross-match consistency

Every report follows the same template, so a scout can compare 12 games in a sitting.

Inside the pipeline

The stack that watches the match.

Detection · YOLOv11

State-of-the-art object detection trained on football-specific datasets (SoccerNet, custom labelled clips). Real-time on a single GPU.
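Detectors in the YOLO family all rely on the same post-processing step, non-maximum suppression, to collapse overlapping raw boxes into one detection per player. A standard sketch of that step (thresholds illustrative, not Vision's internals):

```python
# Standard non-maximum suppression: keep the highest-scoring box,
# drop any lower-scoring box that overlaps a kept one too much.
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, thresh=0.5):
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep  # indices of surviving boxes
```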

Tracking · ByteTrack

Multi-object tracking with re-identification. Players keep their IDs through occlusions, substitutions, and camera cuts.
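The core idea behind ByteTrack-style association can be sketched in a few lines: match high-confidence detections to existing tracks first, then give low-confidence detections a second chance against whatever tracks remain. Greedy IoU matching stands in here for the real Hungarian assignment, and all thresholds are assumptions:

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def associate(tracks, detections, high=0.6, match_thresh=0.3):
    # tracks: {track_id: last_box}; detections: [(box, score), ...]
    # Two-stage association: high-confidence detections first,
    # then the low-confidence leftovers against unmatched tracks.
    assigned, free = {}, dict(tracks)
    for high_pass in (True, False):
        for box, score in detections:
            if (score >= high) != high_pass:
                continue
            best = max(free, key=lambda t: iou(free[t], box), default=None)
            if best is not None and iou(free[best], box) >= match_thresh:
                assigned[best] = box
                del free[best]
    return assigned  # track_id -> new box; unmatched tracks coast
```

Keeping the low-score pass is what carries a player's ID through partial occlusions, where the detector's confidence dips but the box is still roughly right.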

Pose · YOLOv8-Pose

Keypoint estimation for body-shape, first-touch quality, and scanning. Seventeen keypoints per player per frame (COCO skeleton).
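One metric downstream of pose is scanning frequency: how often a player's head swings in the seconds before a touch. A minimal sketch, assuming a per-frame head-yaw angle derived from the keypoints (window and swing thresholds are illustrative):

```python
# Illustrative: count "scans" as head-yaw swings larger than
# `min_swing` degrees; thresholds are assumptions, not Vision's values.
def count_scans(head_yaw, min_swing=30.0):
    if len(head_yaw) < 2:
        return 0
    scans, ref = 0, head_yaw[0]
    for yaw in head_yaw[1:]:
        if abs(yaw - ref) >= min_swing:
            scans += 1
            ref = yaw
    return scans

def scans_before_touch(head_yaw, touch_frame, fps=25, window_s=2.0):
    # Scans in the `window_s` seconds before a touch.
    start = max(0, touch_frame - int(window_s * fps))
    return count_scans(head_yaw[start:touch_frame])
```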

Action heads · CNN+LSTM

Spatio-temporal classifier trained on annotated event sequences from StatsBomb and SoccerNet. Outputs frame-level labels at 25 FPS.
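A classifier emitting labels at 25 FPS still needs a step that turns per-frame labels into the timestamped events shown in the sample output. A sketch of one such post-processing pass (run-length grouping with a minimum duration; parameters are assumptions):

```python
# Sketch: collapse per-frame labels (25 FPS) into timestamped event
# spans, dropping blips shorter than `min_frames`. Illustrative only.
def frames_to_events(labels, fps=25, min_frames=5, background="none"):
    events, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != background and i - start >= min_frames:
                events.append((labels[start], start / fps, i / fps))
            start = i
    return events  # list of (label, start_seconds, end_seconds)
```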

Re-ID · Siamese networks

Maintains identity across camera angles and broadcasts. Lets us merge multiple sources into a single timeline.
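The matching step behind re-identification reduces to comparing appearance embeddings: a Siamese network maps each crop to a vector, and a query is assigned to the gallery identity with the highest similarity. A minimal sketch, with the embedding step assumed and the threshold invented:

```python
import math

# Sketch of re-ID by embedding distance. A Siamese network would
# produce the vectors; here they are given as plain lists.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def reidentify(query, gallery, min_sim=0.7):
    # gallery: {player_id: stored embedding}. Best match or None.
    best = max(gallery, key=lambda pid: cosine(query, gallery[pid]))
    return best if cosine(query, gallery[best]) >= min_sim else None
```

Because the comparison is in embedding space rather than pixel space, the same player can be matched across a broadcast feed and a tactical cam, which is what lets the two sources merge into one timeline.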

Tactical layer · GNN

Graph neural network over player positions to infer formations, line-shape, and pressing triggers.
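The GNN itself is beyond a short sketch, but the output it targets can be illustrated with a simple heuristic swapped in for it: group the ten outfield players into lines whenever the gap in their depth coordinate exceeds a threshold, and read the formation off the line sizes. Entirely illustrative, not Vision's method:

```python
# Heuristic stand-in for the tactical layer's output: infer a
# formation string by splitting outfield players into lines wherever
# the depth gap exceeds `line_gap` metres.
def formation(depths, line_gap=12.0):
    depths = sorted(depths)  # distance from own goal line, metres
    lines, current = [], [depths[0]]
    for d in depths[1:]:
        if d - current[-1] > line_gap:
            lines.append(current)
            current = []
        current.append(d)
    lines.append(current)
    return "-".join(str(len(line)) for line in lines)
```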

See Vision live

Send a match link. We’ll send the report.

During the pilot, we run Vision live for any candidate club. You send a public match link — broadcast feed or tactical cam — and we send back the auto-report and clipped reels for any player you name.