VEX AI Robotics Competition Enhancement

01

Problem

Autonomous robots that cannot perceive their environment accurately cannot make good decisions — regardless of how sophisticated their decision-making algorithms are. The VEX AI Competition's existing robot platform includes GPS, an AI Vision System, and a Sensor Fusion Map, but all three share a critical limitation: they only capture what is directly in the robot's forward field of view. Objects that move out of that cone are frozen at their last known position in the map, which degrades rapidly in a dynamic match environment where game elements and opposing robots move continuously. More critically, the system has no concept of the Z-axis — it cannot determine how high it or another robot has elevated on the field's Elevation Bars, which directly determines end-of-match scoring. An AI system bounded by what it can observe will always make decisions that are, at best, as good as its sensory model — and this platform's sensory model was fundamentally incomplete.

02

Solution

This Georgia Tech CS 6675 research paper proposes a concrete hardware and software architecture that extends the VEX AI robot's spatial awareness beyond its current field-of-view limitations. The baseline design adds SpotFi — a WiFi-based localization algorithm — for precise real-time positioning without GPS drift, and an RGB-D depth camera for 3D environment reconstruction that captures full spatial geometry including elevation. A design refinement phase extended the single-robot architecture to a peer-to-peer decentralized coordination network, enabling multiple robots to share sensor data in real time and maintain a shared world model that no individual robot could construct alone. Evaluation metrics, failure modes, and degradation pathways were analyzed to give the proposed system a realistic engineering assessment rather than a theoretical one. The paper frames the improvements as actionable, testable enhancements grounded in existing technology — not speculative research.

03

Skills Acquired

VEX AI — the autonomous robotics competition platform that provided the system under analysis. Understanding the VEX AI stack — its GPS module, AI Vision System, and Sensor Fusion Map — was prerequisite to identifying where its perception model breaks down and what hardware additions would meaningfully address the gaps.
RGB-D Mapping — the use of RGB-D (color plus depth) cameras to reconstruct 3D spatial geometry in real time. RGB-D data provides the Z-axis awareness the existing vision system lacks, enabling the robot to track object elevation, assess Elevation Bar occupancy, and build a volumetric map of the field rather than a 2D projection.
SpotFi — a WiFi Channel State Information-based localization algorithm that estimates device position from signal propagation patterns across access points. SpotFi was proposed as the primary localization layer because it operates without line-of-sight requirements and achieves sub-meter accuracy in indoor environments where GPS signals degrade.
NVIDIA Jetson Nano — the embedded computing platform proposed to handle the onboard computer vision and depth processing workload. The Jetson Nano's GPU-accelerated inference capabilities make it appropriate for real-time RGB-D processing at the edge without requiring a connection to external compute resources.
P2P Networks — the decentralized coordination architecture proposed in the design refinement phase. A peer-to-peer network between robots allows each robot to share its local sensor observations with teammates, building a shared world model that reflects the full field state rather than each robot's limited individual perspective.
Sensor Fusion — the integration of data from multiple sensor modalities (GPS, vision, depth, WiFi localization) into a unified spatial representation. Sensor fusion is the technique that makes the proposed architecture work: no single sensor provides complete spatial information, but the combination addresses all identified gaps in the existing system.

What makes this architecture meaningful is not any single component — it is how they address specific, documented failure modes in the existing system.

04

Deep Dive

The VEX AI Robotics Competition pits fully autonomous robots against each other on a 12×12-foot field. No human drivers — just onboard AI making every decision in real time. The system already includes GPS sensors, an AI Vision System, and a Sensor Fusion Map. So what's the problem?

The Sensor Fusion Map only sees what's directly in front of each robot — like a spotlight. Objects that move out of that field of view are recorded at their last known position, not their current one. A robot repositioning itself toward a scoring element might arrive to find nothing there. And critically, the system has no concept of the Z-axis: it cannot determine how high it or another robot has elevated on the field's Elevation Bars, which directly impacts end-of-match scoring.

An AI system's decision-making ability is bounded by what it can observe. The VEX AI robot's spatial awareness was fundamentally limited — and this paper proposes a concrete hardware and software architecture to fix that.

The Problem Space

The VEX Robotics Competition (VRC) is the world's largest robotics competition according to the Guinness Book of World Records, with over 30,000 students competing. In 2019, VEX introduced an AI variant — the VEX AI Competition — where robots operate fully autonomously using onboard AI. Despite its ambition, the competition gained minimal popularity: the 2024 VEX AI World Championship hosted only 60 teams compared to 820 teams at the standard high school championship.

Figure 1. Metallic VEX V5 Robot at the VEX World Championship.

The game format is a 2-vs-2 match played over 2 minutes. Robots score by moving colored "Triballs" into goals and by elevating on their Alliance's Elevation Bars at the end — the higher the elevation, the more points. The existing VEX AI system architecture consists of three components:

GPS Sensor: Tracks the robot's 2D position (X, Y) on the field
AI Vision System: Recognizes and classifies field objects
Sensor Fusion Map: Combines GPS and vision data to build a timestamped map of object positions

The critical limitation: the Sensor Fusion Map's camera-based field of view is narrow and forward-facing. Objects that move after being timestamped are still recorded at their original position. The system also lacks any Z-axis awareness — it cannot measure its own elevation height relative to other robots when competing for Elevation Bar points.

VEX AI Dashboard showing the Sensor Fusion Map's limited spotlight field of view

Figure 2. The VEX AI Dashboard — the Sensor Fusion Map shows only what is directly in front of the robot, similar to a spotlight's field of view.

Need Finding: Three Match Scenarios

To scope the problem, I analyzed three scenarios at increasing complexity — all within a standard 2-minute 2-vs-2 match:

Scenario 1

Easy — Match Start

All robots and Triballs begin at fixed starting positions. No movement has occurred, so the Sensor Fusion Map's timestamped observations are completely accurate. The AI system performs at its best here because the world matches its internal model.

Starting areas for Red and Blue Alliance robots in a 2v2 VRC Over Under match

Figure 3. Starting areas for robots in a 2 vs. 2 VRC Over Under match — Red and Blue Alliance tiles dictate initial placement.

Scenario 2

Average — Midmatch (1 Minute In)

All robots and Triballs have moved significantly from their starting positions. Objects that moved out of a robot's field of view while it was looking elsewhere are now recorded at stale positions. The AI may navigate toward Triballs that are no longer there — missing scoring opportunities.

Overhead view of VRC match at the 1-minute mark showing scattered robots and Triballs

Figure 5. Movement of robots and Triballs halfway through the match — objects are far from their recorded starting positions.

Scenario 3

Hard — End of Match (Elevation Phase)

At the end of the match, robots must decide whether to attempt elevation on the Alliance's Elevation Bars. The height of elevation directly determines how many points are awarded. The existing system has no Z-axis sensor — it cannot know its own elevation height or compare it against a competing robot's elevation. This is a complete blind spot in the final, highest-stakes moments of every match.

Overhead view of VRC match at final seconds showing scattered field

Figure 6. Final positions of robots and Triballs at the end of the match — maximum complexity for the AI system.

A VEX robot elevated on its Alliance's Elevation Bar

Figure 7. A robot scored by elevating on the Alliance's Elevation Bar.

Example showing Red Alliance earning the most Elevation Points across four tiers

Figure 8. Elevation tier scoring — the higher the elevation, the more points awarded.

Baseline Design: SpotFi + RGB-D 3D Mapping

The root cause of all three scenarios is the same: the robot cannot see what it cannot see. My baseline design proposes adding two complementary technologies to the existing VEX AI system architecture:

Original VEX AI System Diagram showing V5 Brain, Jetson Nano, GPS Sensor, AI Vision System, and Intel RealSense Depth Camera

Figure 9. Original VEX AI System Diagram — the baseline architecture before proposed enhancements.

SpotFi — a WiFi-based decimeter-level localization algorithm developed at Stanford that dramatically improves positional accuracy by computing Angle of Arrival (AoA) from WiFi channel state information, without requiring additional hardware beyond the robot's existing V5 Radio
RGB-D Mapping — four Intel RealSense Depth Camera D457 units mounted on 10-foot posts at each corner of the field, continuously capturing 3D depth images of the entire playing surface in real time

Both are integrated with the NVIDIA Jetson Nano processor already embedded in the VEX AI system, which is purpose-built for edge AI inference.

RGB-D Cameras (×4 corners)
   ↓ Real-time 3D depth data (up to 6 m / 19.7 ft range)
NVIDIA Jetson Nano
   ↓ Runs SpotFi + RGB-D Mapping algorithms
Enhanced Sensor Fusion Map
   ↓ 3D field model with X, Y, Z coordinates
AI Vision System + GPS Sensor
   ↑ Combined: full spatial awareness + elevation detection

VEX AI System Diagram modified for Initial Baseline Design, adding SpotFi and RGB-D Mapping to the Jetson Nano

Figure 10. VEX AI System — Modified Architecture for Initial Baseline Design, integrating SpotFi and RGB-D Mapping.

Component	Role	Addresses
Intel RealSense D457 (×4)	3D depth capture of entire field	Stale object positions, Z-axis blindness
SpotFi Algorithm	Decimeter-level WiFi localization	GPS positional accuracy
NVIDIA Jetson Nano	Edge AI processing	Real-time computation of 3D map

Four RGB-D cameras stationed on 10-foot posts at each corner of the VRC field

Figure 11. RGB-D stations above all four field corners — full 3D coverage.

Two RGB-D cameras at opposite corners of the VRC field — cost-reduction alternative

Figure 12. Cost-reduction alternative: 2 cameras at opposite corners (~$998 vs. ~$1,996).

A cost-reduction alternative: using 2 RGB-D cameras at opposite corners rather than 4 reduces hardware cost from ~$1,996 to ~$998, at the expense of some 3D rendering coverage. Whether the cost savings outweigh the coverage reduction requires empirical evaluation.

Design Refinement: Peer-to-Peer Decentralized Coordination

The baseline design improves individual robot perception. The design refinement goes further: it makes all four robots on the field share what they see with each other, creating a distributed sensor network with no single point of failure.

VEX AI System Diagram modified for Design Refinement, adding P2P system with decentralized coordination between four robots

Figure 13. VEX AI System — Modified Architecture for Design Refinement, adding P2P decentralized coordination across all four robots.

Using the V5 Radio already on each robot, a Peer-to-Peer (P2P) system with decentralized coordination would allow each robot to broadcast its GPS position, Vision System observations, and Sensor Fusion Map data to all peers in real time. Each robot receives and integrates the field awareness of all four robots simultaneously.

Distributed computing: Data collection and processing load is spread across all four robots — faster throughput, no central bottleneck
Improved resilience: If one robot's sensors degrade, the other three continue to provide field awareness. No single failure cascades to the whole system
Full-field visibility: Combined with RGB-D cameras, the P2P network gives every robot access to 360° field awareness — not just its own forward-facing view
Better temporal decisions: Shared observations allow each AI to reason about where all objects are right now, not just where they were when last individually seen

Implementation requires downloading a P2P software application into VEXos (the V5 Brain's operating system). The application creates overlay networks, handles peer routing, distributes data, and achieves load balancing across all connected robots.

How to Evaluate It

To rigorously compare the baseline design against the design refinement, controlled experiments would be run across at least 50 identical 2-minute matches — same starting positions, same Triball placements, same field conditions. Three metrics would be tracked:

Metric 1

Gameplay Scoring Performance

Total points scored per match for each approach. Statistical trends would reveal whether the P2P refinement produces consistently higher scores across many runs — controlling for the inherent variability of dynamic gameplay.

Metric 2

CPU Runtime

Milliseconds required by the V5 Brain and Jetson Nano to execute specific tasks: navigating to a scoring goal, processing incoming sensor data, communicating with a peer robot via V5 Radio. Lower runtime indicates more efficient spatial reasoning.

Metric 3

3D Map Quality

Qualitative comparison of 3D renderings from the Intel RealSense D457 viewer — evaluating whether the P2P-enhanced system produces faster and more complete 3D field reconstructions than the baseline four-camera-only approach.

Intel RealSense D457 3D mapping rendering showing depth-captured scene

Figure 14. Intel RealSense D457 3D mapping rendering — used to qualitatively evaluate map quality across both designs.

Metric	Data Type	Storage
Match scores	Integer (points)	CSV
CPU runtime	Float (milliseconds)	CSV
Object positions	Float (X, Y, Z coordinates)	CSV + RAM (Jetson Nano)
3D spatial images	Depth image frames	Intel RealSense platform
Source code	VEXos application	V5 Brain + PC backup

System Resilience & Failure Modes

Any robust system design must account for component failure. For this architecture:

1–2 RGB-D cameras fail: Remaining cameras still provide adequate 3D coverage; match continues
3+ RGB-D cameras fail: 3D mapping is severely degraded; the system falls back to the original GPS + Sensor Fusion Map baseline
One robot's VEX AI System degrades: Under P2P, the three remaining robots continue sharing observations — partial degradation, not total failure. Without P2P, a single robot failure isolates completely
V5 Radio or WiFi antenna failure: P2P communication is severed for that robot; it operates on its own sensors only, reverting to baseline behavior

Security: no personal information is stored on the V5 Brain, Jetson Nano, or accompanying software. Compliance: all robots must conform to the official Over Under Game Manual rules at all times. Maintenance: all associated firmware and software must be kept up to date before each match.

Key Takeaways

The VEX AI system's core limitation is perceptual, not computational. The existing hardware is capable — it just cannot see the full field simultaneously or in 3D
RGB-D mapping solves the field-of-view problem by providing an always-on, overhead 3D view of the entire playing surface, independent of each robot's orientation
SpotFi improves localization accuracy at decimeter-level resolution using existing WiFi hardware — a high-value improvement with minimal added cost
P2P coordination is the multiplier. Individual robots with better sensors are good; four robots sharing a unified, real-time field model is fundamentally more capable
Graceful degradation matters in robotics. The layered fallback architecture ensures the system continues operating meaningfully even when individual components fail

What I Learned & Why It Matters to Employers

This project required thinking like a systems engineer: starting with a concrete failure mode, tracing it to a root cause (perceptual limitation), proposing a minimum-viable fix (SpotFi + RGB-D), and then designing a superior refinement (P2P coordination) with measurable evaluation criteria. The skills demonstrated here — need finding, system architecture, iterative design refinement, failure mode analysis — apply directly to any engineering role building complex, multi-component systems in the real world.

Conclusion & Reflections

The VEX AI Robotics Competition is an ambitious platform that pushes autonomous robotics into a competitive sports format. Its current limitations are not fundamental — they are engineering problems with engineering solutions. Adding 3D depth perception from four field-corner cameras, improving localization accuracy with SpotFi, and enabling P2P data sharing between robots creates a system where each robot's AI operates with full field awareness rather than a narrow forward-facing view.

The design refinement — P2P decentralized coordination — is particularly compelling because it transforms four independently operating robots into a collaborative network. The whole becomes greater than the sum of its parts: every robot benefits from every other robot's observations, and the system degrades gracefully rather than catastrophically when individual components fail.

Design Component	Problem Solved
SpotFi localization	Improved X/Y positional accuracy ✓
RGB-D 3D Mapping (×4 cameras)	Full-field 3D visibility, Z-axis awareness ✓
P2P Decentralized Coordination	Distributed field awareness, load balancing ✓
Graceful degradation fallbacks	Resilient to partial component failure ✓
Evaluation framework defined	Scoring, CPU runtime, 3D map quality ✓

VEX AI Robotics Competition:
3D Mapping, SpotFi & P2P Coordination

Problem

Solution

Skills Acquired

Deep Dive

The Problem Space