
TheRobotBay Benchmark Methodology

Version: 1.0
Published: March 2026
Canonical URL: https://therobotbay.com/methodology.md


What This Document Is

This is the source of truth for how TheRobotBay Benchmark works. It explains what we measure, how we measure it, and why we made the choices we did. Anyone — buyer, seller, researcher, or developer — can read this to understand what the numbers mean.


1. The Core Idea

When you buy a laptop, you compare it against other laptops. You don't compare a laptop against a refrigerator — even if both run on electricity and have screens.

The same principle applies to robots. A humanoid robot and an industrial arm are built for completely different jobs. Comparing their "scores" head-to-head is not just unhelpful, it's misleading.

TheRobotBay Benchmark enforces one rule: you can only compare robots in the same category.

This keeps comparisons meaningful.


2. Robot Categories

Every robot in our database is assigned to exactly one of seven categories. The category determines which metrics are collected and how comparisons are scored.

| Code | Category | What It Covers |
| --- | --- | --- |
| CAT-01 | Industrial / Arm Robots | Articulated arms, SCARA, delta, collaborative robots used in manufacturing |
| CAT-02 | Service Robots | Delivery, hospitality, cleaning, security robots operating in human spaces |
| CAT-03 | Humanoid Robots | Bipedal robots with human-like form: walking, manipulation, interaction |
| CAT-04 | Drones / UAVs | Multirotor and fixed-wing aerial robots for commercial and industrial use |
| CAT-05 | Mobile Ground Robots | Quadruped and wheeled robots designed for terrain navigation and inspection |
| CAT-06 | Educational / Hobbyist | Programmable robots for learning, prototyping, and hobbyist use |
| CAT-07 | Agricultural / Specialty | Robots for farming, underwater, hazardous environment, and niche tasks |

A robot can only belong to one category. If a robot spans two categories, we assign it to the category that best represents its primary use case.
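The one-robot-one-category rule maps naturally onto an enum. A minimal sketch in Python; the name `RobotCategory` and its member names are illustrative assumptions, not code from TheRobotBay:

```python
from enum import Enum

class RobotCategory(Enum):
    """Hypothetical encoding of the seven benchmark categories."""
    CAT_01 = "Industrial / Arm Robots"
    CAT_02 = "Service Robots"
    CAT_03 = "Humanoid Robots"
    CAT_04 = "Drones / UAVs"
    CAT_05 = "Mobile Ground Robots"
    CAT_06 = "Educational / Hobbyist"
    CAT_07 = "Agricultural / Specialty"

# Each robot record stores exactly one RobotCategory member; a robot
# spanning two categories is filed under its primary use case.
```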


3. Data Layers

We collect data at three levels:

Layer 1 — Manufacturer Specifications
Dimensions, weight, payload, speed, power. These come directly from official datasheets and product pages. We note the source and date.

Layer 2 — Standardized Benchmarks
Performance tests run under controlled conditions. Examples: pick-and-place cycle time for arms, stair-climbing speed for humanoids, delivery success rate for service robots. Where we run these ourselves, we publish the protocol. Where we rely on published third-party results, we cite the source.

Layer 3 — Community Data
Real-world numbers submitted by users, operators, and integrators. Community data is marked separately and never mixed with manufacturer specs. It is always attributed to its submitter.


4. Universal Fields

Every robot — regardless of category — must have these fields populated before a benchmark profile is considered complete:

  • Identity: Brand, model name, year released, country of origin
  • Physical: Height, width, depth, weight, IP rating, operating temperature range
  • Power: Power source, runtime, battery type (if applicable)
  • Computing: Programming languages supported, SDK availability, ROS compatibility
  • Sensors: Camera count, LiDAR presence, IMU, GPS
  • Commercial: MSRP (USD), pricing model, availability region

A profile missing required universal fields is marked as Incomplete and excluded from comparison tables.
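The Incomplete rule is simple to express in code. A minimal sketch, assuming a profile is a flat dict; the field names below (`brand`, `msrp_usd`, etc.) are illustrative assumptions, since the real schema is not published in this document:

```python
# Hypothetical flat field names covering the six universal groups.
UNIVERSAL_FIELDS = [
    "brand", "model", "year_released", "country_of_origin",   # Identity
    "height_mm", "width_mm", "depth_mm", "weight_kg",
    "ip_rating", "operating_temp_range",                      # Physical
    "power_source", "runtime_min",                            # Power
    "languages", "sdk_available", "ros_compatible",           # Computing
    "camera_count", "lidar", "imu", "gps",                    # Sensors
    "msrp_usd", "pricing_model", "availability_region",       # Commercial
]

def is_complete(profile: dict) -> bool:
    """A profile missing any universal field is marked Incomplete
    and excluded from comparison tables."""
    return all(profile.get(f) not in (None, "") for f in UNIVERSAL_FIELDS)
```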


5. Category-Specific Metrics

Beyond the universal fields, each category has its own set of performance metrics. These are the numbers that actually matter for that type of robot.

CAT-01: Industrial / Arm Robots

These robots are judged on precision and throughput. The metrics that matter most:

| Metric Group | Key Metrics |
| --- | --- |
| Kinematics | Degrees of Freedom (DoF), Max Reach (mm), Repeatability (±mm) |
| Performance | Max TCP Speed (mm/s), Cycle Time (seconds), Payload Capacity (kg) |
| Actuation | Peak Joint Torque (Nm), Backlash (arcmin), Encoder Resolution (bits) |
| Safety | Safety Category (ISO 10218-1), E-Stop Response Time (ms), SIL/PLe Rating |

Radar chart axes: Payload · Reach · Repeatability · Speed · Safety


CAT-02: Service Robots

These robots operate alongside people. Reliability and autonomy matter more than raw speed.

| Metric Group | Key Metrics |
| --- | --- |
| Navigation | Navigation Type, Max Speed (m/s), Indoor Positioning Accuracy (cm) |
| Payload | Cargo Payload (kg), Number of Compartments, Compartment Volume (L) |
| Autonomy | Autonomy Level (1–5), Delivery Success Rate (%), MTBF (hours) |
| Power | Runtime (min), Hot-swap Battery |

Radar chart axes: Speed · Payload · Autonomy · Reliability · Runtime


CAT-03: Humanoid Robots

Humanoids are judged on locomotion, manipulation, and interaction capability.

| Metric Group | Key Metrics |
| --- | --- |
| Locomotion | Total DoF, Walking Speed (m/s), Stair Step Height (mm), Max Slope (°) |
| Balance | Balance Recovery Force (N), Balance Recovery Time (ms) |
| Manipulation | Arm Payload (kg), Arm Reach (mm), Grasp Success Rate (%), Grip Force (N) |
| Perception | Object Recognition Accuracy (%), Task Generalization Score (/10) |

Radar chart axes: Speed · Dexterity · Strength · Balance · Endurance


CAT-04: Drones / UAVs

Drones are judged on flight performance, endurance, and mission capability.

| Metric Group | Key Metrics |
| --- | --- |
| Flight | Max Speed (m/s), Max Altitude (m), Wind Resistance (m/s) |
| Endurance | Flight Time (min), Max Range (km) |
| Payload | Payload Capacity (kg), Camera / Sensor Suite |
| Control | Transmission Range (km), Obstacle Avoidance Capability |

Radar chart axes: Speed · Range · Payload · Endurance · Stability


CAT-05: Mobile Ground Robots

Quadrupeds and wheeled ground robots are judged on terrain capability and endurance.

| Metric Group | Key Metrics |
| --- | --- |
| Locomotion | Max Speed (m/s), Max Slope (°), Stair Step Height (mm) |
| Payload | Payload Capacity (kg) |
| Power | Runtime (min), Hot-swap Battery |
| Autonomy | Autonomy Level (1–5), Navigation Type |

Radar chart axes: Speed · Payload · Agility · Endurance · Terrain


CAT-06: Educational / Hobbyist

Educational robots are judged on how easy they are to program and how capable they are for the price.

| Metric Group | Key Metrics |
| --- | --- |
| Programmability | Languages Supported, SDK Quality, ROS Compatible |
| Hardware | DoF, Payload (if any), Sensor Count |
| Usability | Setup Time (min), Community & Documentation Score (/10) |
| Value | MSRP (USD), Runtime (min) |

Radar chart axes: Programmability · Modularity · Ease of Use · Performance · Value


CAT-07: Agricultural / Specialty

Specialty robots are highly variable. Metrics are scoped to the robot's primary mission.

| Metric Group | Key Metrics |
| --- | --- |
| Mission | Task Type, Operating Area (m²/hr or ha/hr), Task Success Rate (%) |
| Physical | Payload (kg), IP Rating, Operating Temp Range |
| Power | Runtime (min), Power Source |
| Autonomy | Autonomy Level (1–5), Navigation Type |

Radar chart axes: Efficiency · Payload · Endurance · Precision · Durability


6. The Overall Score

Each robot receives an Overall Score — a weighted composite calculated from its category-specific metrics. There is no upper cap on the score.

Today's robots typically score in the hundreds. As robotics technology advances, newer robots will score higher — potentially in the thousands or beyond. This is intentional. The score is designed to grow with the field, not compress everything into an artificial ceiling. Think of it the same way CPU benchmark scores work: a score from 2015 and a score from 2030 sit on the same continuous scale, and the newer hardware simply scores higher.

The score is not a simple average. Each metric is weighted by its priority level:

| Priority | Weight |
| --- | --- |
| CRITICAL | 40% of score pool |
| HIGH | 35% of score pool |
| MEDIUM | 25% of score pool |

Missing data reduces the score. A robot with 100% data completeness will always outscore a robot with 60% completeness, all else being equal. This creates a direct incentive for manufacturers and community contributors to fill in the gaps.
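A minimal sketch of how such a pooled weighting could be computed, assuming each metric arrives as a (priority, value) pair whose value is normalized against a fixed reference robot (so values can exceed 1.0 as hardware improves, which keeps the scale uncapped). The function name, the 1000-point reference scale, and the normalization scheme are illustrative assumptions, not the production formula:

```python
# Illustrative only: the real per-metric normalization is not
# published in this document.
POOL_WEIGHTS = {"CRITICAL": 0.40, "HIGH": 0.35, "MEDIUM": 0.25}

def overall_score(metrics, reference_scale=1000.0):
    """metrics: list of (priority, value) pairs; value is None when
    the data point is missing. Missing values count as 0 inside their
    pool, so lower data completeness directly lowers the score."""
    score = 0.0
    for priority, weight in POOL_WEIGHTS.items():
        pool = [v for p, v in metrics if p == priority]
        if pool:
            pool_avg = sum(v if v is not None else 0.0 for v in pool) / len(pool)
            score += weight * pool_avg * reference_scale
    return score
```

Under this sketch, two robots with identical measured values but different completeness get different scores, matching the incentive described above.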

Scores are only comparable within a category. A CAT-01 score of 820 and a CAT-03 score of 820 tell you nothing about each other.


7. How Comparisons Work

When you use the Compare tool at /benchmark/compare, these rules apply:

  1. Same category only. Once you add the first robot, only robots in the same category can be added.
  2. Up to 4 robots at once. More than 4 becomes unreadable.
  3. Winner highlighting. For each metric row, the best value is highlighted in gold. For numeric metrics, highest wins — except for metrics where lower is better (weight, cycle time, repeatability), where lowest wins.
  4. Missing data. A dash (—) means the data is not in our database yet. It does not mean zero.
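The winner-highlighting rules above can be sketched as a single row-level function; assume each metric row carries a lower-is-better flag (the function name and signature are illustrative, not TheRobotBay's actual code):

```python
def winner_index(values, lower_is_better=False):
    """Return the index of the best value in a comparison row, or
    None when the row has no data. Missing values (None, shown as a
    dash in the UI) are skipped and can never win."""
    present = [(i, v) for i, v in enumerate(values) if v is not None]
    if not present:
        return None
    pick = min if lower_is_better else max
    return pick(present, key=lambda iv: iv[1])[0]
```

For weight, cycle time, or repeatability the caller passes `lower_is_better=True`; everything else defaults to highest-wins.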

8. Data Sources and Trust Levels

Not all data is equal. We label every data point with its source:

| Label | Meaning |
| --- | --- |
| Manufacturer | Taken directly from the official datasheet or product page |
| Lab — TheRobotBay | Measured by TheRobotBay under our published test protocol |
| Lab — Third Party | Published by an independent lab (cited by name) |
| Community | Submitted by a verified user or operator |
| Estimated | Derived or calculated from other known values |

When you see a metric without a source label, it is from the manufacturer by default.
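The labels and the manufacturer-by-default rule can be sketched together; the enum and function names are illustrative assumptions (the label strings themselves are from the table above):

```python
from enum import Enum

class SourceLabel(Enum):
    MANUFACTURER = "Manufacturer"
    LAB_THEROBOTBAY = "Lab — TheRobotBay"
    LAB_THIRD_PARTY = "Lab — Third Party"
    COMMUNITY = "Community"
    ESTIMATED = "Estimated"

def source_of(datapoint: dict) -> SourceLabel:
    """Resolve a data point's trust label; an unlabeled metric is
    manufacturer data by default."""
    raw = datapoint.get("source")
    return SourceLabel(raw) if raw else SourceLabel.MANUFACTURER
```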


9. Community Submissions

Anyone with an account can submit data for any robot via /benchmark/[robot-slug]/submit. Submitted data goes through three stages:

  1. Submitted — In the queue, not yet visible on the profile.
  2. Under Review — A moderator is checking it against known specs or other sources.
  3. Verified — Accepted and merged into the profile, attributed to the contributor.
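The three stages form a simple forward-only pipeline. A minimal sketch; the stage names come from this document, but the transition map and function name are assumptions:

```python
# Submitted -> Under Review -> Verified; Verified is terminal.
TRANSITIONS = {"Submitted": "Under Review", "Under Review": "Verified"}

def advance(stage: str) -> str:
    """Move a submission to its next review stage, if any."""
    return TRANSITIONS.get(stage, stage)
```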

Community data that contradicts manufacturer specs is not automatically rejected — real-world performance often differs from spec sheet claims. We show both, labeled clearly.


10. What We Don't Do

  • We do not accept payment to change scores or rankings.
  • We do not allow manufacturers to remove negative data submitted by the community.
  • We do not fabricate missing data points. A missing field stays blank.
  • We do not compare robots across categories in ranked tables.

11. Version History

| Version | Date | Notes |
| --- | --- | --- |
| 1.0 | March 2026 | Initial public release |

Questions or corrections? Use our contact form or email hello@therobotbay.com.