Wherobots Benchmark Results

Monthly benchmark results from Wherobots performance benchmarks, comparing spatial analytics performance across multiple cloud platforms.

Repository Structure

benchmark-results/
├── tpch/                          # TPC-H benchmark results
│   ├── parquet/
│   │   ├── sf100/YYYY-MM.parquet  # Scale factor 100
│   │   └── sf1000/YYYY-MM.parquet # Scale factor 1000
│   └── csv/
│       ├── sf100/YYYY-MM.csv
│       └── sf1000/YYYY-MM.csv
├── spatial/                       # SpatialBench benchmark results
│   ├── parquet/
│   │   └── ...
│   └── csv/
│       └── ...
├── short-spatial/                 # ShortSpatialBench results (if applicable)
│   └── ...
└── metadata/
    └── YYYY-MM.json               # Run metadata (timing, CLI args, platform configs)

File Formats

Parquet (Full Schema)

Machine-readable format with complete result data:

Column           Type     Description
query            string   Query identifier (e.g., q1, q2)
platform         string   Platform name (e.g., WDB-1, EMR)
benchmark        string   Benchmark name (e.g., TPC-H, SpatialBench)
scale_factor     int64    Data scale factor (e.g., 100, 1000)
final_status     string   Query execution status (SUCCESS, ERROR, TIMEOUT)
runtime          float64  Query runtime in seconds (null if failed)
result_count     int64    Number of result rows (null if failed)
cost             float64  Estimated cost in USD (null if unavailable)
execution_name   string   Run identifier (typically YYYY-MM-DD)
exception        string   Exception message if query failed (null otherwise)
platform_config  string   Platform configuration as JSON string

CSV (Simplified Schema)

Human-readable format with the same schema as the Parquet files, minus the exception and platform_config columns.
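The CSV files load directly with pandas. The snippet below parses a small inline sample in the simplified schema; the rows are illustrative, not real benchmark results:

```python
import io

import pandas as pd

# Illustrative sample in the simplified CSV schema; values are made up.
sample_csv = """query,platform,benchmark,scale_factor,final_status,runtime,result_count,cost,execution_name
q1,WDB-1,TPC-H,100,SUCCESS,12.3,4,0.05,2025-07-01
q2,WDB-1,TPC-H,100,TIMEOUT,,,,2025-07-01
"""

df = pd.read_csv(io.StringIO(sample_csv))
# Failed queries leave runtime/result_count/cost empty, which pandas
# reads as NaN, so filter on final_status before aggregating.
succeeded = df[df["final_status"] == "SUCCESS"]
print(succeeded[["query", "runtime"]])
```

For real data, replace the `StringIO` buffer with a path such as `tpch/csv/sf100/2025-07.csv`.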

Metadata JSON

Each run produces a metadata file containing:

  • execution_name: Run identifier
  • start_time / end_time: ISO 8601 timestamps
  • duration_seconds: Total run duration
  • cli_args: CLI arguments used for the run
  • platform_configs: Per-platform configuration details
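Based on the fields listed above, a metadata file might look like the following. The values are illustrative placeholders, not taken from a real run:

```json
{
  "execution_name": "2025-07-01",
  "start_time": "2025-07-01T00:00:00Z",
  "end_time": "2025-07-01T06:30:00Z",
  "duration_seconds": 23400,
  "cli_args": ["--benchmark", "tpch", "--scale-factor", "100"],
  "platform_configs": {
    "WDB-1": {}
  }
}
```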

How Results Are Published

Results are automatically published by the benchmark-dashboard CI pipeline after each benchmark run. The automated benchmark runs on the 1st of every month.

Reading the Data

import pyarrow.parquet as pq

# Read a specific month's results
table = pq.read_table("tpch/parquet/sf100/2025-07.parquet")
df = table.to_pandas()
print(df[df["final_status"] == "SUCCESS"].groupby("platform")["runtime"].mean())

License

This data is published by Wherobots. See the main benchmark-dashboard repository for details.
