IceGraph provides an interactive, hierarchical view of Apache Iceberg metadata. It maps the DNA of your tables—from root metadata down to individual data and delete files.
Try the live demo: https://yanivzalach.github.io/IceGraph/
Opinionated Design: IceGraph is built exclusively for Spark Connect backends.
Table Version: Currently IceGraph officially supports Table Version 2.
- Read-Only: The application is read-only and does not modify the table.
- Time-Travel: View the physical state of your table as of any datetime.
- Metadata Inspector: Displays record counts, stats, and file paths.
- Table History: Trace every metadata evolution, from schema changes to snapshot writes, across the full lifetime of the table.
- Table File Browser: See your table's files grouped by partition, just like you're used to.
- Branches: View all the branches of the table, even when detached from the main branch.
Recommended: In production, use a user with read-only permissions for the Spark Connect server, for extra peace of mind.
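To cross-check what the views above display, the same kind of information can be read directly from Iceberg's metadata tables through Spark SQL. The following is a minimal sketch, not IceGraph's own implementation: the helper names are hypothetical, the table name must resolve in your catalog, and `run_all` assumes a `pyspark` installation with Spark Connect support.

```python
from typing import Dict

# Iceberg metadata tables that roughly correspond to the features listed above:
# snapshots/history -> Table History, files -> Table File Browser, refs -> Branches.
METADATA_TABLES = ("snapshots", "history", "files", "refs")


def metadata_queries(table: str) -> Dict[str, str]:
    """Build one read-only SELECT per Iceberg metadata table (hypothetical helper)."""
    return {name: f"SELECT * FROM {table}.{name}" for name in METADATA_TABLES}


def time_travel_query(table: str, as_of: str) -> str:
    """Build a Spark SQL time-travel query for the given timestamp string."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{as_of}'"


def run_all(remote: str, table: str) -> None:
    """Run the metadata queries against a Spark Connect server and print a few rows."""
    # Imported lazily so the query builders above work without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.remote(remote).getOrCreate()
    for name, sql in metadata_queries(table).items():
        print(name)
        spark.sql(sql).show(5, truncate=False)
```

For example, `run_all("sc://localhost:15002", "default.events")` would print the first rows of each metadata table for the demo table, assuming a Spark Connect server is listening at that address.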
Clone the repo and change into the demo directory:
cd docker_demo
Run the docker compose:
docker compose up
Go to http://localhost:5000 and explore table default.events and table default.logging.
Recommended: Change the TIMEZONE variable in the docker compose to your timezone name.
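Before setting TIMEZONE, it can help to confirm that the name resolves. The sketch below assumes IceGraph expects an IANA timezone name such as Europe/Berlin (the text above only says "timezone name") and uses the standard-library `zoneinfo` module available from Python 3.9; `is_valid_timezone` is a hypothetical helper, not part of IceGraph.

```python
import zoneinfo


def is_valid_timezone(name: str) -> bool:
    """Return True if `name` resolves to a timezone in the local tz database."""
    if not name:
        return False
    try:
        zoneinfo.ZoneInfo(name)
    except (zoneinfo.ZoneInfoNotFoundError, ValueError):
        # Unknown keys raise ZoneInfoNotFoundError; malformed keys raise ValueError.
        return False
    return True
```

The placeholder `my/timezone` used in the commands in this document is not a real timezone name, so `is_valid_timezone("my/timezone")` returns False. The result for real names depends on the tz database installed on the machine running the check.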
The easiest way to run IceGraph is via DockerHub:
docker run -e SPARK_REMOTE=sc://<spark-connect-ip>:15002 -e TIMEZONE=my/timezone -p 5000:5000 yanivzalach/icegraph:latest
To build the image yourself, clone the repo, update the Spark Connect version in backend/pyproject.toml, then build from the project root:
docker build -t icegraph .
Then run it with the same command, using the locally built image:
docker run -e SPARK_REMOTE=sc://<spark-connect-ip>:15002 -e TIMEZONE=my/timezone -p 5000:5000 icegraph
- npm
- uv (Python)
- Python 3.9
Sync the environments:
cd backend
uv sync
cd ../frontend
npm i
Create a .env file in the root of the backend directory:
TIMEZONE=my/timezone # Put your local timezone name
SPARK_REMOTE=sc://localhost:15002 # Our local testing Spark. If you use Docker, change it to your IP.
Open one terminal in the backend directory and run:
uv run python main.py
Open a second terminal in the frontend directory and run:
npm run dev
Go to http://localhost:3000 and explore your tables.
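If the backend fails to reach Spark, the .env values described above are a good first thing to check. The sketch below is one way to do that under stated assumptions: `parse_env` and `spark_remote_endpoint` are hypothetical helpers, the parser only handles simple KEY=VALUE lines with trailing `#` comments (it is not the loader IceGraph uses), and 15002 is used as the fallback because it is the default Spark Connect port.

```python
from typing import Dict, Tuple
from urllib.parse import urlparse


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, dropping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        # Naive comment stripping: values that contain '#' are not supported.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def spark_remote_endpoint(remote: str) -> Tuple[str, int]:
    """Return (host, port) from a Spark Connect URL of the form sc://host[:port]."""
    parsed = urlparse(remote)
    if parsed.scheme != "sc" or not parsed.hostname:
        raise ValueError(f"expected sc://host[:port], got {remote!r}")
    return parsed.hostname, parsed.port or 15002
```

Passing the contents of backend/.env to `parse_env` and the resulting SPARK_REMOTE value to `spark_remote_endpoint` gives the host and port the backend will try to reach, which can then be tested with any TCP connectivity tool.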
