Over the past years, Inference has evolved from a lightweight prediction server into a widely adopted runtime powering local deployments, Docker workloads, edge devices, and production systems. After hundreds of releases, the project has matured — and so has the need for something faster, more modular, and more future-proof.

inference 1.0.0 closes one chapter and opens another. This release introduces a new prediction engine that will serve as the foundation for future development.

⚡ New prediction engine: `inference-models`

We are introducing inference-models, a redesigned engine for running models, focused on:

faster model loading and inference
improved resource utilization
better modularity and extensibility
cleaner separation between serving and model runtime
support for different backends, including TensorRT

Important

With inference 1.0.0 we have also released the first stable build of inference-models, version 0.19.0. You can use the engine in inference by setting the environment variable USE_INFERENCE_MODELS=True.
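As a minimal sketch, the flag can also be set from Python before inference is started. The only name taken from these notes is USE_INFERENCE_MODELS; the helper `enable_inference_models` is hypothetical, and the sketch assumes the flag is read from the process environment at start-up, which is why it is set before any import of inference.

```python
import os


def enable_inference_models(env=None):
    """Set USE_INFERENCE_MODELS=True in the given environment mapping.

    Hypothetical helper for illustration only; it is not part of the
    inference package.
    """
    env = os.environ if env is None else env
    env["USE_INFERENCE_MODELS"] = "True"
    return env


# Assumption: the flag is read from the process environment when inference
# starts, so it is set here, before any `import inference` statement.
enable_inference_models()
print(os.environ["USE_INFERENCE_MODELS"])
```

For Docker workloads the same variable would be passed through the container environment instead, for example with Docker's `-e` option.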

Caution

The new inference-models engine is wrapped with adapters so that it serves as a drop-in replacement for the old engine. We are making it the default engine on the Roboflow platform, but for clients running inference locally USE_INFERENCE_MODELS is set to False by default. We would like all clients to test the new engine; when the flag is not set, inference works as usual.
In approximately 2 weeks, with the inference 1.1.0 release, we will make inference-models the default engine for everyone.

Caution

inference-models is a completely new backend, and we have fixed a lot of problems and bugs. As a result, predictions from your model may differ, but according to our tests they are better in quality. That said, we may still have introduced some minor bugs. Please report any problems to us and we will do our best to fix them 🙏

🛣️ Roadmap

Today's release is just the start of broader changes in inference. The plan for the future is as follows:

shortly after the release, we will complete our work on the Roboflow platform, including migration of the small fraction of models not yet onboarded into the new registry used by inference-models and adjustment of automations on the platform. Until this is finished, clients who very recently uploaded or renamed models may be impacted by HTTP 404 errors; contact us to receive support in such cases.
there will be consecutive hot-fixes (if needed), released as 1.0.x versions.
clients running inference locally should test the inference-models backend now, as in approximately 2 weeks inference-models will become the default engine.
We still have some work to do in 1.x.x, mainly providing patches, but we are starting the march towards 2.0, which should bring a new level of quality to other components of inference. Stay tuned for updates.
You should expect that new contributions to inference will be based on the inference-models engine and may not work if you don't migrate.

Caution

One of the problems we have not addressed in 1.0.0 is model cache purging: the new inference-models engine uses a different local cache structure than the old engine. As a result, an inference server with USE_INFERENCE_MODELS=True does not perform clean-up on the volume with models pulled from the platform. If you run locally, that should generally not be an issue, since we expect clients to use only a limited number of different models in their deployments.
If you use a large number of models or your disk space is tight, you should perform periodic clean-ups of /tmp/cache when running the new inference. This issue will be addressed before the 1.1.0 release.
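One way to do such a periodic clean-up is sketched below, assuming it is acceptable to delete cached files that have not been modified for a number of days and that the engine re-downloads anything it needs later. The function name `purge_stale_files`, the 7-day threshold and the dry-run default are illustrative choices, not part of inference; only the /tmp/cache path comes from the note above.

```python
import os
import time
from pathlib import Path


def purge_stale_files(cache_dir, max_age_days=7.0, dry_run=True, now=None):
    """Find files under cache_dir not modified within max_age_days.

    Returns the list of stale paths. Files are deleted only when
    dry_run is False. Directories are left in place.
    """
    root = Path(cache_dir)
    if not root.is_dir():
        return []
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    # Materialise the listing first so deletion does not disturb the walk.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
            if not dry_run:
                path.unlink()
    return stale


# Example (dry run, nothing is deleted):
#   for path in purge_stale_files("/tmp/cache"):
#       print("would remove", path)
```

Running it with `dry_run=False` from a scheduled job would perform the actual deletion; it is safest to do that while the server is stopped, since the sketch does not check whether a file is in use.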

🎨 Semantic Segmentation in `inference`

Thanks to @leeclemnet, the DeepLabV3Plus segmentation model has been onboarded to inference and is available to clients.

📐 Area Measurement block 🤝 Workflows

Thanks to @jeku46, we can now measure area with Workflows.

🚧 Maintenance

add missing ffmpeg package for dev by @rafel-roboflow in #2009
fix expose sam3 with proper envs by @rafel-roboflow in #2011
Detections Class Replacement support for strings by @Erol444 in #2000
fix: Send termination_reason via data channel on WebRTC stream timeout by @balthazur in #2008
Remove content length validation to allow for chunked responses by @dkosowski87 in #2015
added processing_timeout support to webrtc's StreamConfig dataclass by @Erol444 in #2017
fix: Return 400 instead of 500 for raw bytes sent as base64 image by @bigbitbus in #2016
Added claude sonnet 4.6 by @Erol444 in #2014
Fix mkdocs-macros Jinja2 syntax errors in generated block docs by @yeldarby in #2012
Add remote GPU processing time collection and forwarding by @hansent in #2007
Add semantic-segmentation endpoints + deep_lab_v3_plus by @leeclemnet in #2018
Update CODEOWNERS: Add dkosowski87 and reorganize team assignments by @hansent in #2021
Add support for gemini 3.1 pro in gemini block by @Erol444 in #2024
Add area_measurement workflow block by @jeku46 in #2013
Auto-detect Jetson JetPack version in CLI server start by @alexnorell in #1958
Ged rid of unstable assertions on predictions in e2e tests by @PawelPeczek-Roboflow in #2026
ENT-884: Add workflow_version_id support to inference pipeline by @NVergunst-ROBO in #2022
Add JetPack 7.1 support for NVIDIA Thor by @alexnorell in #1935

🏅 New Contributors

@dkosowski87 made their first contribution in #2015
@leeclemnet made their first contribution in #2018

Full Changelog: v0.64.8...v1.0.0

Contributors

hansent, yeldarby, and 10 other contributors


13 Feb 20:44

grzegorz-roboflow

v0.64.8

2fa30cf

v0.64.8

💪 Added

Fisheye cameras in camera calibration block by @Erol444 in #1996
The calibration block supported polynomial calibration, which does not handle fisheye distortion well. This change adds support for fisheye calibration.

Heatmap block by @Erol444 in #1986
This change adds a heatmap block (it uses supervision's heatmap annotator), which supports both:

detections, so the heatmap is based on where detections occurred
tracklets, which ignore stationary objects (default: on), so the heatmap shows movement rather than the objects themselves
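The difference between the two modes can be illustrated with a small conceptual sketch. This is not the block's implementation and does not use supervision; the function `accumulate_heat`, the grid cell size and the movement threshold are assumptions made only to show why stationary objects drop out in tracklet mode.

```python
from collections import defaultdict


def accumulate_heat(frames, cell_size=10, ignore_stationary=False, min_move=2.0):
    """Count visits per grid cell.

    frames: list of frames, each a list of (track_id, x, y) object centers.
    With ignore_stationary=True, a point adds heat only when its track moved
    at least min_move pixels since the previous frame in which it was seen.
    """
    heat = defaultdict(int)
    last_seen = {}
    for frame in frames:
        for track_id, x, y in frame:
            previous = last_seen.get(track_id)
            last_seen[track_id] = (x, y)
            if ignore_stationary:
                if previous is None:
                    continue  # first sighting: no movement can be measured yet
                dx, dy = x - previous[0], y - previous[1]
                if (dx * dx + dy * dy) ** 0.5 < min_move:
                    continue  # the object did not move enough
            heat[(int(x // cell_size), int(y // cell_size))] += 1
    return dict(heat)


# Track 1 stands still, track 2 moves to the right.
frames = [
    [(1, 5, 5), (2, 50, 50)],
    [(1, 5, 5), (2, 60, 50)],
    [(1, 5, 5), (2, 70, 50)],
]
by_detections = accumulate_heat(frames)
by_movement = accumulate_heat(frames, ignore_stationary=True)
```

In detection mode the stationary object dominates the map, while in movement mode only the cells crossed by the moving object receive heat.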


🚧 Maintenance

temporarily pin z3-solver version by @grzegorz-roboflow in #1990
Code workflow block icon issue by @Erol444 in #1988
Optimize cosine_similarity by @KRRT7 in #1989
add inference version to the request headers by @japrescott in #1985
Fix video frame count estimation by detecting actual FPS from uploaded video by @rafel-roboflow in #1992
Mark file processing in webrtc worker for downstream blocks to pick frame timestamp correctly by @grzegorz-roboflow in #1995
add frame size to webrtc video metadata by @rafel-roboflow in #1997
enable gzip compression by default by @rafel-roboflow in #1998
WIP: enabled sam3 visual segment by @rafel-roboflow in #1975
added ffmpeg to docker dependencies by @rafel-roboflow in #2002
rename seg-preview to sam3 by @rafel-roboflow in #2005
Fix RF-DETR-Seg mask postprocessing for letterboxed input case by @mkaic in #2001
Enable inference pipeline api on jetpack 6.2.0 by @grzegorz-roboflow in #2006

Full Changelog: v0.64.7...v0.64.8

Contributors

japrescott, Erol444, and 4 other contributors


06 Feb 18:45

PawelPeczek-Roboflow

v1.0.0rc1

9de60d5

v1.0.0rc1

`inference 1.0.0rc1` — Release Candidate

Today marks an important milestone for Inference.

Over the past years, Inference has grown from a lightweight prediction server into a widely adopted runtime used across local deployments, Docker, edge devices, and production systems. Hundreds of releases later, the project has matured significantly, and so has the need for something faster, more modular, and more future-proof.

inference 1.0.0rc1 is a preview of the 1.0.0 release, which will close one chapter and open another. This release introduces a new prediction engine that will become the foundation for all future development.

🚀 New prediction engine - `inference-models`

We are introducing inference-models, a redesigned execution engine focused on:

faster model loading and inference
improved resource utilization
better modularity and extensibility
cleaner separation between serving and model runtime
stronger foundations for future major versions

The engine is already available today in:

inference-models package → 0.18.6rc8 (RC)
inference package and Docker → enabled with env variable

USE_INFERENCE_MODELS=True

inference-models wrapped within the old inference is a drop-in replacement. This allows testing the new runtime without changing existing integrations.

Important

Predictions from your models may change, but generally for the better! inference-models is a completely new engine for running models. We have fixed a lot of bugs and made it multi-backend: it can run onnx, torch and even trt models! It automatically negotiates with the Roboflow model registry to choose the best package to run in your environment. We have already migrated almost all Roboflow models to the new registry and are working hard to achieve full coverage soon!
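The idea of choosing a package per environment can be pictured with a toy sketch. Everything in it is hypothetical: the `ModelPackage` type, the `choose_package` function, the package identifiers and the preference order are assumptions for illustration and say nothing about how inference-models or the Roboflow registry actually negotiate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPackage:
    """Hypothetical description of one available build of a model."""
    package_id: str
    backend: str  # e.g. "trt", "torch" or "onnx"


# Preference order assumed for this sketch only.
BACKEND_PREFERENCE = ("trt", "torch", "onnx")


def choose_package(packages, installed_backends):
    """Return the most preferred package whose backend is installed, else None."""
    for backend in BACKEND_PREFERENCE:
        if backend not in installed_backends:
            continue
        for package in packages:
            if package.backend == backend:
                return package
    return None


offered = [
    ModelPackage("pkg-onnx", "onnx"),
    ModelPackage("pkg-trt", "trt"),
    ModelPackage("pkg-torch", "torch"),
]
onnx_only = choose_package(offered, {"onnx"})
with_torch = choose_package(offered, {"onnx", "torch"})
nothing_usable = choose_package(offered, set())
```

Here an environment with only onnx available gets the onnx package, one that also has torch gets the torch package, and an environment with no matching backend gets none.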

📅 What happens next

Next week
- Stable Inference 1.0.0
- Stable inference-models release
- Roboflow platform updated to use inference-models as the default engine
In the coming weeks
- inference-models becomes the default engine for public builds (USE_INFERENCE_MODELS becomes opt-out, not opt-in)
- continued performance improvements and runtime optimizations
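The switch from opt-in to opt-out amounts to flipping the default that applies when the variable is unset. The sketch below shows that semantics under stated assumptions: `flag_enabled` is a hypothetical helper, and the set of accepted truthy strings is a guess, since these notes only show the value True.

```python
import os

# Assumed set of values treated as "enabled"; only "True" appears in the notes.
TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(name, default, env=None):
    """Read a boolean flag from env, falling back to default when unset."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


# Opt-in (today, for local runs): unset means the old engine is used.
opt_in_unset = flag_enabled("USE_INFERENCE_MODELS", default=False, env={})
# Opt-out (planned): unset means the new engine is used...
opt_out_unset = flag_enabled("USE_INFERENCE_MODELS", default=True, env={})
# ...and it has to be disabled explicitly.
opt_out_disabled = flag_enabled(
    "USE_INFERENCE_MODELS", default=True, env={"USE_INFERENCE_MODELS": "False"}
)
```

Deployments that set the variable explicitly behave the same before and after the default changes, which is the simplest way to keep behaviour stable across the upgrade.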

🔭 Looking forward - the road to `2.0`

This engine refresh is only the first step.
We are starting work toward Inference 2.0, a larger modernization effort similar in spirit to the changes introduced with inference-models.

Stay tuned for future updates!


06 Feb 18:43

PawelPeczek-Roboflow

v0.64.7

9ef3731

v0.64.7

What's Changed

dg 15 fix timeout file by @rafel-roboflow in #1934
Fix VLM as Detector/Classifier name, so it gets correct URL by @Erol444 in #1965
improve error logging by @japrescott in #1966
Add rfdetr nas by @probicheaux in #1970
added missing envvar export for webrtc preview gzip flag by @rafel-roboflow in #1978
Bug/dg 204 (2) reduce ack window webrtc by @rafel-roboflow in #1979
Claude opus 4.6 in Claude block by @Erol444 in #1980
Add remote exec capability for foundation models missing it by @hansent in #1968
Gemini block: Add support for tool code execution (tool use) by @Erol444 in #1961
Pass delete from disk to clear cache by @bigbitbus in #1982
Add change to avoid pushing latest tag for rc release by @PawelPeczek-Roboflow in #1983

Full Changelog: v0.64.6...v0.64.7

Contributors

hansent, japrescott, and 5 other contributors


30 Jan 20:16

grzegorz-roboflow

v0.64.6

17244cd

v0.64.6

What's Changed

Add large rf-detrs and seg coco models to inference_models by @Matvezy in #1944
Add configurable RF API timeout for inference-cli command interacting with RF-cloud by @PawelPeczek-Roboflow in #1950
Allow sv.Detections.data properties in extract-property block by @grzegorz-roboflow in #1948
CI for deploying custom python block modal app by @grzegorz-roboflow in #1945
in modal.custom_python_block.deploy.yml use deployment modal tokens by @grzegorz-roboflow in #1953
Address internals imported from Supervision by @grzegorz-roboflow in #1951
Add YOLO26 to inference_models by @mkaic in #1943
Fix failing yolo26 gpu integration tests by @mkaic in #1956
Disable automatic deployment of modal webexec by @grzegorz-roboflow in #1954
0.64.6 by @grzegorz-roboflow in #1957

Full Changelog: v0.64.5...v0.64.6

Contributors

Matvezy, mkaic, and 2 other contributors


23 Jan 18:04

grzegorz-roboflow

v0.64.5

a44922a

v0.64.5

What's Changed

Fix 2 urls by @Erol444 in #1939
Reduce number of layers in lambda dockerfile by @grzegorz-roboflow in #1940
Add xl rfdetrs types by @Matvezy in #1937
Add collimate algorithm and Otsu improvements to OCR stitch block by @reiffd7 in #1936
Fix SAM3 code sample by @Erol444 in #1941
Feature/dg 3 add preview flag to video streaming previews by @rafel-roboflow in #1942

New Contributors

@Erol444 made their first contribution in #1939

Full Changelog: v0.64.4...v0.64.5

Contributors

Erol444, reiffd7, and 3 other contributors


22 Jan 13:28

grzegorz-roboflow

v0.64.4

c271c8f

v0.64.4

What's Changed

Address numpy.fromstring deprecation by @grzegorz-roboflow in #1925
fix documentation issue with special chars by @rafel-roboflow in #1926
Fix/dg 18 rotation metadata for videos is not applied serverside by @rafel-roboflow in #1911
First iteration of inference-models docs by @PawelPeczek-Roboflow in #1922
Action install ffmpeg modal by @rafel-roboflow in #1928
added search keywords metadata to image preprocessing block by @rafel-roboflow in #1931
Cache describe_interface and other endpoints by @yeldarby in #1932
Add rfdetr large, xl, xxl, and seg aliases by @Matvezy in #1933
Safeq qwen test by @Matvezy in #1900
Enable crash dump of input image by @grzegorz-roboflow in #1921
Feature/event log block improvements by @jeku46 in #1929
Selectively disabling workflow blocks from inference server by @bigbitbus in #1924

Full Changelog: v0.64.3...v0.64.4

Contributors

yeldarby, bigbitbus, and 5 other contributors


16 Jan 18:24

PawelPeczek-Roboflow

v0.64.3

0baa84c

v0.64.3

What's Changed

Fix yolo26 XL Seg aliases by @Matvezy in #1919
Fix rfdetr instance segmentation and object detection postprocessing error by @grzegorz-roboflow in #1920
Add docs about YOLO26 by @PawelPeczek-Roboflow in #1923

Full Changelog: v0.64.2...v0.64.3

Contributors

Matvezy, PawelPeczek-Roboflow, and grzegorz-roboflow


15 Jan 18:03

PawelPeczek-Roboflow

v0.64.2

0d05452

v0.64.2

What's Changed

Add more yolo26 aliases by @Matvezy in #1918

Full Changelog: v0.64.1...v0.64.2

Contributors

Matvezy


Releases: roboflow/inference
