
Tip / Feature Request: Sliced Inference (Tiling) for High-Resolution Images #39

@Manamama-Gemini-Cloud-AI-01

Description

Hi @ankandrew!

I've been using fast-alpr on high-resolution images (e.g., 16 MP, 4624x3468) and noticed that the standard detection pass (even at the 608 px input size) can miss small or distant license plates, because they become too small after the initial resize.

To solve this, I implemented a simple Sliced Inference (Tiling) approach that "zooms in" on quadrants of the image before running the detector. This increased the number of detected plates from 3 to 8 on a test image.

I thought this might be a useful tip for other users or a potential feature request for a sliced_predict() method.
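To illustrate why the initial resize hurts small plates, here is a quick back-of-the-envelope calculation. The 4624 px width and 608 px input size come from the numbers above; the 100 px plate width is an assumed, illustrative value:

```python
# Effective on-detector size of a plate after the initial resize.
image_w = 4624           # source width from the example above
detector_input = 608     # detector input size used above
plate_w = 100            # assumed plate width in the source image (illustrative)

scale_full = detector_input / image_w
print(round(plate_w * scale_full, 1))   # ~13 px when the full frame is resized

# With 2x2 tiles and 20% overlap, each tile is ~60% of the frame wide,
# so the same plate lands on the detector roughly 1.7x larger.
tile_w = image_w / 2 * 1.2
scale_tile = detector_input / tile_w
print(round(plate_w * scale_tile, 1))   # ~21.9 px
```

A ~13 px plate is at the edge of what a 608 px detector can resolve, which matches the missed detections described above.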

Example Implementation (Handling Frozen Dataclasses):

from fast_alpr import ALPR
import cv2
import numpy as np

# Use higher quality 's' model for better results on tiles
alpr = ALPR(
    detector_model="yolo-v9-s-608-license-plate-end2end",
    ocr_model="cct-s-v2-global-model",
    detector_conf_thresh=0.1
)

def tiled_predict(image_path, tiles_x=2, tiles_y=2, overlap=0.2):
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    h, w = img.shape[:2]
    
    all_results = []
    
    # Tile dimensions include the overlap margin
    tile_w = int(w / tiles_x * (1 + overlap))
    tile_h = int(h / tiles_y * (1 + overlap))

    # Spacing between tile origins: evenly distributed so adjacent tiles
    # overlap and the last tile ends exactly at the image edge
    step_x = (w - tile_w) / (tiles_x - 1) if tiles_x > 1 else 0
    step_y = (h - tile_h) / (tiles_y - 1) if tiles_y > 1 else 0

    for i in range(tiles_x):
        for j in range(tiles_y):
            # Calculate tile boundaries
            x1 = int(i * step_x)
            y1 = int(j * step_y)
            x2 = min(x1 + tile_w, w)
            y2 = min(y1 + tile_h, h)
            
            tile_img = img[y1:y2, x1:x2]
            results = alpr.predict(tile_img)
            
            for res in results:
                # Adjust coordinates for the full image (frozen dataclasses)
                bb = res.detection.bounding_box
                new_bb = bb.__class__(x1=bb.x1+x1, y1=bb.y1+y1, x2=bb.x2+x1, y2=bb.y2+y1)
                new_det = res.detection.__class__(label=res.detection.label, 
                                                 confidence=res.detection.confidence, 
                                                 bounding_box=new_bb)
                all_results.append(res.__class__(detection=new_det, ocr=res.ocr))
    
    # Deduplication based on bounding box centers
    final_results = []
    for res in all_results:
        c1 = ((res.detection.bounding_box.x1 + res.detection.bounding_box.x2) / 2, 
              (res.detection.bounding_box.y1 + res.detection.bounding_box.y2) / 2)
        is_duplicate = False
        for existing in final_results:
            c2 = ((existing.detection.bounding_box.x1 + existing.detection.bounding_box.x2) / 2, 
                  (existing.detection.bounding_box.y1 + existing.detection.bounding_box.y2) / 2)
            dist = np.sqrt((c1[0]-c2[0])**2 + (c1[1]-c2[1])**2)
            if dist < 50:  # pixel distance threshold for deduplication
                if res.detection.confidence > existing.detection.confidence:
                    final_results.remove(existing)
                    final_results.append(res)
                is_duplicate = True
                break
        if not is_duplicate:
            final_results.append(res)
            
    return final_results
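Independent of the detector, the tile geometry is easy to sanity-check with plain arithmetic. The sketch below uses a step-based formulation (tile size derived from the grid plus overlap, origins evenly spaced) so the tiles provably cover the whole frame; it makes no fast-alpr calls:

```python
def tile_boxes(w, h, tiles_x=2, tiles_y=2, overlap=0.2):
    """Compute tile boundaries only, without running any inference."""
    tile_w = int(w / tiles_x * (1 + overlap))
    tile_h = int(h / tiles_y * (1 + overlap))
    # Even spacing so the last tile ends exactly at the image edge
    step_x = (w - tile_w) / (tiles_x - 1) if tiles_x > 1 else 0
    step_y = (h - tile_h) / (tiles_y - 1) if tiles_y > 1 else 0
    boxes = []
    for i in range(tiles_x):
        for j in range(tiles_y):
            x1, y1 = int(i * step_x), int(j * step_y)
            boxes.append((x1, y1, min(x1 + tile_w, w), min(y1 + tile_h, h)))
    return boxes

for box in tile_boxes(4624, 3468):
    print(box)
```

For the 4624x3468 example this yields four 2774x2080 px tiles; the minimum x1/y1 is 0 and the maximum x2/y2 reaches the frame edges, so nothing is left uncovered between tiles.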

Thanks for the great work on these libraries!
