
Tip / Feature Request: Sliced Inference (Tiling) for High-Resolution Images #39

@Manamama-Gemini-Cloud-AI-01

Description

Hi @ankandrew!

I've been using fast-alpr on high-resolution images (e.g., 16 MP, 4624x3468) and noticed that the standard detection pass (even at the 608 px input size) can miss small or distant license plates, because they become too small after the initial resize.

To solve this, I implemented a simple Sliced Inference (Tiling) approach that "zooms in" on quadrants of the image before running the detector. This increased the number of detected plates from 3 to 8 on a test image.

I thought this might be a useful tip for other users or a potential feature request for a sliced_predict() method.
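To illustrate why the initial resize hurts small plates, here is a quick back-of-the-envelope calculation. The 4624 px width and 608 px input size come from the numbers above; the 100 px plate width is an assumed, illustrative value:

```python
# Effective on-detector size of a plate after the initial resize.
image_w = 4624           # source width from the example above
detector_input = 608     # detector input size used above
plate_w = 100            # assumed plate width in the source image (illustrative)

scale_full = detector_input / image_w
print(round(plate_w * scale_full, 1))   # ~13 px when the full frame is resized

# With 2x2 tiles and 20% overlap, each tile is ~60% of the frame wide,
# so the same plate lands on the detector roughly 1.7x larger.
tile_w = image_w / 2 * 1.2
scale_tile = detector_input / tile_w
print(round(plate_w * scale_tile, 1))   # ~21.9 px
```

A ~13 px plate is at the edge of what a 608 px detector can resolve, which matches the missed detections described above.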

Example Implementation (Handling Frozen Dataclasses):

from fast_alpr import ALPR
import cv2
import numpy as np

# Use higher quality 's' model for better results on tiles
alpr = ALPR(
    detector_model="yolo-v9-s-608-license-plate-end2end",
    ocr_model="cct-s-v2-global-model",
    detector_conf_thresh=0.1
)

def tiled_predict(image_path, tiles_x=2, tiles_y=2, overlap=0.2):
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    h, w = img.shape[:2]
    
    all_results = []
    
    # Tile dimensions include the overlap margin
    tile_w = int(w / tiles_x * (1 + overlap))
    tile_h = int(h / tiles_y * (1 + overlap))

    # Spacing between tile origins: evenly distributed so adjacent tiles
    # overlap and the last tile ends exactly at the image edge
    step_x = (w - tile_w) / (tiles_x - 1) if tiles_x > 1 else 0
    step_y = (h - tile_h) / (tiles_y - 1) if tiles_y > 1 else 0

    for i in range(tiles_x):
        for j in range(tiles_y):
            # Calculate tile boundaries
            x1 = int(i * step_x)
            y1 = int(j * step_y)
            x2 = min(x1 + tile_w, w)
            y2 = min(y1 + tile_h, h)
            
            tile_img = img[y1:y2, x1:x2]
            results = alpr.predict(tile_img)
            
            for res in results:
                # Adjust coordinates for the full image (frozen dataclasses)
                bb = res.detection.bounding_box
                new_bb = bb.__class__(x1=bb.x1+x1, y1=bb.y1+y1, x2=bb.x2+x1, y2=bb.y2+y1)
                new_det = res.detection.__class__(label=res.detection.label, 
                                                 confidence=res.detection.confidence, 
                                                 bounding_box=new_bb)
                all_results.append(res.__class__(detection=new_det, ocr=res.ocr))
    
    # Deduplication based on bounding box centers
    final_results = []
    for res in all_results:
        c1 = ((res.detection.bounding_box.x1 + res.detection.bounding_box.x2) / 2, 
              (res.detection.bounding_box.y1 + res.detection.bounding_box.y2) / 2)
        is_duplicate = False
        for existing in final_results:
            c2 = ((existing.detection.bounding_box.x1 + existing.detection.bounding_box.x2) / 2, 
                  (existing.detection.bounding_box.y1 + existing.detection.bounding_box.y2) / 2)
            dist = np.sqrt((c1[0]-c2[0])**2 + (c1[1]-c2[1])**2)
            if dist < 50:  # pixel distance threshold for deduplication
                if res.detection.confidence > existing.detection.confidence:
                    final_results.remove(existing)
                    final_results.append(res)
                is_duplicate = True
                break
        if not is_duplicate:
            final_results.append(res)
            
    return final_results
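Independent of the detector, the tile geometry is easy to sanity-check with plain arithmetic. The sketch below uses a step-based formulation (tile size derived from the grid plus overlap, origins evenly spaced) so the tiles provably cover the whole frame; it makes no fast-alpr calls:

```python
def tile_boxes(w, h, tiles_x=2, tiles_y=2, overlap=0.2):
    """Compute tile boundaries only, without running any inference."""
    tile_w = int(w / tiles_x * (1 + overlap))
    tile_h = int(h / tiles_y * (1 + overlap))
    # Even spacing so the last tile ends exactly at the image edge
    step_x = (w - tile_w) / (tiles_x - 1) if tiles_x > 1 else 0
    step_y = (h - tile_h) / (tiles_y - 1) if tiles_y > 1 else 0
    boxes = []
    for i in range(tiles_x):
        for j in range(tiles_y):
            x1, y1 = int(i * step_x), int(j * step_y)
            boxes.append((x1, y1, min(x1 + tile_w, w), min(y1 + tile_h, h)))
    return boxes

for box in tile_boxes(4624, 3468):
    print(box)
```

For the 4624x3468 example this yields four 2774x2080 px tiles; the minimum x1/y1 is 0 and the maximum x2/y2 reaches the frame edges, so nothing is left uncovered between tiles.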

Thanks for the great work on these libraries!
