This project implements a visual search system for identifying landmarks from images. It was developed as part of a computer vision assignment using a dataset adapted from the Google Landmarks Dataset.
The goal is to determine whether two images belong to the same landmark or, given a query image, retrieve the k most similar images from a reference set.
The dataset contains 100 images across 20 landmarks (e.g., Sydney Opera House, Golden Gate Bridge) with variations in scale, pose, and lighting.
Two main tasks were completed:
- Compute embeddings for query and reference images using a pretrained CNN.
- Measure similarity with a distance metric.
- Rank the k most similar images.
- Compute retrieval performance metrics.
- Choose a CNN architecture (same as Task 1).
- Train with contrastive loss or triplet loss using the PyTorch Metric Learning library.
- Evaluate retrieval results and compare against Task 1.
Install dependencies with:
pip install -r requirements.txt- Retrieval performance with pretrained CNN embeddings
- Improved results after training custom embeddings with contrastive/triplet loss