This repository contains the data preparation scripts used to build a face-searchable photo website.
These scripts are intended to be run locally, once, in sequence.
The resulting outputs (faces.db and output/) are then used by a
separate Flask web application.
Pipeline:
- Generate face embeddings database from local photos
- Generate thumbnails for fast web display
- Fetch Google Drive shareable links using rclone
- Parse rclone JSON output
- Inject Drive URLs into the face database
Requirements:
- Python 3.8+
- OpenCV
- face_recognition (dlib)
- NumPy
- SQLite
- rclone (configured with Google Drive)

Generate face embeddings database from local photos
python 01_generate_face_db.py
What it does:
- Recursively scans all local photo folders
- Detects all faces in every image
- Stores:
  - image path
  - face embedding
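
The scan-and-store flow described above can be sketched roughly as follows. This is an illustration, not the actual contents of `01_generate_face_db.py`: the extension set, the column names `image_path` and `embedding`, and all helper names are assumptions. Only the `faces` table name and the use of `face_recognition` come from this README.

```python
import sqlite3
import struct
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of photo extensions


def iter_image_paths(root):
    """Yield every image file under root, recursively."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            yield path


def open_face_db(db_path):
    """Open (or create) the database with a hypothetical two-column schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS faces ("
        "id INTEGER PRIMARY KEY, image_path TEXT NOT NULL, embedding BLOB NOT NULL)"
    )
    return conn


def store_face(conn, image_path, embedding):
    """Store one face: the image path plus its embedding packed as float64 bytes."""
    blob = struct.pack(f"<{len(embedding)}d", *embedding)
    conn.execute(
        "INSERT INTO faces (image_path, embedding) VALUES (?, ?)",
        (str(image_path), blob),
    )


def load_embedding(blob):
    """Unpack a stored embedding back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 8}d", blob))


def index_photos(root, db_path):
    """Scan root, detect faces, and store one row per detected face."""
    import face_recognition  # imported lazily; requires dlib to be installed

    conn = open_face_db(db_path)
    for path in iter_image_paths(root):
        image = face_recognition.load_image_file(str(path))
        for encoding in face_recognition.face_encodings(image):
            store_face(conn, path, encoding.tolist())
    conn.commit()
    conn.close()
```

Storing one row per face (rather than per image) means a photo with several people appears several times, which keeps a later lookup by embedding simple.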
Output:
- faces.db

Generate thumbnails for fast web display
python 02_generate_thumbnails.py
What it does:
- Creates resized thumbnails for each photo
- Preserves folder structure
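
As a rough sketch of what "preserves folder structure" could mean in practice, the helpers below mirror each photo's relative path under `output/` and compute a size that keeps the aspect ratio. The 400-pixel limit and all function names are assumptions for illustration; the real `02_generate_thumbnails.py` may choose differently.

```python
from pathlib import Path

MAX_SIDE = 400  # assumed maximum thumbnail side length, in pixels


def thumbnail_path(photo_path, photos_root, output_root="output"):
    """Mirror the photo's location relative to photos_root under output_root."""
    relative = Path(photo_path).relative_to(photos_root)
    return Path(output_root) / relative


def thumbnail_size(width, height, max_side=MAX_SIDE):
    """Scale (width, height) to fit within max_side, keeping the aspect ratio."""
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


def write_thumbnail(photo_path, photos_root, output_root="output"):
    """Resize one photo with OpenCV and save it at the mirrored path."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    image = cv2.imread(str(photo_path))
    if image is None:  # unreadable file or not an image
        return None
    height, width = image.shape[:2]
    target = thumbnail_path(photo_path, photos_root, output_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    resized = cv2.resize(
        image, thumbnail_size(width, height), interpolation=cv2.INTER_AREA
    )
    cv2.imwrite(str(target), resized)
    return target
```

For example, `thumbnail_path("photos/2023/trip/a.jpg", "photos")` gives `output/2023/trip/a.jpg`, so a thumbnail can be located from the original's relative path alone.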
Output:
- output/ directory

Fetch Google Drive shareable links using rclone
First, configure rclone:
rclone config
Then run:
bash 03_get_drive_links.sh
What it does:
- Uses rclone lsjson
- Produces a JSON listing of all Drive files and paths

Parse rclone JSON output
python 04_parse_rclone_json.py
What it does:
- Converts rclone JSON output into clean local-path → Google Drive URL mappings

Inject Drive URLs into the face database
Before running this step, add a new column to the database:
ALTER TABLE faces ADD COLUMN drive_url TEXT;
Then run:
python 05_inject_drive_links.py
What it does:
- Matches local image paths with Drive URLs
- Injects URLs into faces.db
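
The parse and inject steps might fit together along these lines. This sketch makes several assumptions that the actual scripts may not share: the listing was produced by `rclone lsjson` with recursion enabled so each entry carries `Path`, `IsDir`, and `ID`; a file URL can be built from its ID; the paths stored in `faces.db` match the listing's relative paths; and the path column is named `image_path` (a hypothetical name).

```python
import json
import sqlite3

# Assumed URL form built from a Drive file ID.
DRIVE_URL_TEMPLATE = "https://drive.google.com/file/d/{file_id}/view"


def parse_rclone_listing(json_text):
    """Map each file's relative path to a Drive URL built from its ID."""
    mapping = {}
    for entry in json.loads(json_text):
        if entry.get("IsDir") or not entry.get("ID"):
            continue  # skip folders and entries without a file ID
        mapping[entry["Path"]] = DRIVE_URL_TEMPLATE.format(file_id=entry["ID"])
    return mapping


def ensure_drive_url_column(conn):
    """Add the drive_url column only if it is not already present."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(faces)")]
    if "drive_url" not in columns:
        conn.execute("ALTER TABLE faces ADD COLUMN drive_url TEXT")


def inject_drive_urls(conn, mapping):
    """Set drive_url on every row whose image_path appears in the mapping."""
    updated = 0
    for path, url in mapping.items():
        cursor = conn.execute(
            "UPDATE faces SET drive_url = ? WHERE image_path = ?", (url, path)
        )
        updated += cursor.rowcount
    conn.commit()
    return updated
```

Checking the returned count against the number of rows in `faces` is a quick way to spot paths that failed to match.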
Final Outputs
After completing all steps, you will have:
- faces.db
  - face embeddings
  - image paths
  - Google Drive URLs
- output/
  - thumbnails for web display
These outputs are consumed by the web application, which lives in a
separate folder/repository.

This section explains how to deploy the Flask web application using the
pre-generated data (faces.db and thumbnails) on Railway.
The face indexing scripts are intended to be run locally. Only the web application is deployed to Railway.
Before deploying, ensure you have:
- faces.db generated by the scripts
- output/ directory containing thumbnails
- A working Flask web application
- Docker installed locally
- A Railway account
- A GitHub account (for GitHub Container Registry)
Your web app directory should look similar to this:
web/
├── app.py
├── wsgi.py
├── Dockerfile
├── requirements.txt
├── faces.db
├── output/
│   └── ...
└── templates/
    ├── index.html
    └── login.html
faces.db and output/ are copied from the script outputs.
The Docker image must be built locally due to heavy dependencies. Railway will only run the image, not build it.
From inside the web app directory:
docker build -t ghcr.io/<your-username>/face-recog:latest .

Login to GHCR
Create a GitHub Personal Access Token with the following scopes:
- write:packages
- read:packages
Then authenticate:
echo YOUR_GITHUB_TOKEN | docker login ghcr.io -u <your-username> --password-stdin
Push the Image
docker push ghcr.io/<your-username>/face-recog:latest
After pushing, set the package visibility to Public in GitHub → Packages → Package Settings.

3. Deploy on Railway
- Open the Railway dashboard
- Create a new project
- Choose Deploy from Docker Image
- Enter the image URL:
ghcr.io/<your-username>/face-recog:latest
Railway will pull and run the image directly.
In the Railway service settings, add the following variables:
APP_PASSWORD=your_login_password
SECRET_KEY=some_long_random_string
Both variables are required for the app to start.
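
One way an app could enforce this at startup is to read both variables and fail fast when either is missing. The sketch below is an assumption about how such a check might look, not the actual code in `app.py`; `load_settings` and `REQUIRED_VARS` are hypothetical names.

```python
import os

REQUIRED_VARS = ("APP_PASSWORD", "SECRET_KEY")


def load_settings(environ=os.environ):
    """Return the required settings, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in REQUIRED_VARS}
```

Failing at startup makes a misconfigured deployment visible in the Railway logs immediately, rather than at the first login attempt.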
After deployment completes, Railway assigns a public URL:
https://<service-name>.up.railway.app
The URL can be found under:
Service → Settings → Domains
Open the URL, log in, and upload a photo to find matching images.
Notes:
- Only one face search is processed at a time to ensure stability
- High-resolution uploads are automatically downscaled server-side
- Google Drive serves full-quality images when thumbnails are clicked
- Designed for private use, not public-scale traffic
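
The one-search-at-a-time behavior could be achieved in several ways. A minimal sketch, assuming a single-process app that serializes searches with a lock (the real application may instead reject or queue concurrent requests differently; `run_exclusive_search` is a hypothetical helper):

```python
import threading

# One process-wide lock: a second search waits until the first finishes.
_search_lock = threading.Lock()


def run_exclusive_search(search_fn, *args):
    """Run search_fn while holding the lock so searches never overlap."""
    with _search_lock:
        return search_fn(*args)
```

A lock like this only covers one process, so it assumes the server runs a single worker.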
After making changes to the web app:
docker build -t ghcr.io/<your-username>/face-recog:latest .
docker push ghcr.io/<your-username>/face-recog:latest
Then redeploy the service in Railway.