Job Seek

Note

The scrapers in this repo are not maintained. If you'd like better, faster, stronger scrapers, check out my latest project:

<job-seek> (web/code)

This is an open-source job aggregator that allows you to follow any company with a publicly available job board. Don't see the company you are interested in on the app? Just request to add it in the UI, and a coding agent will handle it in less than 5 minutes. There are paid features, but you can always deploy the app yourself or take the implementation of the scrapers and build your own UI around them. Happy job seeking!

Job Seek

Job-seek is a Streamlit app that scrapes company career pages and ATS job boards, normalizes the results, and shows them in a lightweight dashboard with a “newest jobs” overview. It uses a general extractor pipeline (anchors, JSON-LD, repeated blocks, pagination/normalization) and falls back to headless JS rendering when a page is a JS shell. A failure-resistant data model tracks attempts and health to avoid flakiness across runs.

Core features

General scraping pipeline. Generic extractors (anchor, JSON-LD, list items, repeated block patterns) + pagination & URL normalization aim to work across most job boards.
JS-rendered pages support. If a page looks like a JS shell, the scraper fetches the rendered HTML as a fallback.
Failure resistance. The data model keeps per-board scrape attempts/health and merges results conservatively to reduce bad updates when the scrapes fail.
Dashboard UI. A Streamlit interface renders job cards and a simple page to browse results — including a newest-first overview.

Custom scrapers (when general rules aren’t enough)

Most sites work with the general pipeline, but some require dedicated adapters. Currently implemented:

Lever
Meta Careers
Microsoft Careers
Proton (Greenhouse, with CH-focused location terms)
Workday
Join
Greenhouse
Ashby

These are wired via the custom adapters registry.

Running the app

# 1) Install deps (recommended: use a venv)
python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# 2) Start the UI
streamlit run app.py

The directory job_seek_seed includes pre-made pages configurations for scraping.

Using the job_seek_seed presets (optional, recommended)

The directory job_seek_seed contains pre-made page configurations. Normally, you would populate the list of pages to scrape yourself (via the UI or by adding configs under data/pages). To get started fast, use a curated preset:

job_seek_seed/mle_swe_at_switzerland focuses on the Swiss MLE/SWE job market.

Copy the preset pages into your data folder:

macOS/Linux:

mkdir -p data/pages
cp -r job_seek_seed/mle_swe_at_switzerland/pages/* data/pages/

Windows (PowerShell):

New-Item -ItemType Directory -Force -Path data\pages | Out-Null
Copy-Item -Recurse -Force job_seek_seed\mle_swe_at_switzerland\pages\* data\pages\

Then run the app. The dashboard will load these pages and show a newest jobs overview. You can extend or replace the list by adding your own page configs under data/pages or through the UI.

Contributing

All contributions are welcome. Bug fixes, docs, small tweaks—everything helps.
Scrapers need constant care. New custom scrapers and updates to existing ones (when page structures change) are especially valuable.
Seed lists matter. Please add new seeds (e.g., curated page lists) and refresh older seeds to keep coverage complete and current.

Quick guide: fork → create a feature branch → commit with a clear message → open a PR describing what changed and how to test it.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
data		data
job_seek_seed/mle_swe_at_switzerland/pages		job_seek_seed/mle_swe_at_switzerland/pages
scripts		scripts
services		services
ui		ui
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

<job-seek> (web/code)

Job Seek

Core features

Custom scrapers (when general rules aren’t enough)

Running the app

Using the job_seek_seed presets (optional, recommended)

Contributing

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

<job-seek> (web/code)

Job Seek

Core features

Custom scrapers (when general rules aren’t enough)

Running the app

Using the job_seek_seed presets (optional, recommended)

Contributing

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages