Newy is a local-first news digestion tool that pulls content from trusted sources, uses a browser task agent plus Codex to navigate article pages, generates citation-backed digests, and can deliver them to WhatsApp through Twilio.
- Trusted-source ingestion from RSS, archive pages, and newsletter archive pages
- Browser task agent for JS-heavy sites and multi-step page navigation
- Codex-guided decisions for bounded web navigation and digest generation
- Citation-backed summaries in English, Arabic, or bilingual output
- Local SQLite storage for articles, digests, deliveries, and source-run diagnostics
- Twilio WhatsApp integration for sandbox or production sending
- Admin dashboard for sources, users, schedules, and manual digest runs
- Newy reads source definitions from data/sources.seed.json.
- For RSS sources, it parses feed entries directly.
- For web-only sources, it opens pages in a browser when needed, extracts bounded navigation candidates, and asks Codex which actions to take next.
- It validates article pages before saving them to SQLite.
- It clusters and ranks recent articles, then asks Codex to generate a grounded digest.
- It stores the digest and optionally sends it to WhatsApp via Twilio.
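The candidate-extraction step for web-only sources can be illustrated with a minimal sketch. It assumes the link_prefixes, exclude_contains, and max_links keys from the source metadata in data/sources.seed.json. The function name select_candidates and the same-host rule are hypothetical and are not taken from Newy's code.

```python
from urllib.parse import urljoin, urlparse


def select_candidates(base_url, hrefs, link_prefixes, exclude_contains, max_links):
    """Return at most max_links absolute URLs that pass the source filters.

    Hypothetical helper: the real extraction logic in Newy may differ.
    """
    base_host = urlparse(base_url).netloc
    selected = []
    seen = set()
    for href in hrefs:
        absolute = urljoin(base_url, href)
        parsed = urlparse(absolute)
        if parsed.netloc != base_host:
            continue  # assumption: stay on the trusted source's own host
        if not any(parsed.path.startswith(prefix) for prefix in link_prefixes):
            continue  # keep only paths under an allowed prefix
        if any(fragment in parsed.path for fragment in exclude_contains):
            continue  # drop video, photo, and similar sections
        if absolute in seen:
            continue  # skip duplicate links on the same page
        seen.add(absolute)
        selected.append(absolute)
        if len(selected) >= max_links:
            break  # bound the number of candidates handed to the agent
    return selected


hrefs = [
    "/news/a",
    "/news/video/b",
    "https://other.example/news/c",
    "/about",
    "/news/a",
    "/news/d",
]
candidates = select_candidates(
    "https://example.com/archive", hrefs, ["/news/"], ["/video/", "/photos/"], 8
)
```

Bounding the list before any model call keeps the set of possible navigation actions small and predictable, which is the point of asking Codex to choose among candidates instead of browsing freely.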
newy/
├── newy/ # application package
│ ├── browser_fetcher.py
│ ├── cli.py
│ ├── config.py
│ ├── delivery.py
│ ├── feed_fetcher.py
│ ├── models.py
│ ├── navigation_agent.py
│ ├── page_extractors.py
│ ├── ranking.py
│ ├── services.py
│ ├── source_catalog.py
│ ├── storage.py
│ ├── summarizer.py
│ └── web.py
├── data/
│ ├── config.example.json
│ └── sources.seed.json
├── scripts/
│ └── install_browser_support.sh
├── tests/
├── pyproject.toml
└── README.md
- Python 3.11+
- codex installed and authenticated if you want Codex-based navigation/summarization
- Internet access for live ingestion
Optional:
- Playwright + Chromium for browser-rendered navigation
- Twilio WhatsApp credentials for real message delivery
cd newy
./scripts/install_browser_support.sh
source .venv/bin/activate
This script creates a local virtual environment, installs the project with browser extras, installs Chromium for Playwright, and verifies the browser runtime.
To set up manually instead:
cd newy
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[browser]"
playwright install chromium
If you do not need browser rendering, you can use:
pip install -e .
Copy the example config and create a local override:
cp data/config.example.json data/config.local.json
data/config.local.json is intentionally ignored by git.
"codex": {
"enabled": true,
"command": "codex"
}
"browser": {
"enabled": true,
"engine": "chromium",
"headless": true,
"timeout_seconds": 30,
"wait_until": "networkidle"
}
"max_navigation_actions_per_page": 6,
"max_navigation_retries_per_page": 2
For local dry-run testing:
"twilio": {
"from_number": "whatsapp:+14155238886",
"dry_run": true,
"validate_signature": false
}
For real sending, set dry_run to false and export credentials:
export TWILIO_ACCOUNT_SID="your_sid"
export TWILIO_AUTH_TOKEN="your_token"
Source definitions live in data/sources.seed.json.
Supported source types:
- rss
- archive_page
- newsletter_archive
Example archive/newsletter metadata:
{
"link_prefixes": ["/news/"],
"exclude_contains": ["/video/", "/photos/"],
"max_links": 8,
"max_navigation_steps": 2,
"max_navigation_actions": 6,
"max_navigation_retries": 2,
"use_browser": true,
"allow_heuristic_fallback": true
}
python3 -m newy --config data/config.local.json init-db
python3 -m newy --config data/config.local.json seed-demo
python3 -m newy --config data/config.local.json ingest --force
python3 -m newy --config data/config.local.json digest --user-id 1 --topic "US Iran conflict"
python3 -m newy --config data/config.local.json serve-admin
If admin_token is set, open:
http://127.0.0.1:8080/?token=YOUR_ADMIN_TOKEN
python3 -m newy --config data/config.local.json worker
Newy uses Twilio WhatsApp for delivery.
- Create a Twilio account.
- Open the WhatsApp Sandbox in Twilio Console.
- Join the sandbox from your phone by sending the displayed join code to the sandbox number.
- Export credentials:
export TWILIO_ACCOUNT_SID="your_sid"
export TWILIO_AUTH_TOKEN="your_token"
- Set in data/config.local.json:
"twilio": {
"from_number": "whatsapp:+14155238886",
"dry_run": false,
"validate_signature": false
}
- Expose your local server publicly, for example with ngrok:
ngrok http 8080
- In Twilio Sandbox settings, set the incoming webhook to:
https://YOUR_PUBLIC_URL/webhooks/twilio
- Start Newy admin + worker.
- Send a message such as:
digest Iran ceasefire
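How the webhook interprets an inbound message such as the one above is not documented here. As a rough sketch, a handler could split the message body into a command word and a topic. The function name parse_inbound_message is hypothetical and is not part of Newy's code.

```python
def parse_inbound_message(body):
    """Split an inbound message body into (command, topic).

    Hypothetical helper: returns (None, "") for an empty message.
    """
    # Collapse runs of whitespace so "digest   Iran" and "digest Iran" match.
    text = " ".join(body.split())
    if not text:
        return None, ""
    command, _, topic = text.partition(" ")
    # Commands are compared case-insensitively; the topic keeps its casing.
    return command.lower(), topic


parsed = parse_inbound_message("digest Iran ceasefire")
```

Here parsed holds the command "digest" and the topic "Iran ceasefire", which a handler could pass on to the same digest logic that the digest CLI command uses with --topic.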
For production use:
- use an approved Twilio WhatsApp sender
- set dry_run to false
- set validate_signature to true
- set public_base_url to your public HTTPS domain
- point the Twilio inbound webhook to /webhooks/twilio
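To make the validate_signature setting concrete, the following sketch shows the scheme Twilio documents for form-encoded webhooks: the full request URL is concatenated with each POST parameter name and value in sorted key order, signed with HMAC-SHA1 using the auth token, and Base64-encoded. This is a standalone illustration using only the standard library. The function names are hypothetical, and Newy's own check may instead rely on the RequestValidator class from the twilio package.

```python
import base64
import hashlib
import hmac


def compute_signature(auth_token, url, params):
    """Compute the value expected in the X-Twilio-Signature header."""
    # URL first, then every key immediately followed by its value, keys sorted.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(auth_token, url, params, header_signature):
    """Return True if the header matches the signature computed locally."""
    expected = compute_signature(auth_token, url, params)
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(expected, header_signature)


# Example values are placeholders, not real credentials or numbers.
url = "https://YOUR_PUBLIC_URL/webhooks/twilio"
params = {"Body": "digest Iran ceasefire", "From": "whatsapp:+15550001111"}
signature = compute_signature("your_token", url, params)
```

The URL that is signed must match the URL Twilio actually called, which is presumably why public_base_url has to be set to the public HTTPS domain when validation is enabled.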
Newy stores source-run diagnostics in SQLite, including:
- source status
- visited pages
- attempted article URLs
- validated article URLs
- chosen navigation actions
- warnings/errors
This makes navigation failures easier to inspect during development.
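As an illustration of how such diagnostics could be stored and queried, here is a minimal sketch using the standard sqlite3 module. The table name source_runs, its columns, and both helper functions are assumptions made for this example and do not reflect Newy's actual schema in storage.py.

```python
import json
import sqlite3


def open_diagnostics_db(path=":memory:"):
    """Create a connection with a hypothetical source_runs table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS source_runs (
            id INTEGER PRIMARY KEY,
            source_name TEXT NOT NULL,
            status TEXT NOT NULL,
            diagnostics_json TEXT NOT NULL
        )
        """
    )
    return conn


def record_run(conn, source_name, status, diagnostics):
    """Store one run; list-valued details are kept as a JSON document."""
    conn.execute(
        "INSERT INTO source_runs (source_name, status, diagnostics_json) "
        "VALUES (?, ?, ?)",
        (source_name, status, json.dumps(diagnostics)),
    )
    conn.commit()


def failed_runs(conn):
    """Return (source_name, diagnostics) for every run whose status is not 'ok'."""
    rows = conn.execute(
        "SELECT source_name, diagnostics_json FROM source_runs "
        "WHERE status != 'ok' ORDER BY id"
    ).fetchall()
    return [(name, json.loads(blob)) for name, blob in rows]


conn = open_diagnostics_db()
record_run(conn, "example-rss", "ok", {"visited_pages": [], "warnings": []})
record_run(
    conn,
    "example-archive",
    "error",
    {
        "visited_pages": ["https://example.com/archive"],
        "attempted_article_urls": ["https://example.com/news/a"],
        "validated_article_urls": [],
        "chosen_actions": ["open_link"],
        "warnings": ["article validation failed"],
    },
)
failures = failed_runs(conn)
```

Keeping the per-run details in one JSON column is a simplification for the sketch. Separate columns or tables would make it easier to filter by individual URLs or actions.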
Run tests:
python3 -m unittest discover -s tests -v
Check syntax:
python3 -m py_compile newy/*.py
This repo is suitable for pilot/internal workflows, but some production-hardening issues remain, especially:
- shared SQLite connection with threaded HTTP is only locally hardened, not ideal for high concurrency
- webhook/admin handlers still do synchronous work
- admin token uses query-string/header auth rather than a proper session flow
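For context on the first limitation, one common way to share a single SQLite connection across HTTP handler threads is to serialize every statement behind a lock. The sketch below shows that pattern. The class name LockedConnection is hypothetical, and whether Newy's current hardening works this way is an assumption.

```python
import sqlite3
import threading


class LockedConnection:
    """Serialize all access to one sqlite3 connection behind a lock."""

    def __init__(self, path):
        # check_same_thread=False lets other threads use the connection;
        # the lock below then guarantees only one thread does so at a time.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def execute(self, sql, params=()):
        """Run one statement, commit, and return all fetched rows."""
        with self._lock:
            cursor = self._conn.execute(sql, params)
            rows = cursor.fetchall()
            self._conn.commit()
            return rows


db = LockedConnection(":memory:")
db.execute("CREATE TABLE hits (worker INTEGER, n INTEGER)")
```

This is safe but fully serialized: a slow query blocks every other request, which is consistent with the note that the setup is not ideal for high concurrency. Per-thread connections or a dedicated writer thread are typical next steps.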
Before publishing to GitHub, review:
- data/sources.seed.json for the source set you want public
- data/config.example.json to ensure no private values are present
- Twilio and Codex usage notes to match your intended public documentation