LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows
Updated Jun 27, 2026 - Python
Universal page-to-data extractor: turns URLs or HTML into clean Markdown, JSON, or text for RAG and AI workflows. Rust API + Tauri desktop app.
Turn Chaos Into Structure. A Type-Safe AI Agent that extracts valid JSON from unstructured data using PydanticAI, FastHTML, and Gemini 2.5.
Universal prompt library for structured outputs & ready-to-use content. Teachers get lesson plans, developers get reliable JSON/CSV. Works across GPT-4, Claude, Gemini.
n8n workflow templates extractor
Fine-tune Qwen3-0.6B for resume parsing using LoRA
Structured JSON extraction from LLMs with validation, repair, and streaming.
AI-powered structured web scraper that visually builds JSON schemas and uses Gemini 2.5 Flash & Playwright to extract clean, validated JSON with >80% DOM noise reduction.
Fine-tuned Qwen2.5-7B on Fireworks AI for structured JSON extraction from job postings. LoRA SFT + DPO | FastAPI | +47% over baseline.
Convert unstructured text to validated JSON using AI. Dynamic JSON Schema per request. Powered by Google Gemini. Available on RapidAPI.
Production-style fine-tuning project for schema-constrained JSON extraction using QLoRA + DPO, with reproducible evals, training curves, and vLLM benchmarks.
Turn handwritten forms, notes, and scanned paperwork into automation-ready JSON
A JSON analysis tool
Professional-grade AI logistics pipeline built with Java 17 and Spring Boot 3. Converts unstructured documents into validated JSON via non-blocking WebClient adapters (Groq/Llama 3.1). Features a real-time dashboard, Slack/email notifications, and secure error handling.
Production QLoRA SFT+DPO pipeline & FastAPI inference server for schema-validated JSON extraction — runs on 4 GB VRAM
Document ingestion and chunking agent that extracts and validates typed JSON against a strict schema.
Extract valid JSON data from arbitrary text - a focused Go library and CLI tool
Udemy course data scraper
Fine-tuning Qwen2-VL on Apple Silicon (MLX) for structured JSON document extraction.