This project implements a full digital thread intelligence system combining:
- Neo4j knowledge graph (assemblies → parts → specs → embeddings)
- Hybrid RAG retrieval engine
- Graph-aware LLM reasoning
- Compatibility scoring model for mechanical components
- FastAPI backend
- React frontend
- YAML ingestion + document storage
It supports natural-language engineering queries, part replacement suggestions, compatibility evaluation, and intelligent search across manufacturing assemblies.
This is essentially a mini industrial PLM + RAG + graph system.
Each product is a digital entity in Neo4j.
Stores:
- Name, SKU, description
- Embedding (for product-level semantic retrieval)
Example: Industrial Lathe Machine
Assemblies are generated automatically from part categories using an ASSEMBLY_MAP, e.g.:
- Spindle Assembly
- Z Axis Assembly
- X Axis Assembly
- Tailstock Assembly
- Mold System Assembly
- Electronics Assembly
Assemblies are shared across parts and products.
Graph structure:
(Product) —[:HAS_ASSEMBLY]→ (Assembly) —[:HAS_PART]→ (Part)
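The category-to-assembly step can be pictured as a plain lookup. The sketch below is only an illustration: the name ASSEMBLY_MAP and the assembly names come from this README, while the category keys, the fallback value, and the helpers `assembly_for` and `group_by_assembly` are assumptions.

```python
# Hypothetical ASSEMBLY_MAP: the category keys are illustrative assumptions;
# only the assembly names appear in this README.
ASSEMBLY_MAP = {
    "Spindle": "Spindle Assembly",
    "Z Axis": "Z Axis Assembly",
    "X Axis": "X Axis Assembly",
    "Tailstock": "Tailstock Assembly",
    "Mold System": "Mold System Assembly",
    "Electronics": "Electronics Assembly",
}


def assembly_for(category: str, default: str = "Unassigned Assembly") -> str:
    """Return the assembly name for a part category (case-insensitive)."""
    normalized = category.strip().lower()
    for key, assembly in ASSEMBLY_MAP.items():
        if key.lower() == normalized:
            return assembly
    return default  # assumed fallback; the real ingestor may behave differently


def group_by_assembly(parts: list) -> dict:
    """Group part_ids under the assembly derived from each part's category."""
    grouped = {}
    for part in parts:
        grouped.setdefault(assembly_for(part["category"]), []).append(part["part_id"])
    return grouped
```

Because assemblies are keyed by name, two products with parts in the same category end up pointing at the same Assembly node, which is one way to read "assemblies are shared across parts and products".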
Each part stores:
- part_id, name, category, description
- Embedding (384-D) using gte-small
- Specs (thread size, diameter, pitch, torque, etc.)
- Children for hierarchical structure (subcomponents)
Graph:
(Part) —[:HAS_CHILD]→ (Part)
(Part) —[:HAS_SPEC]→ (Spec)
Specs are stored as:
(key, value, unit)
Uniqueness is enforced via a NODE KEY constraint on these three properties.
This allows spec-level queries such as:
MATCH (p:Part)-[:HAS_SPEC]->(s:Spec {key:"pitch", value:5}) RETURN p
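From the backend, a spec lookup like this would normally be sent with parameters rather than inlined literals. The sketch below only builds the query text and the parameter dict; the helper name `spec_query`, the optional unit filter, and the returned columns are assumptions, and executing the query still requires a Neo4j driver session. The constraint statement is an assumed form in Neo4j 5 syntax, with a made-up constraint name.

```python
from typing import Any, Optional

# Assumed form of the NODE KEY constraint (Neo4j 5 syntax); the constraint
# name "spec_identity" is hypothetical.
SPEC_NODE_KEY = (
    "CREATE CONSTRAINT spec_identity IF NOT EXISTS "
    "FOR (s:Spec) REQUIRE (s.key, s.value, s.unit) IS NODE KEY"
)


def spec_query(key: str, value: Any, unit: Optional[str] = None):
    """Build a parameterized Cypher query for parts carrying a given spec."""
    props = "key: $key, value: $value"
    params = {"key": key, "value": value}
    if unit is not None:
        props += ", unit: $unit"
        params["unit"] = unit
    cypher = (
        f"MATCH (p:Part)-[:HAS_SPEC]->(s:Spec {{{props}}}) "
        "RETURN p.part_id AS part_id, p.name AS name"
    )
    return cypher, params
```

A node key requires every listed property to be present, which fits with the ingestor forcing units to be non-null.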
User uploads a YAML file:
product:
  name: Industrial Lathe Machine
  description: ...
parts:
  - part_id: SPINDLE-MT5-38HOLE
    category: Spindle
    specs:
      - key: "thread"
        value: "M45"
The ingestor:
- Creates/updates product
- Creates assemblies
- Creates part nodes
- Embeds each part's name + description
- Stores specs (forcing non-null units)
- Builds parent-child relationships recursively
This yields a consistent, hierarchical digital twin.
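The recursive part of the ingestor can be sketched as a flattening pass over the parsed YAML. This is an illustration rather than the project's ingestor: it assumes the file has already been parsed into dicts, that subcomponents sit under a `children` key, and that a missing unit is stored as an empty string. Embedding and the Neo4j writes are left out.

```python
def flatten_parts(parts, parent_id=None, rows=None):
    """Walk nested part dicts and collect part, spec, and child-edge rows."""
    if rows is None:
        rows = {"parts": [], "specs": [], "child_edges": []}
    for part in parts:
        pid = part["part_id"]
        name = part.get("name", pid)
        rows["parts"].append({
            "part_id": pid,
            "name": name,
            "category": part.get("category", ""),
            # Text that would be passed to the embedding model (name + description).
            "embed_text": (name + " " + part.get("description", "")).strip(),
        })
        for spec in part.get("specs", []):
            rows["specs"].append({
                "part_id": pid,
                "key": spec["key"],
                "value": spec["value"],
                # Force a non-null unit so (key, value, unit) is always complete.
                "unit": spec.get("unit") or "",
            })
        if parent_id is not None:
            rows["child_edges"].append((parent_id, pid))  # (Part)-[:HAS_CHILD]->(Part)
        # Recurse into subcomponents; "children" is an assumed key name.
        flatten_parts(part.get("children", []), parent_id=pid, rows=rows)
    return rows
```

Collecting flat rows first keeps the graph writes simple: each list can be sent to Neo4j in one batched statement instead of one round trip per node.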
Retrieval is one of the more involved parts of the system.
The user query is first embedded with the same 384-D model used for parts.
Two searches run in parallel:
CALL db.index.vector.queryNodes(
'part_embedding_index',
$k,
$embedding
)
Retrieves semantically relevant parts even if keywords are missing.
CALL db.index.fulltext.queryNodes(
'part_fulltext_idx',
$query
)
Retrieves keyword matches with fuzzy ranking.
We enforce digital-thread scoping, showing only parts that belong to:
- The selected product
- Or the selected assembly
We combine vector + keyword results, keeping only the highest score per part_id.
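The merge step can be sketched as a dictionary keyed by part_id. This is a minimal illustration under one assumption worth flagging: it treats vector and full-text scores as directly comparable, whereas the two indexes do not necessarily score on the same scale, so a real implementation may normalize first. The function name `merge_hits` and the `source` field are hypothetical.

```python
def merge_hits(vector_hits, keyword_hits):
    """Merge two hit lists, keeping the highest-scoring hit per part_id."""
    best = {}
    for source, hits in (("vector", vector_hits), ("keyword", keyword_hits)):
        for hit in hits:
            current = best.get(hit["part_id"])
            if current is None or hit["score"] > current["score"]:
                # Record which search produced the winning score.
                best[hit["part_id"]] = {**hit, "source": source}
    # Highest score first.
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)
```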
We fetch:
- Specs
- Product associations
- Assembly placement
If retrieved parts are known compatible, we display:
A ↔ B: score 0.73 — pitch matches; same assembly; compatible torque range
The LLM receives:
- Top retrieved parts
- Graph structure
- Specs
- Compatibility edges
It generates a structured, human-like engineering answer.
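One way to hand this material to the LLM is to flatten it into a plain-text context block. The sketch below is an assumption about the prompt layout, not the project's actual prompt: the function name `build_context`, the dict field names, and the line format are all illustrative.

```python
def build_context(question, parts, compat_edges):
    """Render retrieved parts and compatibility edges as prompt text."""
    lines = ["Question: " + question, "", "Retrieved parts:"]
    for part in parts:
        specs = ", ".join(
            "{}={}{}".format(s["key"], s["value"], s.get("unit", ""))
            for s in part.get("specs", [])
        )
        lines.append("- {} ({}): {}".format(
            part["part_id"],
            part.get("assembly", "unknown assembly"),
            specs or "no specs",
        ))
    if compat_edges:
        lines.append("")
        lines.append("Known compatibility:")
        for edge in compat_edges:
            reasons = "; ".join(edge.get("explanations", []))
            lines.append("- {} <-> {}: score {:.2f} ({})".format(
                edge["a"], edge["b"], edge["score"], reasons
            ))
    return "\n".join(lines)
```

Keeping the context structured and compact makes it easier for the model to cite specific parts and specs in its answer.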
The compatibility model compares two parts (existing vs existing, or new vs existing) along four dimensions (mechanical, functional, semantic, and hierarchy), each computed independently.
Mechanical checks:
- diameters
- lengths
- pitches
- threads
- torque ratings
- fits and tolerances
Scoring formula:
mechanical_score = weighted match of overlapping mechanical specs
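A "weighted match of overlapping specs" could be computed as the weighted fraction of shared spec keys whose values agree. The sketch below is one possible reading, not the project's model: the per-key weights, the 5% numeric tolerance, and the zero score for parts with no shared keys are all assumptions.

```python
# Hypothetical per-key weights; keys not listed default to 1.0.
SPEC_WEIGHTS = {"thread": 3.0, "pitch": 2.0, "diameter": 2.0, "length": 1.0, "torque": 1.0}


def spec_match(a, b, rel_tol=0.05):
    """Return 1.0 for a match, else 0.0; numbers match within rel_tol."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        scale = max(abs(a), abs(b))
        return 1.0 if scale == 0 or abs(a - b) / scale <= rel_tol else 0.0
    return 1.0 if str(a).strip().lower() == str(b).strip().lower() else 0.0


def mechanical_score(specs_a, specs_b):
    """Weighted fraction of overlapping spec keys whose values match."""
    shared = set(specs_a) & set(specs_b)
    if not shared:
        return 0.0  # assumed convention when no specs overlap
    total = sum(SPEC_WEIGHTS.get(k, 1.0) for k in shared)
    matched = sum(
        SPEC_WEIGHTS.get(k, 1.0) * spec_match(specs_a[k], specs_b[k]) for k in shared
    )
    return matched / total
```

Only keys present on both parts count, so a part with sparse specs is not penalized for what it does not declare; the semantic dimension covers that gap.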
Functional checks:
- part category
- assembly role
- operational purpose
- motion profile
- intended load path
For example, two ballscrews can be functionally similar even if their dimensions differ.
Semantic score: based on the embedding distance between part descriptions.
This is especially useful when specs are missing.
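For embedding vectors, a common choice of distance is cosine similarity. The sketch below assumes that choice and rescales the result to the 0 to 1 range so it can sit alongside the other dimension scores; both the metric and the rescaling are assumptions, since the README does not state which distance is used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0  # zero vector: treat as unrelated


def semantic_score(u, v):
    """Map cosine similarity from [-1, 1] to [0, 1] (assumed normalization)."""
    return (cosine_similarity(u, v) + 1.0) / 2.0
```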
Hierarchy checks:
- Are both parts in the same assembly?
- Same subassembly?
- Do they share a parent/child?
Example:
Spindle Nut ↔ Spindle Shaft → HIGH
Ballscrew ↔ Tailstock → LOW
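These checks can be expressed as a tiered score. The sketch below is illustrative only: the tier values (1.0, 0.8, 0.6, 0.0) and the dict fields `parent_id` and `assembly` are assumptions, chosen so that a direct parent/child pair ranks above siblings, which rank above parts that merely share an assembly.

```python
def hierarchy_score(a, b):
    """Score the structural closeness of two parts in [0, 1]."""
    if a.get("parent_id") == b["part_id"] or b.get("parent_id") == a["part_id"]:
        return 1.0  # direct parent/child
    if a.get("parent_id") is not None and a.get("parent_id") == b.get("parent_id"):
        return 0.8  # siblings under the same parent (same subassembly)
    if a.get("assembly") is not None and a.get("assembly") == b.get("assembly"):
        return 0.6  # same assembly
    return 0.0  # unrelated in the hierarchy
```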
final_score = (0.35 * mechanical)
            + (0.25 * functional)
            + (0.25 * semantic)
            + (0.15 * hierarchy)
We also store:
explanations: ["same pitch", "same spindle assembly", ...]
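Combining the four dimension scores is a weighted sum whose weights add up to 1.0, so the result stays in the same range as its inputs. The sketch below uses the weights given above; the function names and the shape of the stored record are assumptions.

```python
# Weights as stated in this README; they sum to 1.0.
WEIGHTS = {"mechanical": 0.35, "functional": 0.25, "semantic": 0.25, "hierarchy": 0.15}


def final_score(components):
    """Weighted sum of the four dimension scores."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise KeyError("missing component scores: {}".format(sorted(missing)))
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def compatibility_record(a_id, b_id, components, explanations):
    """Bundle the score with its inputs and reasons (hypothetical record shape)."""
    return {
        "a": a_id,
        "b": b_id,
        "score": round(final_score(components), 2),
        "components": dict(components),
        "explanations": list(explanations),
    }
```

Storing the per-dimension components next to the final score lets the UI and the LLM explain why a pair scored the way it did.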
When checking a new, never-before-seen part:
- The LLM extracts structured specs from the text
- The description is embedded for semantic comparison
- Candidates are filtered by product assemblies
- A compatibility score is computed against all known parts
- Ranked results are returned with explanations
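The filter, score, and rank steps above can be sketched as a single loop. This is an illustration under stated assumptions: spec extraction and embedding are assumed to have happened already, `score_fn` stands in for the real compatibility model, and `shared_spec_scorer` is a toy scorer included only so the example runs.

```python
def rank_candidates(new_part, known_parts, score_fn, allowed_assemblies=None, top_k=5):
    """Filter known parts by assembly, score each one, and return the top_k."""
    results = []
    for candidate in known_parts:
        if allowed_assemblies is not None and candidate.get("assembly") not in allowed_assemblies:
            continue  # outside the selected product's assemblies
        score, explanations = score_fn(new_part, candidate)
        results.append({
            "part_id": candidate["part_id"],
            "score": score,
            "explanations": explanations,
        })
    results.sort(key=lambda r: r["score"], reverse=True)
    return results[:top_k]


def shared_spec_scorer(new_part, candidate):
    """Toy stand-in scorer: fraction of the new part's specs the candidate matches."""
    specs = new_part.get("specs", {})
    if not specs:
        return 0.0, []
    same = [k for k, v in specs.items() if candidate.get("specs", {}).get(k) == v]
    return len(same) / len(specs), ["same " + k for k in same]
```

Passing the scorer in as a function keeps the ranking loop independent of how the four dimension scores are computed.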
This is a digital twin–aware engineering recommender system.
To our knowledge, mainstream PLM tools do not offer this out of the box.
API routes include:
- /api/query → RAG reasoning
- /api/compat/product/{name} → existing compatibility
- /api/compat/new-part → new part scoring
- /api/upload/doc → store BOMs/PDFs/images
- /api/upload/yaml → ingest new products
- /api/stt → Groq Whisper speech-to-text
Simple, clean, industry-ready.
Key features:
- Dark text + white background “ERP style” UI
- RAG Query Mode
- Existing Compatibility Mode
- New Part Compatibility Mode
- Upload Documents
- YAML ingestion
- Download Markdown report
Manufacturing data is usually siloed in:
- PLM
- MES
- ERP
- Excel BOMs
- Vendor PDFs
This system unifies them into a searchable digital thread.
Engineers frequently ask:
- “What part should I replace this with?”
- “Are these two components interchangeable?”
- “What does this assembly contain?”
General-purpose search tools rarely answer these without manual lookup.
This system is designed to.
Most RAG systems are text-only. This system uses:
✔ Specs ✔ Assemblies ✔ Vector embeddings ✔ Hierarchical similarity ✔ Multi-factor compatibility
This is intended to yield more accurate engineering answers than text-only retrieval.
To our knowledge, mainstream manufacturing platforms (PTC Windchill, Siemens Teamcenter, Dassault 3DEXPERIENCE) do not currently offer the following out of the box:
- ML-driven compatibility
- Assembly-aware matching
- LLM-based explanation of engineering alternatives
This system does.
Adding a new machine or product only requires uploading a YAML file.
The same ingestion path can therefore extend across:
- entire factories
- robotics systems
- CNC fleets
- automotive component trees