AItom consists of four core layers:
- LLM-Based Structured Extraction
- Ontology-Aligned Knowledge Graph Construction
- Graph Retrieval-Augmented Generation (Graph RAG)
- Transformer-MLP Architecture for Safety Check
The system enables an end-to-end pipeline:
Raw Literature (PMID: 35614129)
↓
Ontology Design
↓
LLM Extraction
↓
Ontology Mapping
↓
Graph Database
↓
Graph Retrieval
↓
LLM Generation (Graph RAG) + Safety Check (Transformer-MLP)
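The last two stages of the pipeline can be sketched as follows: triples retrieved from the graph are serialized into a prompt, and the generated answer is returned only if the safety check passes. This is a minimal illustration under stated assumptions, not AItom's actual code. The names `build_prompt` and `answer_with_safety_gate`, the prompt wording, and the `generate` and `is_safe` callables (stand-ins for the LLM and the Transformer-MLP classifier) are all hypothetical.

```python
from typing import Callable, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def build_prompt(question: str, triples: Iterable[Triple]) -> str:
    """Serialize retrieved graph triples into a grounding context for the LLM."""
    facts = "\n".join(f"- {s} {p} {o}" for s, p, o in triples)
    return f"Use only these graph facts:\n{facts}\n\nQuestion: {question}"


def answer_with_safety_gate(
    question: str,
    triples: Iterable[Triple],
    generate: Callable[[str], str],   # hypothetical LLM call
    is_safe: Callable[[str], bool],   # hypothetical safety classifier
) -> str:
    """Generate an answer from graph context and withhold it if flagged unsafe."""
    draft = generate(build_prompt(question, triples))
    return draft if is_safe(draft) else "Withheld: flagged as unsafe."
```

Passing the LLM and the classifier in as callables keeps the sketch independent of any particular model or client library.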
- "Dataset of solution-based inorganic materials synthesis procedures extracted from the scientific literature"
- PMID: 35614129
- Protégé software
Node
ChemicalEntity
InorganicMaterial
Precursor
Solvent
Media
Abrasive
Product
Additive
Process
SynthesisMethod
SynthesisStep
ConditionSet
Condition
Edge
usesPrecursor (SynthesisStep → Precursor)
usesSolvent (SynthesisStep → Solvent)
producesProduct (SynthesisStep → Product)
usesAdditive (SynthesisStep → Additive)
usesMedia (SynthesisStep → Media)
usesAbrasive (SynthesisStep → Abrasive)
hasSynthesisMethod (InorganicMaterial → SynthesisMethod)
performedUnder (SynthesisStep → Condition)
nextStep (SynthesisStep → SynthesisStep)
consistOfStep (SynthesisMethod → SynthesisStep)
hasName (ChemicalEntity → xsd:string)
hasAcronym (InorganicMaterial → xsd:string)
hasPhase (InorganicMaterial → xsd:string)
isOxygenDeficiency (InorganicMaterial → xsd:float)
hasReaction (InorganicMaterial → xsd:string)
hasID (SynthesisMethod → xsd:integer)
hasTemperature (Condition → xsd:string)
hasTime (Condition → xsd:string)
haspH (Condition → xsd:string)
hasPressure (Condition → xsd:string)
hasAction (SynthesisStep → xsd:string)
hasNote (SynthesisStep → xsd:string)
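The domain and range of each edge can be used to validate extracted triples during ontology mapping, and the nextStep edges can be followed to recover the order of a procedure. The sketch below is one possible way to do this. The helper names `is_valid_edge` and `ordered_steps` and the example node IDs are hypothetical, and the type check is an exact match that does not infer subclass relations between node types.

```python
# Object-property schema taken from the edge list: name -> (domain, range).
OBJECT_PROPERTIES = {
    "usesPrecursor": ("SynthesisStep", "Precursor"),
    "usesSolvent": ("SynthesisStep", "Solvent"),
    "producesProduct": ("SynthesisStep", "Product"),
    "usesAdditive": ("SynthesisStep", "Additive"),
    "usesMedia": ("SynthesisStep", "Media"),
    "usesAbrasive": ("SynthesisStep", "Abrasive"),
    "hasSynthesisMethod": ("InorganicMaterial", "SynthesisMethod"),
    "performedUnder": ("SynthesisStep", "Condition"),
    "nextStep": ("SynthesisStep", "SynthesisStep"),
    "consistOfStep": ("SynthesisMethod", "SynthesisStep"),
}


def is_valid_edge(node_types, subject, predicate, obj):
    """Return True if the edge respects the domain and range of its predicate."""
    if predicate not in OBJECT_PROPERTIES:
        return False
    domain, range_ = OBJECT_PROPERTIES[predicate]
    return node_types.get(subject) == domain and node_types.get(obj) == range_


def ordered_steps(first_step, edges):
    """Follow nextStep edges from first_step and return the steps in order."""
    successor = {s: o for s, p, o in edges if p == "nextStep"}
    order, seen = [], set()
    step = first_step
    while step is not None and step not in seen:  # 'seen' guards against cycles
        order.append(step)
        seen.add(step)
        step = successor.get(step)
    return order


# Hypothetical example graph: one method with two steps and one precursor.
node_types = {
    "m1": "SynthesisMethod",
    "s1": "SynthesisStep",
    "s2": "SynthesisStep",
    "p1": "Precursor",
}
edges = [
    ("m1", "consistOfStep", "s1"),
    ("s1", "nextStep", "s2"),
    ("s1", "usesPrecursor", "p1"),
]
```

Rejecting triples that violate the schema before they reach the graph database keeps extraction errors from the LLM out of later retrieval results.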
- Transformer + MLP Architecture
- Transformer: CrabNet
Select the top 12 properties (using LightGBM)
↓
Load 12 CrabNet checkpoints
↓
Concatenate the 12 embedding vectors into a single embedding vector
↓
MLP Design
↓
Safe / Unsafe Prediction
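The concatenation and classification steps can be illustrated with a plain-Python sketch. It does not load CrabNet or LightGBM: random vectors stand in for the 12 per-property embeddings, and the weights are untrained. The embedding size, hidden width, single hidden layer, sigmoid output, and 0.5 threshold are all assumptions made for the demo, not details taken from AItom.

```python
import math
import random

N_MODELS = 12  # one embedding per CrabNet checkpoint
EMB_DIM = 8    # hypothetical embedding size, chosen only for the demo
HIDDEN = 16    # hypothetical hidden-layer width


def concat_embeddings(embeddings):
    """Join the per-property embedding vectors into one feature vector."""
    return [value for emb in embeddings for value in emb]


def mlp_predict(x, w1, b1, w2, b2):
    """One-hidden-layer MLP: ReLU hidden units, sigmoid output as P(unsafe)."""
    hidden = [
        max(0.0, sum(xi * wi for xi, wi in zip(x, weights)) + bias)
        for weights, bias in zip(w1, b1)
    ]
    logit = sum(h * w for h, w in zip(hidden, w2)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
# Random stand-ins for the 12 CrabNet embeddings of one material.
embeddings = [[rng.gauss(0, 1) for _ in range(EMB_DIM)] for _ in range(N_MODELS)]
x = concat_embeddings(embeddings)  # length 12 * EMB_DIM

# Untrained random weights; a real model would learn these from labeled data.
w1 = [[rng.gauss(0, 0.1) for _ in range(len(x))] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [rng.gauss(0, 0.1) for _ in range(HIDDEN)]
b2 = 0.0

p_unsafe = mlp_predict(x, w1, b1, w2, b2)
label = "Unsafe" if p_unsafe >= 0.5 else "Safe"
```

Concatenation keeps each property's embedding in its own slice of the input, so the MLP can weight the 12 property views independently.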

