Firebase Functions backend for LLMonFHIR with a RAG-enabled OpenAI-compatible /chat endpoint and a lightweight web client for comparison testing.
> **Note:** `functions/.secret.local` must contain a valid `OPENAI_API_KEY`.
```sh
cd functions
npm install
cd ..
sh run-emulator.sh
```

This installs backend dependencies, builds the functions, and starts the Firebase emulators. Ensure the `OPENAI_API_KEY` secret is configured in your Firebase project before deploying.
```sh
cd web
npm install
npm run dev
```

The web UI compares responses with RAG enabled and disabled using mock FHIR tool results.
Realtime chat path for LLMonFHIR / web client
→ OpenAI-compatible Firebase Function `/chat`
(drop-in replacement for OpenAI `/v1/chat/completions`)
(client keeps the same request body; only the URL changes)
→ Retrieve top-k chunks (Genkit retriever)
→ Vector store (dev-local)
→ Augment system prompt
← Streamed response (+ optional RAG metadata)
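The "augment system prompt" step above can be sketched as a pure function: retrieved chunks are prepended to the system message while the rest of the OpenAI-style request is left untouched. This is an illustrative sketch; the names (`ChatMessage`, `augmentSystemPrompt`) and the exact prompt wording are assumptions, not the project's actual API.

```typescript
// Illustrative sketch of the prompt-augmentation step (names are hypothetical).
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function augmentSystemPrompt(
  messages: ChatMessage[],
  contextChunks: string[]
): ChatMessage[] {
  // No retrieved context: pass the request through unchanged.
  if (contextChunks.length === 0) return messages;

  const context = contextChunks
    .map((chunk, i) => `[Document ${i + 1}]\n${chunk}`)
    .join("\n\n");
  const prefix = `Use the following retrieved context when relevant:\n${context}\n\n`;

  // Prepend context to an existing system message, or synthesize one.
  const hasSystem = messages[0]?.role === "system";
  const system: ChatMessage = hasSystem
    ? { role: "system", content: prefix + messages[0].content }
    : { role: "system", content: prefix };
  return [system, ...(hasSystem ? messages.slice(1) : messages)];
}
```

Because only the system message changes, the client's request body remains a valid OpenAI `/v1/chat/completions` payload.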
Document ingestion path for Firebase Storage `rag_files/*.pdf`
→ `onPDFUploaded` trigger
→ PDF text extraction
→ Chunk + embed
→ Index into vector store
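The "chunk + embed" step above can be illustrated with a simple sliding-window chunker with overlap. The real logic lives in `functions/src/rag/chunker.ts`; the window size and overlap below are assumed values for the sketch, not the project's actual parameters.

```typescript
// Illustrative sliding-window chunker with overlap (parameters are assumptions).
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const chunks: string[] = [];
  // Advance by (chunkSize - overlap) so consecutive chunks share `overlap` chars,
  // which helps retrieval when a relevant passage straddles a chunk boundary.
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Each resulting chunk is then embedded and written to the vector store by the indexer.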
- File: `functions/src/functionImplementations/openai-proxy.ts`
- Firebase Function name: `chat`
- Endpoint: `https://<region>-<project>.cloudfunctions.net/chat`
- Drop-in replacement for OpenAI `/v1/chat/completions`: keep the same request body and only change the URL
- Supports streaming and non-streaming OpenAI chat requests
- Injects retrieved RAG context into the system prompt
- Toggle RAG off for debugging: `?ragEnabled=false`
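A minimal client-side sketch of the drop-in swap: the request body stays plain OpenAI `/v1/chat/completions` JSON, and only the URL changes (plus the optional `ragEnabled` query toggle). The helper name below is illustrative, not part of the project.

```typescript
// Hypothetical helper: build the Function URL, optionally disabling RAG.
function buildChatEndpoint(baseUrl: string, ragEnabled = true): string {
  const url = new URL(`${baseUrl}/chat`);
  if (!ragEnabled) url.searchParams.set("ragEnabled", "false");
  return url.toString();
}

// The body is identical to what would be sent to api.openai.com.
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
  stream?: boolean;
}

const body: ChatRequest = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize the latest lab results." }],
  stream: true,
};
```

Existing OpenAI SDK clients can typically point at the Function by overriding their base URL, leaving request construction untouched.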
- File: `functions/src/functionImplementations/storage-trigger.ts`
- Trigger: new PDF uploaded under `rag_files/` in Storage
- Extracts text, cleans and chunks it, and embeds content into the vector store
- Chunking: `functions/src/rag/chunker.ts`
- Indexing: `functions/src/rag/indexer.ts`
- Retrieval: `functions/src/rag/retriever.ts` (top 5 chunks)
- Genkit config: `functions/src/utils/genkit.ts`
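The project delegates retrieval to a Genkit retriever, but the underlying idea (top-k nearest chunks by embedding similarity) can be sketched with plain cosine similarity. The types and function names below are illustrative, not the retriever's actual interface.

```typescript
// Illustrative top-k retrieval by cosine similarity (names are hypothetical).
interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Guard against zero vectors to avoid dividing by zero.
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function retrieveTopK(
  query: number[],
  chunks: EmbeddedChunk[],
  k = 5 // matches the document's "top 5 chunks"
): EmbeddedChunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```

In production the same ranking would be pushed down into a persistent vector store rather than computed in process.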
```sh
cd functions
npm install
npm run build
```

Start the emulators:

```sh
sh run-emulator.sh
```

Deploy to Firebase:

```sh
firebase deploy --only functions
```

Set the OpenAI API key secret:

```sh
firebase functions:secrets:set OPENAI_API_KEY
```

Example request:

```sh
curl -X POST "https://<firebase-functions-url>/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "Summarize the latest lab results." }
    ],
    "stream": true
  }'
```

When RAG is enabled and context is found, the stream includes a metadata event before the usual OpenAI deltas:
```json
{
  "type": "rag_context",
  "context": "[Document: ...]",
  "contextLength": 1234,
  "enabled": true
}
```

The non-streaming response includes `_ragContext`:
```json
{
  "id": "...",
  "choices": [ ... ],
  "_ragContext": {
    "context": "[Document: ...]",
    "contextLength": 1234,
    "enabled": true
  }
}
```

The `/web` directory contains a minimal React app that compares responses with RAG enabled and disabled. It routes OpenAI SDK calls to the Firebase Functions `/chat` endpoint and uses mock FHIR tool outputs to simulate data retrieval.
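A comparison client like this has to separate the extra `rag_context` event from the ordinary OpenAI deltas before rendering. Assuming the stream uses standard `data:`-framed server-sent events (as OpenAI-compatible streaming endpoints typically do), a minimal sketch:

```typescript
// Hypothetical client-side splitter: pull the rag_context metadata event out
// of an SSE line stream and keep only OpenAI deltas for the UI.
// Assumes "data: "-prefixed SSE framing terminated by "data: [DONE]".
interface ParsedStream {
  ragContext?: { context: string; contextLength: number; enabled: boolean };
  deltas: unknown[];
}

function splitStreamEvents(sseLines: string[]): ParsedStream {
  const result: ParsedStream = { deltas: [] };
  for (const line of sseLines) {
    if (!line.startsWith("data: ") || line === "data: [DONE]") continue;
    const payload = JSON.parse(line.slice("data: ".length));
    if (payload.type === "rag_context") {
      result.ragContext = payload; // metadata event, shown separately in the UI
    } else {
      result.deltas.push(payload); // ordinary OpenAI chat.completion.chunk
    }
  }
  return result;
}
```

The RAG-off column of the comparison UI would simply never receive a `rag_context` event, so `ragContext` stays undefined there.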
- The RAG vector store uses `@genkit-ai/dev-local-vectorstore` for development only.
- For production, replace it with a persistent vector store such as Firestore Vector Search: https://genkit.dev/docs/integrations/cloud-firestore/