# ══════════════════════════════════════════════════════════════════════════════
# LLM
# ══════════════════════════════════════════════════════════════════════════════
# ── General ──────────────────────────────────────────────────────────────────
# Supports any OpenAI-compatible API (OpenAI, DeepSeek, Qwen, Ollama, etc.)
LLM_API_KEY=sk-your-api-key-here
LLM_BASE_URL=https://api.openai.com/v1
# Main model — used for planning, analysis, and ReAct agent
LLM_MODEL=gpt-4o
# Fast model — used for DAG step execution (cheaper, faster)
# Falls back to LLM_MODEL if not set
FAST_LLM_MODEL=gpt-4o-mini
# Default temperature for LLM calls
LLM_TEMPERATURE=0.7
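For a fully local setup, the General tier can point at any OpenAI-compatible server. A minimal sketch for Ollama — the model tags below are placeholders, substitute models you have actually pulled:

```
LLM_API_KEY=ollama                      # dummy value — Ollama ignores it
LLM_BASE_URL=http://localhost:11434/v1
LLM_MODEL=qwen2.5:14b                   # placeholder tag
FAST_LLM_MODEL=qwen2.5:7b               # placeholder tag
```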
# ── Fast Model (overrides — falls back to General if not set) ────────────────
# Use these to point the fast model at a different provider/key than General.
# FAST_LLM_API_KEY= # Falls back to LLM_API_KEY
# FAST_LLM_BASE_URL= # Falls back to LLM_BASE_URL
# FAST_LLM_TEMPERATURE= # Falls back to LLM_TEMPERATURE
# ── Reasoning Model (overrides — falls back to General if not set) ───────────
# Use these when the reasoning model is from a different provider or needs
# separate credentials. Reasoning effort/budget ARE enabled for this tier.
# REASONING_LLM_MODEL= # Falls back to LLM_MODEL
# REASONING_LLM_API_KEY= # Falls back to LLM_API_KEY
# REASONING_LLM_BASE_URL= # Falls back to LLM_BASE_URL
# REASONING_LLM_TEMPERATURE= # Falls back to LLM_TEMPERATURE
# REASONING_LLM_EFFORT= # Falls back to LLM_REASONING_EFFORT
# REASONING_LLM_BUDGET= # Falls back to LLM_REASONING_BUDGET_TOKENS
# ── Extended Thinking ────────────────────────────────────────────────────────
# Extended thinking / reasoning for supported models (OpenAI o-series, Gemini 2.5+, Claude)
# Values: low, medium, high (leave empty or unset to disable)
# LiteLLM translates this to each provider's native format automatically:
# OpenAI / Gemini → reasoning_effort parameter
# Anthropic → thinking parameter (via native API routing)
# When enabled, the model's reasoning_content is surfaced in the "thinking" step UI
# LLM_REASONING_EFFORT=
# Optional: explicit token budget for reasoning (primarily Anthropic; others derive from effort level)
# Anthropic minimum: 1024, must be less than LLM_MAX_OUTPUT_TOKENS
# LLM_REASONING_BUDGET_TOKENS=
# ── Context Window ───────────────────────────────────────────────────────────
# Fallback when no DB model config is set for the role.
# If a model provider is configured in Settings → Models with context_size /
# max_output_tokens, those DB values take priority over these ENV vars.
# The system auto-computes: input_budget = context_size - max_output - 4K reserve
# Default sweet spot: 128K context / 64K output → 60K input budget
# LLM_CONTEXT_SIZE=128000 # General model context cap (default: 128000)
# LLM_MAX_OUTPUT_TOKENS=64000 # General model max output tokens (default: 64000)
# FAST_LLM_CONTEXT_SIZE=128000 # Fast model (falls back to LLM_CONTEXT_SIZE)
# FAST_LLM_MAX_OUTPUT_TOKENS=64000
# REASONING_LLM_CONTEXT_SIZE= # Falls back to LLM_CONTEXT_SIZE
# REASONING_LLM_MAX_OUTPUT_TOKENS= # Falls back to LLM_MAX_OUTPUT_TOKENS
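The budget formula above can be checked with plain shell arithmetic (treating the "4K reserve" as exactly 4096 tokens is an assumption about how the reserve is counted):

```shell
# input_budget = context_size - max_output_tokens - reserve
CONTEXT_SIZE=128000
MAX_OUTPUT=64000
RESERVE=4096
echo $((CONTEXT_SIZE - MAX_OUTPUT - RESERVE))   # prints 59904, i.e. ~60K
```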
# ── Retry ──────────────────────────────────────────────────────────────────────
# Applies to ALL LLM / embedding / reranker calls (exponential backoff + jitter)
# LLM_MAX_RETRIES=3 # Max retry attempts on transient errors (429, 5xx, timeouts)
# LLM_RETRY_BASE_DELAY=1.0 # Initial backoff delay in seconds
# LLM_RETRY_MAX_DELAY=60.0 # Upper bound on backoff delay in seconds
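A sketch of the delay schedule these three knobs describe — exponential backoff capped at the max delay, with jitter added on top. The exact jitter distribution is an assumption; the real client may differ:

```shell
BASE_DELAY=1.0
MAX_DELAY=60.0
for attempt in 0 1 2; do
  # delay = min(MAX_DELAY, BASE_DELAY * 2^attempt), before jitter
  delay=$(awk -v b="$BASE_DELAY" -v m="$MAX_DELAY" -v a="$attempt" \
    'BEGIN { d = b * 2^a; if (d > m) d = m; print d }')
  echo "attempt $attempt -> base delay ${delay}s (+ random jitter)"
done
```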
# ── JSON Mode ────────────────────────────────────────────────────────────────
# Global toggle for response_format=json_object
# Set to false if the provider rejects LiteLLM's assistant prefill injection
# (e.g. AWS Bedrock via a relay: ValidationException on the 2nd+ agent iteration).
# When false, structured calls skip JSON mode and fall back to plain-text regex extraction.
# This is a system-level setting — applies to all models (ENV and Admin-configured).
# LLM_JSON_MODE_ENABLED=true
# ── Tool Choice ──────────────────────────────────────────────────────────────
# Global toggle for forced tool_choice in structured output (Level 1: Native FC).
# When false, structured_llm_call skips native function calling and falls back
# to JSON mode / plain text extraction. Does NOT affect ReAct agent tool usage.
# Useful for models that always fail forced tool_choice (e.g. Kimi K2.5 + thinking).
# Per-model override available in Settings → Models → tool_choice_enabled.
# LLM_TOOL_CHOICE_ENABLED=true
# ── MarkItDown Vision / OCR (Optional) ───────────────────────────────────────
# Used by the `convert_to_markdown` built-in tool AND the RAG ingestion
# pipeline when extracting text from files that contain embedded images
# (DOCX, PPTX, XLSX, PDF). When a vision-capable model is available, the
# markitdown-ocr plugin OCRs images automatically; otherwise the extractor
# falls back to text-only mode (identical to the pre-OCR behavior, zero
# regression).
#
# Resolution order (first match wins):
# 1. Agent primary model in DB (if supports_vision=True)
# 2. Active ModelGroup's fast model (preferred for OCR: cheap + low latency)
# 3. Active ModelGroup's general model (quality fallback)
# 4. ENV fallback: LLM_MODEL itself (assumes it supports vision)
#
# Step 4 is OPTIMISTIC — it assumes your ENV-configured LLM_MODEL supports
# vision. This is true for the common cases (gpt-4o, claude-3-5-sonnet,
# gemini-1.5-pro/flash, etc.) and means zero config for most users.
#
# SET LLM_SUPPORTS_VISION=false ONLY IF:
# - You're running FIM One in pure ENV mode (no Admin → Models configured)
# - AND your LLM_MODEL does NOT support vision (e.g. deepseek-v3, qwen-chat,
# llama-3.1, mistral-large-2, gpt-3.5-turbo, o1-mini)
#
# When set to "false", ENV-mode OCR attempts are skipped entirely so
# MarkItDown doesn't waste a round-trip on every document upload. DB-mode
# resolution is NOT affected — the admin's model group is always the source
# of truth when configured.
# LLM_SUPPORTS_VISION=true
# ══════════════════════════════════════════════════════════════════════════════
# Agent Execution
# ══════════════════════════════════════════════════════════════════════════════
# ── ReAct Agent ──────────────────────────────────────────────────────────────
# Max tool-call iterations per request (higher = more thorough but slower)
REACT_MAX_ITERATIONS=20
# REACT_MAX_TURN_TOKENS=0 # Emergency circuit-breaker: max cumulative tokens per
# # single ReAct turn (prompt + completion across ALL
# # iterations). Default 0 = unlimited.
# # ⚠️ This is NOT for daily token control — use the
# # per-user monthly token_quota for that. This is a
# # last-resort safety valve for extreme scenarios like
# # an agent stuck in an infinite tool-call loop. Hitting
# # this limit aborts the task mid-execution, wasting all
# # tokens consumed so far and returning an incomplete
# # result the user likely cannot use. Keep at 0 unless
# # you have a specific runaway-agent problem to contain.
# REACT_TOOL_SELECTION_THRESHOLD=12 # Tool count threshold to trigger smart selection
# REACT_TOOL_SELECTION_MAX=6 # Max tools to select in smart selection
# REACT_SELF_REFLECTION_INTERVAL=6 # Inject self-reflection every N tool calls
# REACT_TOOL_OBS_TRUNCATION=8000 # Max chars per tool observation in synthesis
# REACT_TOOL_RESULT_BUDGET=40000 # Aggregate token budget for ALL tool results in
# # a single session. When total tool result tokens
# # exceed this, new results are truncated. Prevents
# # context bloat from large API responses (e.g.,
# # 5 connector calls × 8K each = 40K). Default is
# # generous enough for normal use.
# REACT_COMPLETION_CHECK_SKIP_CHARS=800 # Skip the completion-check LLM call when the
# # agent's final answer exceeds this many chars.
# # Long detailed answers don't need a "did I miss
# # anything?" verification round-trip. Set lower
# # to skip more aggressively; set to 999999 to
# # always run the check.
# REACT_CYCLE_DETECTION_THRESHOLD=2 # Identical tool call count before injecting a
# # "try a different approach" warning. Deterministic
# # loop breaker — doesn't rely on the LLM noticing.
# REACT_COMPLETION_CHECK_MIN_TOOLS=3 # Minimum tool calls before the completion
# # checklist fires. Set higher to skip verification
# # on simple tasks; set to 1 for always-on.
# REACT_TURN_PROFILE_ENABLED=true # Emit per-turn phase-level timing logs
# # (memory_load, compact, tool_schema_build,
# # llm_first_token, llm_total, tool_exec). Set to
# # false to disable profiling entirely (no-op).
# ── Model-layer rate limiting ────────────────────────────────────────────────
# LLM_RATE_LIMIT_PER_USER=true # Per-user keyed rate limit buckets instead of
# # a single process-global bucket. Prevents one
# # noisy user from starving all others on the
# # same worker. Set to false to revert to the
# # legacy global bucket (not recommended).
# ── DAG Planner ──────────────────────────────────────────────────────────────
# DAG_STEP_TIMEOUT=600 # Step execution timeout in seconds
# DAG_VERIFY_TRUNCATION=2000 # Max chars sent to step verifier
# DAG_ANALYZER_TRUNCATION=10000 # Max chars per step result in analyzer
# DAG_REPLAN_RECENT_TRUNCATION=500 # Chars per step in latest replan round
# DAG_REPLAN_OLDER_TRUNCATION=200 # Chars per step in older replan rounds
# Domain classification — independent LLM layer before ReAct / DAG execution
# ESCALATION_DOMAINS=legal,medical,financial,tax,compliance,patent # Specialist domains → reasoning model upgrade + domain SOP instructions + citation verification (DAG)
# DAG_CITATION_VERIFICATION=true # Post-step citation accuracy check (requires domain classification hit)
# DAG_STRUCTURED_CONTEXT_MULTIPLIER=3.0 # Extra truncation budget for citations & tables between DAG steps (DAG only)
# Step execution
MAX_CONCURRENCY=5 # How many DAG steps can run in parallel
DAG_STEP_MAX_ITERATIONS=15 # Max tool-call iterations within each DAG step
# Re-planning — when the agent judges the goal was NOT achieved after a round
# of execution, it can autonomously re-plan and retry:
DAG_MAX_REPLAN_ROUNDS=3 # Max autonomous re-plan attempts (user interrupts are unlimited)
DAG_REPLAN_STOP_CONFIDENCE=0.8 # If the agent is ≥80% sure the goal CANNOT be achieved,
# stop retrying instead of wasting tokens on hopeless re-plans
# (0.0 = never stop early, 1.0 = stop on any failure)
# DAG_TOOL_CACHE=true # Cache identical tool calls within DAG execution
# DAG_STEP_VERIFICATION=false # LLM-based verification of step results (adds latency)
# ══════════════════════════════════════════════════════════════════════════════
# Context & Workspace
# ══════════════════════════════════════════════════════════════════════════════
# ── Context Guard ────────────────────────────────────────────────────────────
# CONTEXT_GUARD_DEFAULT_BUDGET=32000 # Default token budget for context management
# CONTEXT_GUARD_MAX_MSG_CHARS=50000 # Hard limit on single message character count
# CONTEXT_GUARD_KEEP_RECENT=4 # Messages to keep when compacting history
# ── Workspace ────────────────────────────────────────────────────────────────
# Agent Workspace — tool output offloading to prevent context pollution
# When a tool output exceeds this threshold (chars), it is saved to a workspace file
# and a truncated preview is injected into the conversation context.
# WORKSPACE_OFFLOAD_THRESHOLD=8000
# WORKSPACE_PREVIEW_CHARS=2000 # Preview chars for truncated workspace refs
# WORKSPACE_CLEANUP_MAX_HOURS=72 # Max age in hours before workspace cleanup
# ── System ───────────────────────────────────────────────────────────────────
# (SYSTEM_PROMPT_RESERVE is no longer used — ContextGuard accounts for the
# system prompt dynamically when estimating message list token counts.)
# ══════════════════════════════════════════════════════════════════════════════
# Web Tools
# ══════════════════════════════════════════════════════════════════════════════
# ── Search ───────────────────────────────────────────────────────────────────
# Provider: jina (default) | tavily | brave | exa
# Auto-detect: if WEB_SEARCH_PROVIDER is unset, the first available API key wins.
# WEB_SEARCH_PROVIDER=jina
# Jina — https://jina.ai/ (also powers fetch, embedding, reranker if no override)
JINA_API_KEY=
# Tavily — https://tavily.com/
# TAVILY_API_KEY=
# Brave — https://brave.com/search/api/
# BRAVE_API_KEY=
# Exa — https://exa.ai/
# EXA_API_KEY=
# Optional Exa knobs (all have sensible defaults):
# EXA_SEARCH_TYPE=auto # auto | neural | fast | deep-lite | deep | deep-reasoning | instant
# EXA_CATEGORY= # company | research paper | news | personal site | financial report | people
# EXA_INCLUDE_DOMAINS= # comma-separated allowlist (max 1200)
# EXA_EXCLUDE_DOMAINS= # comma-separated blocklist (max 1200)
# EXA_START_PUBLISHED_DATE= # ISO-8601, e.g. 2025-01-01
# EXA_END_PUBLISHED_DATE= # ISO-8601
# EXA_MAX_AGE_HOURS= # Only return pages indexed in the last N hours
# EXA_INCLUDE_HIGHLIGHTS=true # Pull highlight snippets from results
# EXA_TEXT_MAX_CHARS=800 # Cap on text/snippet length per result
# EXA_SUMMARY_QUERY= # Optional custom summary prompt per result
# ── Fetch ────────────────────────────────────────────────────────────────────
# Provider: jina | httpx (default: jina if JINA_API_KEY set, else httpx — plain HTTP, no key needed)
# WEB_FETCH_PROVIDER=jina
# ══════════════════════════════════════════════════════════════════════════════
# RAG & Knowledge Base
# ══════════════════════════════════════════════════════════════════════════════
# ── Embedding ────────────────────────────────────────────────────────────────
# Works with ANY OpenAI-compatible /v1/embeddings endpoint.
# If EMBEDDING_API_KEY is not set, falls back to JINA_API_KEY.
#
# Provider examples (just set API_KEY + BASE_URL + MODEL):
# Jina : (default) JINA_API_KEY is enough — no override needed
# OpenAI : EMBEDDING_API_KEY=sk-... EMBEDDING_BASE_URL=https://api.openai.com/v1 EMBEDDING_MODEL=text-embedding-3-small EMBEDDING_DIMENSION=1536
# Voyage : EMBEDDING_API_KEY=pa-... EMBEDDING_BASE_URL=https://api.voyageai.com/v1 EMBEDDING_MODEL=voyage-3 EMBEDDING_DIMENSION=1024
# Ollama : EMBEDDING_API_KEY=unused EMBEDDING_BASE_URL=http://localhost:11434/v1 EMBEDDING_MODEL=nomic-embed-text EMBEDDING_DIMENSION=768
#
# ⚠️ Changing model or dimension INVALIDATES all existing KB vectors — you must rebuild indexes.
# EMBEDDING_API_KEY=
# EMBEDDING_BASE_URL=https://api.jina.ai/v1
EMBEDDING_MODEL=jina-embeddings-v3
EMBEDDING_DIMENSION=1024
# ── Retrieval ────────────────────────────────────────────────────────────────
# "grounding" = full pipeline with citations and confidence scoring
# "simple" = basic RAG (retrieve → inject into prompt)
RETRIEVAL_MODE=simple
# ── Reranker ─────────────────────────────────────────────────────────────────
# Provider: jina (default) | cohere | openai
# Auto-detect: if RERANKER_PROVIDER is unset, uses Cohere if COHERE_API_KEY set, else Jina.
# OpenAI reranker reuses LLM_API_KEY / LLM_BASE_URL — no extra key needed.
# RERANKER_PROVIDER=jina
# Jina (default)
RERANKER_MODEL=jina-reranker-v2-base-multilingual
# Cohere
# COHERE_API_KEY=
# COHERE_RERANKER_MODEL=rerank-multilingual-v3.0
# ── Vector Store ─────────────────────────────────────────────────────────────
# LanceDB, file-based — zero external services required
# VECTOR_STORE_DIR=./data/vector_store
# ══════════════════════════════════════════════════════════════════════════════
# Code Execution
# ══════════════════════════════════════════════════════════════════════════════
# CODE_EXEC_BACKEND=local # local (default) | docker
#
# !! SECURITY WARNING — PUBLIC / MULTI-USER DEPLOYMENTS !!
# local mode runs AI-generated code (Python, Node.js, shell) directly on the
# host machine as the same OS user as the server process. Node.js and shell
# have NO module-level restrictions — a malicious or buggy script can:
# - read/write arbitrary files on the host filesystem
# - make outbound network connections
# - spawn child processes
# Python has a restricted-builtins sandbox, but it is NOT a hard security
# boundary (os.system() is still reachable via the `os` module).
#
# RULE OF THUMB:
# Local machine / trusted users only → local is fine
# Internet-facing / multi-user SaaS → set CODE_EXEC_BACKEND=docker
#
# Docker mode runs every execution in an ephemeral container with full
# OS-level isolation (no access to host fs, network, or processes).
# Recommended: build the sandbox image with pre-installed packages
# (pdfplumber, Pillow, pandas, etc.) so agents can use them immediately
# inside --network=none containers:
# docker build -f Dockerfile.sandbox -t fim-sandbox:python .
#
# Then uncomment:
# DOCKER_PYTHON_IMAGE=fim-sandbox:python
# DOCKER_SHELL_IMAGE=fim-sandbox:python
#
# Or use the bare official images (no extra packages):
# DOCKER_PYTHON_IMAGE=python:3.11-slim
# DOCKER_NODE_IMAGE=node:20-slim
# DOCKER_SHELL_IMAGE=python:3.11-slim
#
# Resource limits for Docker containers (applied globally; per-agent sandbox_config overrides these):
# DOCKER_MEMORY=256m # RAM cap per container (e.g. 128m, 256m, 512m, 1g)
# DOCKER_CPUS=0.5 # CPU quota (e.g. 0.25, 0.5, 1.0, 2.0)
# SANDBOX_TIMEOUT=120 # Default execution timeout in seconds
#
# DooD (Docker-outside-of-Docker): host-side absolute path of the ./data volume mount.
# Required when fim-one runs inside a container and spawns sandbox containers via
# the host Docker socket. docker-compose.yml auto-sets this via ${PWD}/data — only
# override if your host data directory is at a non-standard location.
# DOCKER_HOST_DATA_DIR=/home/ubuntu/fim-one/data
# ── Tool Output Truncation ──────────────────────────────────────────────────
# TOOL_OUTPUT_MAX_CHARS=50000 # Max characters for JSON-aware truncation (web/API tools)
# TOOL_OUTPUT_MAX_ITEMS=10 # Max JSON array items before truncation
# TOOL_OUTPUT_MAX_BYTES=102400 # Max bytes for raw output truncation (code exec tools)
# ── Tool Artifacts ───────────────────────────────────────────────────────────
# Size limits for files produced by tool execution
# MAX_ARTIFACT_SIZE=10485760 # Max single artifact file size in bytes (default: 10 MB)
# MAX_ARTIFACTS_TOTAL=52428800 # Max total artifact size per session in bytes (default: 50 MB)
# ══════════════════════════════════════════════════════════════════════════════
# Image Generation
# ══════════════════════════════════════════════════════════════════════════════
# Provider: "google" (Gemini native API) or "openai" (OpenAI-compatible /v1/images/generations)
# IMAGE_GEN_PROVIDER=google
# IMAGE_GEN_API_KEY=AIzaSy... # Google AI Studio key (google) or proxy API key (openai)
# IMAGE_GEN_MODEL=gemini-3.1-flash-image-preview # Google: gemini-3.1-flash-image-preview / gemini-2.5-flash-image
# # OpenAI: dall-e-3 / or proxy model name (e.g. gemini-nano-banana-2)
# IMAGE_GEN_BASE_URL= # Google default: https://generativelanguage.googleapis.com/v1beta
# # OpenAI default: https://api.openai.com/v1
# ══════════════════════════════════════════════════════════════════════════════
# Connectors & Skills
# ══════════════════════════════════════════════════════════════════════════════
# Max characters for non-array JSON / plain-text responses (default: 50000)
# CONNECTOR_RESPONSE_MAX_CHARS=50000
# Max array items to keep when response is a JSON array (default: 10)
# CONNECTOR_RESPONSE_MAX_ITEMS=10
# Connector tool mode — how connector tools are exposed to the agent
# "classic" = one tool per action (legacy), "progressive" = single ConnectorMetaTool with discover/execute
# CONNECTOR_TOOL_MODE=progressive
# Skill tool mode — how skill tools are exposed to the agent
# "progressive" = stub listing + read_skill() on demand (default, saves tokens for many/long skills)
# "inline" = inject full skill content directly into system prompt (better for ≤2 short skills)
# SKILL_TOOL_MODE=progressive
# Database tool mode — how database connector tools are exposed to the agent
# "progressive" = single DatabaseMetaTool with list_tables/discover/query subcommands (default)
# "legacy" = one tool per action (list_tables, describe_table, query) per database connector
# DATABASE_TOOL_MODE=progressive
# MCP tool mode — how MCP server tools are exposed to the agent
# "progressive" = single MCPServerMetaTool with discover/call subcommands (default)
# "legacy" = one tool per MCP server action (original individual tools)
# MCP_TOOL_MODE=progressive
# ── Progressive Disclosure Tuning ────────────────────────────────────────────
# These control token usage in system prompts and tool outputs.
# Lower values = fewer tokens but less context for the LLM.
# CONNECTOR_DISCOVER_INDENT=2 # JSON indent in discover output (default: 2)
# SKILL_STUB_DESC_LENGTH=120 # Skill stub description truncation (default: 120, 0=off)
# DAG_PLANNER_DESC_LENGTH=120 # Tool desc truncation in DAG planning (default: 120)
# COMPACT_CATALOG_DESC_LENGTH=80 # ReAct tool selection catalog truncation (default: 80)
# COMPACT_SCHEMA_DESC_LENGTH=200 # Compact OpenAI schema desc truncation (default: 200)
# Connector credential encryption key (separate from JWT_SECRET_KEY).
# When set, auth tokens stored in connector_credentials are Fernet-encrypted.
# If not set, credentials are stored as plaintext (backward-compatible).
# Changing this key invalidates all existing encrypted credentials.
# CREDENTIAL_ENCRYPTION_KEY=CHANGE_ME_TO_A_STRONG_SECRET
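One way to generate a strong value, assuming the setting is consumed as a standard Fernet key (32 random bytes, urlsafe-base64 encoded); if the app instead derives a key from an arbitrary passphrase, any high-entropy string works:

```shell
# Generate a Fernet-shaped key: 32 random bytes, urlsafe-base64 encoded
key=$(openssl rand -base64 32 | tr '+/' '-_')
echo "CREDENTIAL_ENCRYPTION_KEY=$key"
```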
# ══════════════════════════════════════════════════════════════════════════════
# Email & Notifications
# ══════════════════════════════════════════════════════════════════════════════
# ── Email (SMTP) ─────────────────────────────────────────────────────────────
# When set, the email_send tool is automatically registered and available to agents.
# Supports SSL (port 465), STARTTLS (port 587), and plain (no encryption).
# SMTP_HOST=smtp.example.com
# SMTP_PORT=465
# SMTP_SSL=ssl # ssl | tls | "" (no encryption)
# SMTP_USER=no-reply@example.com
# SMTP_PASS=your-smtp-password
# SMTP_FROM=no-reply@example.com # Optional — defaults to SMTP_USER
# SMTP_FROM_NAME=FIM One # Optional display name shown in From header
# SMTP_REPLY_TO=hi@fim.ai # Optional — replies go here instead of SMTP_FROM
# SMTP_ALLOWED_DOMAINS=example.com,corp.internal # Allowlist: only permit recipients on these domains
# SMTP_ALLOWED_ADDRESSES=alice@gmail.com # Allowlist: exact addresses (combined with DOMAINS)
# Leave both unset to allow any recipient (NOT recommended for shared/public mailboxes)
# ── Notifications (Push) ─────────────────────────────────────────────────────
# Push notification providers — configure one or more to enable the send_notification
# tool and /api/notifications endpoints. Each provider is auto-discovered on startup.
# Email notifications re-use the SMTP settings above.
# Slack — create an incoming webhook at https://api.slack.com/messaging/webhooks
# SLACK_WEBHOOK_URL=https://hooks.slack.com/services/T.../B.../xxx
# Lark (Feishu) — create a bot webhook in your Lark group settings
# LARK_WEBHOOK_URL=https://open.feishu.cn/open-apis/bot/v2/hook/xxx
# WeCom (Enterprise WeChat) — create a bot webhook in your WeCom group
# WECOM_WEBHOOK_URL=https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxx
# ══════════════════════════════════════════════════════════════════════════════
# Platform & Infrastructure
# ══════════════════════════════════════════════════════════════════════════════
# ── Database ─────────────────────────────────────────────────────────────────
# Choose one:
# SQLite (default, zero-config): sqlite+aiosqlite:///./data/fim_one.db
# PostgreSQL (production): postgresql+asyncpg://user:pass@localhost:5432/fim_one
# Docker Compose auto-sets PostgreSQL — this line is ignored in Docker.
# DATABASE_URL=sqlite+aiosqlite:///./data/fim_one.db
# DATABASE_URL=postgresql+asyncpg://fim:fim@localhost:5432/fim_one
# ── Auth (JWT) ───────────────────────────────────────────────────────────────
# If not set, the system auto-generates a secure random secret on first start and saves it to .env.
# For production or multi-instance deployments, set this explicitly so tokens survive restarts and
# remain valid across all replicas.
# JWT_SECRET_KEY=CHANGE_ME_TO_A_STRONG_SECRET
# ── CORS ─────────────────────────────────────────────────────────────────────
# Extra allowed origins beyond localhost:3000/3001 (comma-separated)
# Required when your frontend runs on a non-localhost domain in production.
# CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com
# ── File Uploads ─────────────────────────────────────────────────────────────
# UPLOADS_DIR=./uploads
MAX_UPLOAD_SIZE_MB=50
NEXT_PUBLIC_MAX_UPLOAD_SIZE_MB=50
# ── Document Processing ─────────────────────────────────────────────────────
# Vision-aware document understanding — renders PDF pages as images
# for vision-capable LLMs (GPT-4o, Claude 3/4, Gemini 1.5/2, etc.)
# DOCUMENT_PROCESSING_MODE=auto # auto | vision | text (default: auto)
# DOCUMENT_VISION_DPI=150 # DPI for PDF page rendering (default: 150)
# DOCUMENT_VISION_MAX_PAGES=20 # Max pages to render as images (default: 20)
# ── MCP Servers ──────────────────────────────────────────────────────────────
# Optional, JSON array of server configs. Requires: uv sync --extra mcp
# Example: [{"name": "filesystem", "command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]}]
# MCP_SERVERS=
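Because MCP_SERVERS holds a JSON array inside a single env var, a quoting mistake is easy to make; the value can be sanity-checked before startup (python3 on PATH is assumed):

```shell
MCP_SERVERS='[{"name": "filesystem", "command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]}]'
echo "$MCP_SERVERS" | python3 -m json.tool > /dev/null && echo "valid JSON"
```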
# MCP transport security — MUST be set to false for public/SaaS deployments.
# true → users can register MCP servers that spawn arbitrary subprocesses on the host (local dev only)
# false → only HTTP/SSE MCP transports are accepted; stdio servers are rejected
ALLOW_STDIO_MCP=true
# Allowed commands for stdio MCP servers (comma-separated base names)
# Only these binaries can be launched as MCP server processes.
# ALLOWED_STDIO_COMMANDS=npx,uvx,node,python,python3,deno,bun
# ── Workflow Run Retention ───────────────────────────────────────────────────
# Background cleanup removes old workflow runs. Per-workflow overrides take priority.
# WORKFLOW_RUN_MAX_AGE_DAYS=30 # Delete runs older than N days (default: 30)
# WORKFLOW_RUN_MAX_PER_WORKFLOW=100 # Keep at most N runs per workflow (default: 100)
# WORKFLOW_RUN_CLEANUP_INTERVAL_HOURS=24 # How often the cleanup task runs (default: 24)
# ── Channel Confirmation Request Expiry ──────────────────────────────────────
# Pending approval requests (e.g. FeishuGateHook confirmation cards) that sit
# undecided for longer than this TTL are auto-marked as expired, so a stale
# click days later doesn't flip agent state that has already been torn down.
# CHANNEL_CONFIRMATION_TTL_MINUTES=1440 # Default: 24 hours
# CHANNEL_CONFIRMATION_SWEEP_INTERVAL_SECONDS=600 # Default: every 10 minutes
# ── Logging & Workers ───────────────────────────────────────────────────────
# Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
LOG_LEVEL=INFO
# Redis URL for cross-worker interrupt/inject relay (optional)
# Docker Compose sets this automatically — no action needed for Docker deployments.
# For local dev with WORKERS>1, start a Redis instance and uncomment:
# REDIS_URL=redis://localhost:6379/0
# Uvicorn worker processes (default: 1)
# WORKERS=1 — safe default, no external services needed.
# WORKERS>1 — requires PostgreSQL (SQLite is single-writer).
# OK: Auth, OAuth, file upload/download — fully multi-worker safe (JWT-based)
# LIMITATION: Without REDIS_URL, mid-stream interrupt/inject only works within
# the SAME worker. Docker Compose auto-configures Redis.
# WORKERS=1
# ══════════════════════════════════════════════════════════════════════════════
# OAuth (optional)
# ══════════════════════════════════════════════════════════════════════════════
# When both CLIENT_ID and CLIENT_SECRET are set for a provider, the login page
# automatically shows the corresponding OAuth button. Leave a pair unset (or
# commented out) to disable that provider entirely.
#
# Callback URL pattern — register this in each provider's developer console:
# {API_BASE_URL}/api/auth/oauth/{provider}/callback
# Examples:
# Local dev → http://localhost:8000/api/auth/oauth/github/callback
# Production → https://api.yourdomain.com/api/auth/oauth/github/callback
#
# See docs/configuration/oauth-providers.mdx for full setup instructions for each provider.
# GitHub — https://github.com/settings/developers → OAuth Apps
# GITHUB_CLIENT_ID=
# GITHUB_CLIENT_SECRET=
# Google — https://console.cloud.google.com/apis/credentials
# GOOGLE_CLIENT_ID=
# GOOGLE_CLIENT_SECRET=
# Discord — https://discord.com/developers/applications
# DISCORD_CLIENT_ID=
# DISCORD_CLIENT_SECRET=
# Feishu (Lark) — https://open.feishu.cn/app → In-house app (自建应用)
# Note: uses APP_ID / APP_SECRET (not CLIENT_ID / CLIENT_SECRET)
# FEISHU_APP_ID=
# FEISHU_APP_SECRET=
# ══════════════════════════════════════════════════════════════════════════════
# URLs & Deployment
# ══════════════════════════════════════════════════════════════════════════════
# All three default to localhost and work out of the box for local development.
# For any internet-facing deployment, set all three explicitly.
#
# FRONTEND_URL — where the browser lands after OAuth completes (backend → browser redirect).
# Local dev : http://localhost:3000 (default)
# Production : https://yourdomain.com
# FRONTEND_URL=http://localhost:3000
# API_BASE_URL — externally reachable backend address, used to build OAuth callback URLs.
# Local dev : http://localhost:8000 (default)
# Production : https://yourdomain.com (if Nginx proxies /api → backend on same domain)
# : https://api.yourdomain.com (if backend has its own subdomain)
# API_BASE_URL=http://localhost:8000
# NEXT_PUBLIC_API_URL — browser-side API base URL, used by the frontend for OAuth redirects.
# Local dev : auto-detected as http://<hostname>:8000 (no need to set)
# Production : MUST be set — without it the browser would request https://yourdomain.com:8000
# which is almost certainly wrong. Set to the same value as API_BASE_URL.
# NEXT_PUBLIC_API_URL=https://yourdomain.com
# ── Cloudflare Tunnel (optional) ─────────────────────────────────────────────
# Use with docker-compose.tunnel.yml to route traffic through Cloudflare Tunnel
# instead of exposing ports directly. See docker-compose.tunnel.yml for setup steps.
# Get your token: CF Dashboard → Zero Trust → Networks → Tunnels → Create → copy token
#
# ⚠️ MAINLAND CHINA NOTICE: Cloudflare Free/Pro/Business plans have NO PoPs in
# mainland China. Traffic is routed to overseas edges (US West), causing frequent
# 502 errors. Do NOT use this if your users are in mainland China.
# Cloudflare Enterprise with China Network (JD Cloud) is required.
# CLOUDFLARE_TUNNEL_TOKEN=
# ══════════════════════════════════════════════════════════════════════════════
# Analytics (optional)
# ══════════════════════════════════════════════════════════════════════════════
# All analytics providers are optional. Set any combination — all active ones load simultaneously.
# Leave all blank to disable analytics (recommended for local dev).
#
# Google Analytics 4 — https://analytics.google.com
# NEXT_PUBLIC_GA_MEASUREMENT_ID=G-XXXXXXXXXX
#
# Umami (self-hosted, privacy-friendly) — https://umami.is
# NEXT_PUBLIC_UMAMI_SCRIPT_URL=https://your-umami-instance.com/script.js
# NEXT_PUBLIC_UMAMI_WEBSITE_ID=your-website-id
#
# Plausible (lightweight, privacy-friendly) — https://plausible.io
# NEXT_PUBLIC_PLAUSIBLE_DOMAIN=yourdomain.com
# NEXT_PUBLIC_PLAUSIBLE_SCRIPT_URL=https://plausible.io/js/script.js # optional, defaults to plausible.io CDN