Secure filesystem access for Large Language Models with governance-first design.
Stop reinventing filesystem tools for every LLM project. llm-fs-tools provides production-ready, secure file operations that work with any LLM that supports function calling: Ollama, OpenAI, Anthropic, and more.
You want your AI assistant to analyze code, search files, or explore directories. You have three bad options:
- Inject everything into the prompt - Wastes tokens, hits context limits, can't scale
- Use heavy frameworks - LangChain/LlamaIndex lock you into their ecosystem
- Roll your own - Reinvent security, path validation, and tool schemas every time
```bash
pip install llm-fs-tools
```

```python
from llm_fs_tools import FileSystemTools, SecurityPolicy

# Define security boundaries
policy = SecurityPolicy(
    allowed_roots=["./my-project"],
    max_file_size_mb=5,
    blocked_patterns=["*.env", ".git/*"]
)

# Initialize tools
fs_tools = FileSystemTools(policy)

# Use with any LLM (Ollama example)
import ollama

response = ollama.chat(
    model='qwen2.5-coder',
    messages=[{'role': 'user', 'content': 'Analyze the codebase structure'}],
    tools=fs_tools.get_tool_definitions()  # Auto-generates schemas
)

# Execute tool calls (tool_calls may be None if the model answered directly)
for tool_call in response.message.tool_calls or []:
    result = fs_tools.execute(
        tool_call.function.name,
        tool_call.function.arguments
    )
```

That's it. Your model can now safely explore filesystems.
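Each `execute` call returns a structured result rather than raising, with `success`, `data`, `error`, and `metadata` fields (see the error-handling example in the security section below). A typical handling pattern:

```python
if result["success"]:
    print(result["data"])   # the tool's output
else:
    print(result["error"])  # a human-readable failure reason
```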
- Path traversal protection - Validates all paths stay within allowed roots
- Configurable boundaries - Whitelist directories, block patterns
- Automatic filtering - Excludes `.env`, `.git`, and `node_modules` by default
- Size limits - Prevents reading massive files that blow up context
- `get_directory_tree` - Hierarchical structure with configurable depth
- `read_file` - Read with line numbers and range support
- `search_codebase` - Grep-style regex search across files
- `list_directory` - Fast flat listings
- Framework-agnostic - Works with raw API calls, not just frameworks
- Provider-agnostic - Same tools work with Ollama, OpenAI, Anthropic
- Minimal dependencies - No heavy frameworks required
- Standard schemas - Uses OpenAI function calling format
- Comprehensive error handling - Graceful failures with detailed messages
- Type hints throughout - Full mypy compliance
- Extensive logging - Debug tool execution and security checks
- Tested - 80%+ coverage
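For reference, `get_tool_definitions()` emits standard OpenAI-style function schemas. A sketch of one entry for `read_file` (the field values here are illustrative, not the library's exact output):

```python
[
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read file content with optional line ranges",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string"},
                    "start_line": {"type": "integer"},
                    "end_line": {"type": "integer"}
                },
                "required": ["path"]
            }
        }
    }
]
```

The examples below show the same tools driving Ollama, OpenAI, and Anthropic clients.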
```python
import json

import ollama
from llm_fs_tools import FileSystemTools, SecurityPolicy

policy = SecurityPolicy(allowed_roots=["./src"])
fs_tools = FileSystemTools(policy)

# Keep one message history and reuse it for every round
messages = [{
    'role': 'user',
    'content': 'Find all database queries in this codebase'
}]
response = ollama.chat(
    model='codellama',
    messages=messages,
    tools=fs_tools.get_tool_definitions()
)

# Handle tool calls in a loop
while response.message.tool_calls:
    messages.append(response.message)
    for tool_call in response.message.tool_calls:
        result = fs_tools.execute(
            tool_call.function.name,
            tool_call.function.arguments
        )
        messages.append({
            'role': 'tool',
            'content': json.dumps(result),
            'tool_call_id': tool_call.id
        })
    response = ollama.chat(
        model='codellama',
        messages=messages,
        tools=fs_tools.get_tool_definitions()
    )

print(response.message.content)
```

```python
import json

from openai import OpenAI
from llm_fs_tools import FileSystemTools, SecurityPolicy

client = OpenAI()
policy = SecurityPolicy(allowed_roots=["./"])
fs_tools = FileSystemTools(policy)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Summarize the README"}],
    tools=fs_tools.get_tool_definitions(format="openai")
)

# Execute tool calls (tool_calls is None when the model answers directly)
for tool_call in response.choices[0].message.tool_calls or []:
    result = fs_tools.execute(
        tool_call.function.name,
        json.loads(tool_call.function.arguments)  # OpenAI sends arguments as a JSON string
    )
```

```python
import anthropic
from llm_fs_tools import FileSystemTools, SecurityPolicy

client = anthropic.Anthropic()
policy = SecurityPolicy(allowed_roots=["./docs"])
fs_tools = FileSystemTools(policy)

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{"role": "user", "content": "What's in the docs?"}],
    tools=fs_tools.get_tool_definitions(format="anthropic")
)

# Handle tool use
for block in response.content:
    if block.type == "tool_use":
        result = fs_tools.execute(block.name, block.input)
```

Every file operation validates paths through the security policy:
```python
policy = SecurityPolicy(
    allowed_roots=[
        "/home/user/projects",
        "/home/user/documents"
    ],
    blocked_patterns=[
        "*.env",           # Environment files
        "*.key",           # Key files
        ".git/*",          # Git internals
        "node_modules/*",  # Dependencies
        "__pycache__/*"    # Python cache
    ],
    blocked_extensions=[
        ".pem",
        ".secret"
    ],
    max_file_size_mb=10
)
```

Validation Process:
1. Resolve symlinks and relative paths
2. Check that the resolved path is within `allowed_roots`
3. Match against `blocked_patterns` and `blocked_extensions`
4. Verify the file size is under `max_file_size_mb`
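A minimal sketch of that pipeline, assuming Python 3.9+ for `Path.is_relative_to` (the function and matching details here are illustrative, not the library's internals):

```python
from fnmatch import fnmatch
from pathlib import Path

def validate(path_str, policy):
    # 1. Resolve symlinks and relative paths
    path = Path(path_str).resolve()
    # 2. Check the resolved path is within an allowed root
    roots = [Path(r).resolve() for r in policy.allowed_roots]
    if not any(path.is_relative_to(root) for root in roots):
        return {"success": False, "error": "Access denied: Path outside allowed roots"}
    # 3. Match against blocked patterns and extensions
    if any(fnmatch(str(path), f"*{pat}") for pat in policy.blocked_patterns):
        return {"success": False, "error": "Access denied: Blocked pattern"}
    if path.suffix in policy.blocked_extensions:
        return {"success": False, "error": "Access denied: Blocked extension"}
    # 4. Verify the file size is under the limit
    if path.is_file() and path.stat().st_size > policy.max_file_size_mb * 1024 * 1024:
        return {"success": False, "error": "Access denied: File exceeds size limit"}
    return {"success": True}
```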
Security guarantees:
- ❌ No path traversal attacks (`../../../etc/passwd`)
- ❌ No symlink escapes
- ❌ No sensitive file access
- ✅ Explicit allowlist model
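For example, calls like these are rejected rather than executed (illustrative invocations against a policy blocking `*.env`):

```python
fs_tools.execute("read_file", {"path": "../../../etc/passwd"})  # denied: outside allowed roots
fs_tools.execute("read_file", {"path": "./src/.env"})           # denied: matches blocked pattern
```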
Security violations return structured errors, never raising exceptions to the LLM:

```python
{
    "success": False,
    "error": "Access denied: Path outside allowed roots",
    "data": None,
    "metadata": {
        "tool": "read_file",
        "attempted_path": "/etc/passwd",
        "violation_type": "outside_allowed_roots"
    }
}
```

### `get_directory_tree`

Returns hierarchical directory structure.
Parameters:

- `path` (str, required) - Directory to analyze
- `max_depth` (int, default=3) - Maximum recursion depth
- `include_hidden` (bool, default=False) - Include hidden files
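For instance, a call like the following (a hypothetical invocation using the parameters above) yields the tree shown next:

```python
fs_tools.execute("get_directory_tree", {"path": "./src", "max_depth": 2})
```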
Example Output:
```json
{
  "success": true,
  "data": {
    "name": "src",
    "type": "directory",
    "children": [
      {
        "name": "main.py",
        "type": "file",
        "size": 1024
      },
      {
        "name": "utils",
        "type": "directory",
        "children": [...]
      }
    ]
  }
}
```

### `read_file`

Reads file content with optional line ranges.
Parameters:

- `path` (str, required) - File to read
- `start_line` (int, optional) - First line to read (1-indexed)
- `end_line` (int, optional) - Last line to read (inclusive)
Example:

```python
# Read entire file
fs_tools.execute("read_file", {"path": "./main.py"})

# Read lines 10-20
fs_tools.execute("read_file", {
    "path": "./main.py",
    "start_line": 10,
    "end_line": 20
})
```

### `search_codebase`

Grep-style search with regex support.
Parameters:

- `pattern` (str, required) - Search pattern (regex)
- `path` (str, required) - Directory to search
- `file_pattern` (str, default="*") - File glob filter
- `case_sensitive` (bool, default=False) - Case sensitivity
- `max_results` (int, default=100) - Result limit
Example:

```python
# Find all TODO comments in Python files
fs_tools.execute("search_codebase", {
    "pattern": r"TODO:.*",
    "path": "./src",
    "file_pattern": "*.py"
})
```

Output:
```json
{
  "success": true,
  "data": {
    "matches": [
      {
        "file": "./src/main.py",
        "line": 42,
        "content": "# TODO: Refactor this function",
        "match": "TODO: Refactor this function"
      }
    ],
    "total_matches": 1,
    "truncated": false
  }
}
```

### `list_directory`

Fast flat directory listing.
Parameters:

- `path` (str, required) - Directory to list
- `include_hidden` (bool, default=False) - Include hidden files
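Example (the invocation mirrors the other tools):

```python
fs_tools.execute("list_directory", {"path": "./src", "include_hidden": False})
```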
```python
from llm_fs_tools import FileSystemTools, SecurityPolicy

policy = SecurityPolicy(
    allowed_roots=["./project"],
)
fs_tools = FileSystemTools(policy)
```

```python
from pathlib import Path
policy = SecurityPolicy(
    # Multiple allowed directories
    allowed_roots=[
        "./src",
        "./docs",
        str(Path.home() / "projects")
    ],
    # File size limits
    max_file_size_mb=5,
    # Block sensitive patterns
    blocked_patterns=[
        "*.env",
        "*.key",
        "*.pem",
        ".git/*",
        "node_modules/*",
        "__pycache__/*",
        "*.pyc",
        ".venv/*"
    ],
    # Block by extension
    blocked_extensions=[
        ".secret",
        ".private"
    ],
    # Custom validation
    custom_validator=lambda path: not path.name.startswith("temp_")
)

fs_tools = FileSystemTools(policy)
```

```yaml
# llm-fs-config.yaml
security:
  allowed_roots:
    - ./src
    - ./docs
  max_file_size_mb: 10
  blocked_patterns:
    - "*.env"
    - ".git/*"
```

```python
import yaml
from llm_fs_tools import SecurityPolicy, FileSystemTools

with open("llm-fs-config.yaml") as f:
    config = yaml.safe_load(f)

policy = SecurityPolicy(**config["security"])
fs_tools = FileSystemTools(policy)
```

- Governance Over Scale - Security boundaries define capability, not model size
- Explicit Over Implicit - Allowlists, not denylists
- Simple Over Complex - Minimal API surface, zero magic
- Portable Over Coupled - Works everywhere, depends on nothing
```
┌─────────────────────────────────────────┐
│            Your Application             │
│      (Ollama/OpenAI/Anthropic/etc)      │
└─────────────┬───────────────────────────┘
              │
              ├─ get_tool_definitions()
              │  (Returns JSON schemas)
              │
              └─ execute(name, args)
                 (Runs tool, returns result)
              │
┌─────────────┴───────────────────────────┐
│             FileSystemTools             │
│   ┌─────────────────────────────────┐   │
│   │         Security Policy        │   │
│   │  - Path validation              │   │
│   │  - Size limits                  │   │
│   │  - Pattern blocking             │   │
│   └─────────────────────────────────┘   │
│   ┌─────────────────────────────────┐   │
│   │      Tool Implementations       │   │
│   │  - get_directory_tree           │   │
│   │  - read_file                    │   │
│   │  - search_codebase              │   │
│   │  - list_directory               │   │
│   └─────────────────────────────────┘   │
└─────────────────────────────────────────┘
```
```python
# Let Claude explore and refactor your codebase
policy = SecurityPolicy(allowed_roots=["./src"])
fs_tools = FileSystemTools(policy)

response = claude.chat(  # 'claude' stands in for your Anthropic client wrapper
    messages=[{
        'role': 'user',
        'content': 'Refactor the authentication module for better testability'
    }],
    tools=fs_tools.get_tool_definitions()
)
```

```python
# LLM reviews your PR changes
policy = SecurityPolicy(
    allowed_roots=["./"],
    blocked_patterns=["*.env", "node_modules/*"]
)
fs_tools = FileSystemTools(policy)

response = gpt4.chat(  # 'gpt4' stands in for your OpenAI client wrapper
    messages=[{
        'role': 'user',
        'content': 'Review the changes in src/ for security issues and best practices'
    }],
    tools=fs_tools.get_tool_definitions()
)
```

```python
# Generate docs from codebase structure
policy = SecurityPolicy(allowed_roots=["./src", "./docs"])
fs_tools = FileSystemTools(policy)

response = ollama.chat(
    model='codellama',
    messages=[{
        'role': 'user',
        'content': 'Generate API documentation from the source files'
    }],
    tools=fs_tools.get_tool_definitions()
)
```

```python
# Find all imports and dependencies
fs_tools.execute("search_codebase", {
    "pattern": r"^import |^from .* import",
    "path": "./src",
    "file_pattern": "*.py"
})
```

| Feature | llm-filesystem-tools | LangChain | MCP Servers | Roll Your Own |
|---|---|---|---|---|
| Installation | `pip install` | `pip install langchain` | Server setup + client | ❌ N/A |
| Dependencies | Minimal | 50+ packages | MCP protocol | ❌ You maintain |
| Security Model | Built-in policy engine | Manual | Per-server | ❌ You build |
| Provider Support | All (OpenAI/Anthropic/Ollama) | LangChain models only | MCP clients only | ✅ Up to you |
| Framework Lock-in | ❌ None | ✅ LangChain ecosystem | ✅ MCP protocol | ❌ None |
| Path Validation | ✅ Automatic | ❌ Manual | Varies | ❌ You build |
| Learning Curve | 5 minutes | Days | Hours | ❌ Weeks |
- ✅ Core filesystem tools
- ✅ Security policy engine
- ✅ Multi-provider schemas
- ✅ Path validation
- Caching layer for repeated reads
- File watching/change detection
- Batch operations
- Performance optimizations
- Git integration tools
- Diff/patch operations
- Binary file support
- Archive handling (zip, tar)
- Stable API
- Full test coverage
- Production hardening
- Performance benchmarks
We welcome contributions! This project follows the governance-first philosophy: intelligence emerges from coordination, not complexity.
```bash
# Clone the repo
git clone https://github.com/dansasser/llm-filesystem-tools.git
cd llm-filesystem-tools

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .
mypy llm_fs_tools
```

- Security first - All PRs must maintain security guarantees
- Test coverage - New features need tests
- Type hints - Full typing required
- Documentation - Update docs for API changes
- 🔧 New tool implementations
- 🛡️ Enhanced security features
- 📚 Documentation improvements
- 🧪 Test coverage expansion
- 🐛 Bug fixes
Q: Does this work with LangChain/LlamaIndex?
A: Yes! You can wrap these tools in LangChain/LlamaIndex tool interfaces, but you don't need those frameworks to use this package.
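A minimal wrapper sketch, assuming `langchain-core` is installed (the wrapper function here is illustrative):

```python
from typing import Optional

from langchain_core.tools import StructuredTool

def read_file(path: str, start_line: Optional[int] = None, end_line: Optional[int] = None) -> dict:
    """Read a file through the security policy."""
    args = {"path": path}
    if start_line is not None:
        args["start_line"] = start_line
    if end_line is not None:
        args["end_line"] = end_line
    return fs_tools.execute("read_file", args)

read_file_tool = StructuredTool.from_function(
    func=read_file,
    name="read_file",
    description="Read file content with optional line ranges",
)
```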
Q: Can I use this in production?
A: Yes, but audit the security policy for your use case. The default blocked patterns are a starting point, not a complete security solution.
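One way to tighten a policy for production, using only the options documented above (values are illustrative; adjust to your threat model):

```python
policy = SecurityPolicy(
    allowed_roots=["/srv/app/src"],  # one narrow, absolute root
    max_file_size_mb=1,              # keep context usage predictable
    blocked_patterns=["*.env", "*.key", "*.pem", ".git/*", "node_modules/*"],
    blocked_extensions=[".secret", ".private"],
    custom_validator=lambda p: not p.name.startswith("."),  # no hidden files at all
)
```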
Q: What about write operations?
A: Currently read-only by design. Write operations may come in v0.3.0 with additional safeguards.
Q: Does this work on Windows?
A: Yes! Path handling is cross-platform using pathlib.
Q: Can I use this with streaming responses?
A: Yes! Tool calls work with both streaming and non-streaming LLM responses.
Q: What's the performance impact?
A: Minimal. Tool execution is typically <100ms. Directory trees are cached per call.
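To check on your own tree, a quick standard-library timing snippet (numbers vary by filesystem and tree size):

```python
import time

start = time.perf_counter()
fs_tools.execute("get_directory_tree", {"path": "./src", "max_depth": 3})
print(f"get_directory_tree took {(time.perf_counter() - start) * 1000:.1f} ms")
```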
MIT License - see LICENSE for details.
Created by Dan Sasser as part of the SIM-ONE Framework - governance-first AI architecture.
Related Projects:
- ollama-prompt - Command-line tool using llm-filesystem-tools
- SIM-ONE - Comprehensive AI governance system
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: Contact
Star this repo if it's useful! ⭐