A Python SDK for interacting with the PDF Craft API. It simplifies the process of converting PDFs to Markdown or EPUB by handling authentication, file upload, task submission, and result polling.
- 🚀 Easy PDF Conversion: Convert PDFs to Markdown or EPUB format
- 📤 Local File Upload: Upload and convert local PDF files with progress tracking
- 🔄 Automatic Retry: Built-in retry mechanism for robust operations
- ⏱️ Flexible Polling: Configurable polling strategies for task completion
- 📊 Progress Tracking: Monitor upload progress with callbacks
- 🔧 Type Safe: Full type hints support
You can install the package from PyPI:

    pip install pdf-craft-sdk

The easiest way to convert a local PDF file:

    from pdf_craft_sdk import PDFCraftClient

    # Initialize the client
    client = PDFCraftClient(api_key="YOUR_API_KEY")

    # Upload and convert a local PDF file
    download_url = client.convert_local_pdf("document.pdf")
    print(f"Conversion successful! Download URL: {download_url}")

💡 See examples.py for 10 complete usage examples covering all features!
If you already have a PDF URL from the upload API:

    from pdf_craft_sdk import PDFCraftClient, FormatType

    client = PDFCraftClient(api_key="YOUR_API_KEY")

    # Convert a PDF to Markdown and wait for the result
    try:
        pdf_url = "https://oomol-file-cache.example.com/your-file.pdf"
        download_url = client.convert(pdf_url, format_type=FormatType.MARKDOWN)
        print(f"Conversion successful! Download URL: {download_url}")
    except Exception as e:
        print(f"An error occurred: {e}")

Monitor the upload progress of large files:

    from pdf_craft_sdk import PDFCraftClient, UploadProgress

    def on_progress(progress: UploadProgress):
        print(f"Upload progress: {progress.percentage:.2f}% "
              f"({progress.current_part}/{progress.total_parts} parts)")

    client = PDFCraftClient(api_key="YOUR_API_KEY")

    # Upload and convert with progress tracking
    download_url = client.convert_local_pdf(
        "large_document.pdf",
        progress_callback=on_progress
    )

Convert to EPUB with footnotes:

    from pdf_craft_sdk import PDFCraftClient, FormatType

    client = PDFCraftClient(api_key="YOUR_API_KEY")

    download_url = client.convert_local_pdf(
        "document.pdf",
        format_type=FormatType.EPUB,
        includes_footnotes=True
    )

If you prefer to handle the steps manually or asynchronously:

    from pdf_craft_sdk import PDFCraftClient, FormatType

    client = PDFCraftClient(api_key="YOUR_API_KEY")

    # Step 1: Upload local file
    cache_url = client.upload_file("document.pdf")
    print(f"Uploaded to: {cache_url}")

    # Step 2: Submit conversion task
    task_id = client.submit_conversion(cache_url, format_type=FormatType.MARKDOWN)
    print(f"Task ID: {task_id}")

    # Step 3: Wait for completion
    download_url = client.wait_for_completion(task_id)
    print(f"Download URL: {download_url}")

The convert and wait_for_completion methods accept optional configuration for polling behavior:
- max_wait_ms: Maximum time (in milliseconds) to wait for the conversion. Default is 7200000 (2 hours).
- check_interval_ms: Initial polling interval (in milliseconds). Default is 1000 (1 second).
- max_check_interval_ms: Maximum polling interval (in milliseconds). Default is 5000 (5 seconds).
- backoff_factor: Multiplier applied to the interval after each check, or a PollingStrategy enum. Default is PollingStrategy.EXPONENTIAL (1.5).
Available polling strategies:
- PollingStrategy.EXPONENTIAL (1.5): Default. Starts fast, slows down.
- PollingStrategy.FIXED (1.0): Polls at a fixed interval.
- PollingStrategy.AGGRESSIVE (2.0): Doubles the interval each time.
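
To see how these settings interact, the sketch below simulates the sequence of polling intervals. It assumes the interval is multiplied by the backoff factor after each check and capped at max_check_interval_ms, which matches the parameter descriptions above but is not confirmed against the SDK source. The function name `polling_intervals` is hypothetical and not part of the SDK.

```python
from typing import List


def polling_intervals(
    check_interval_ms: int = 1000,
    max_check_interval_ms: int = 5000,
    backoff_factor: float = 1.5,
    checks: int = 6,
) -> List[int]:
    """Return the wait (in ms) before each of the first `checks` polls.

    Assumption: the interval grows by `backoff_factor` after every check
    and never exceeds `max_check_interval_ms`.
    """
    intervals = []
    interval = float(check_interval_ms)
    for _ in range(checks):
        intervals.append(int(min(interval, max_check_interval_ms)))
        interval = min(interval * backoff_factor, max_check_interval_ms)
    return intervals


if __name__ == "__main__":
    print(polling_intervals())                     # default settings
    print(polling_intervals(3000, 3000, 1.0))      # FIXED-style factor
    print(polling_intervals(5000, 60000, 2.0))     # AGGRESSIVE-style factor
```

Under this assumption the default settings give waits of 1000, 1500, 2250, 3375, 5000, and 5000 ms, so the cap is reached on the fifth check.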

    from pdf_craft_sdk import PDFCraftClient, PollingStrategy

    client = PDFCraftClient(api_key="YOUR_API_KEY")

    # Example: Stable Polling (Every 3 seconds)
    download_url = client.convert(
        pdf_url="https://oomol-file-cache.example.com/your-file.pdf",
        check_interval_ms=3000,
        max_check_interval_ms=3000,
        backoff_factor=PollingStrategy.FIXED
    )

    # Example: Long Running Task (Start slow, check infrequently)
    download_url = client.convert(
        pdf_url="https://oomol-file-cache.example.com/your-file.pdf",
        check_interval_ms=5000,
        max_check_interval_ms=60000,  # 1 minute
        backoff_factor=PollingStrategy.AGGRESSIVE
    )

PDFCraftClient(api_key, base_url=None, upload_base_url=None)

Initialize the PDF Craft client.
Parameters:
- api_key (str): Your API key
- base_url (str, optional): Custom API base URL
- upload_base_url (str, optional): Custom upload API base URL

convert_local_pdf()

Upload and convert a local PDF file in one step.
Parameters:
- file_path (str): Path to the local PDF file
- format_type (str | FormatType): Output format, "markdown" or "epub" (default: "markdown")
- model (str): Model to use (default: "gundam")
- includes_footnotes (bool): Include footnotes (default: False)
- ignore_pdf_errors (bool): Ignore PDF parsing errors (default: True)
- ignore_ocr_errors (bool): Ignore OCR errors (default: True)
- wait (bool): Wait for completion (default: True)
- max_wait_ms (int): Max wait time in milliseconds (default: 7200000)
- check_interval_ms (int): Initial polling interval in milliseconds (default: 1000)
- max_check_interval_ms (int): Max polling interval in milliseconds (default: 5000)
- backoff_factor (float | PollingStrategy): Polling backoff factor (default: PollingStrategy.EXPONENTIAL)
- progress_callback (callable): Upload progress callback function
- upload_max_retries (int): Max upload retries per part (default: 3)

Returns: Download URL (str) if wait=True, else task ID (str)

upload_file()

Upload a local PDF file to cloud cache.
Parameters:
- file_path (str): Path to the local PDF file
- progress_callback (callable): Progress callback function
- max_retries (int): Max retries per upload part (default: 3)

Returns: Cache URL (str)

convert()

Convert a PDF from URL.
Parameters:
- pdf_url (str): PDF URL to convert (HTTPS URL from upload API)
- format_type (str | FormatType): Output format (default: "markdown")
- Other parameters are the same as for convert_local_pdf

Returns: Download URL (str)
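
The conversion methods return a download URL rather than the converted file itself. Below is a minimal sketch of saving the result to disk with the standard library. It assumes the URL can be fetched with a plain GET and no extra headers, which this README does not state. `save_result` is a hypothetical helper, not part of the SDK.

```python
import shutil
import urllib.request
from pathlib import Path


def save_result(download_url: str, dest: str) -> Path:
    """Stream the file at `download_url` to `dest` and return the path.

    Assumption: the URL is directly fetchable without authentication headers.
    """
    dest_path = Path(dest)
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so large outputs are not held in memory at once.
    with urllib.request.urlopen(download_url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

A typical call would pass the value returned by convert or convert_local_pdf, for example `save_result(download_url, "output/document.md")`.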

submit_conversion()

Submit a conversion task without waiting.
Parameters:
- pdf_url (str): PDF URL to convert
- format_type (str | FormatType): Output format
- model (str): Model to use
- includes_footnotes (bool): Include footnotes
- ignore_pdf_errors (bool): Ignore PDF parsing errors
- ignore_ocr_errors (bool): Ignore OCR errors

Returns: Task ID (str)

wait_for_completion()

Wait for a conversion task to complete.
Parameters:
- task_id (str): Task ID from submit_conversion
- Polling parameters are the same as for convert_local_pdf

Returns: Download URL (str)
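
Because submit_conversion does not wait, several tasks can be queued before waiting on any of them. The sketch below relies only on the two methods documented above. `convert_many` is a hypothetical helper, and `client` is any object exposing submit_conversion and wait_for_completion (normally a PDFCraftClient). Whether queued tasks are processed concurrently on the server is an assumption.

```python
from typing import Dict, Iterable


def convert_many(client, pdf_urls: Iterable[str], format_type: str = "markdown") -> Dict[str, str]:
    """Submit every URL first, then wait for each task in turn.

    Returns a mapping of source PDF URL -> download URL.
    """
    # Phase 1: queue all conversions without blocking on any of them.
    task_ids = {
        url: client.submit_conversion(url, format_type=format_type)
        for url in pdf_urls
    }
    # Phase 2: collect the results in submission order.
    return {url: client.wait_for_completion(task_id) for url, task_id in task_ids.items()}
```

If any single task fails, wait_for_completion is expected to raise and the remaining results are lost, so a production version would likely catch errors per task.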

UploadProgress

Progress information for file uploads.
Attributes:
- uploaded_bytes (int): Bytes uploaded so far
- total_bytes (int): Total bytes to upload
- current_part (int): Current part number being uploaded
- total_parts (int): Total number of parts
- percentage (float): Progress percentage (0-100)
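
A callback can combine these attributes however it likes. The sketch below builds a callback that reports only when the percentage has advanced by a given step, which keeps logs short for files with many parts. `make_throttled_reporter` is a hypothetical helper, not part of the SDK. It reads only the percentage, uploaded_bytes, and total_bytes attributes listed above, so it works with any object that has them.

```python
from typing import Callable


def make_throttled_reporter(step: float = 10.0, emit: Callable[[str], None] = print):
    """Return a progress callback that emits at most once per `step` percent."""
    last_reported = -step  # guarantees the very first call is reported

    def on_progress(progress) -> None:
        nonlocal last_reported
        pct = progress.percentage
        # Always report completion once, even if it falls inside a step.
        finished = pct >= 100 and last_reported < 100
        if finished or pct - last_reported >= step:
            last_reported = pct
            emit(f"{pct:.0f}% ({progress.uploaded_bytes}/{progress.total_bytes} bytes)")

    return on_progress
```

The returned function can be passed as progress_callback, for example `client.convert_local_pdf("large_document.pdf", progress_callback=make_throttled_reporter(step=25))`.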
Example:

    def on_progress(progress):
        print(f"{progress.percentage:.1f}% - Part {progress.current_part}/{progress.total_parts}")

The SDK raises the following exceptions:
- FileNotFoundError: When the specified file doesn't exist
- APIError: When API requests fail
- TimeoutError: When conversion exceeds max wait time

Example:

    from pdf_craft_sdk import PDFCraftClient
    from pdf_craft_sdk.exceptions import APIError

    client = PDFCraftClient(api_key="YOUR_API_KEY")

    try:
        download_url = client.convert_local_pdf("document.pdf")
        print(f"Success: {download_url}")
    except FileNotFoundError:
        print("File not found!")
    except APIError as e:
        print(f"API error: {e}")
    except TimeoutError:
        print("Conversion timed out")

If you need to use a custom upload API endpoint:

    client = PDFCraftClient(
        api_key="YOUR_API_KEY",
        upload_base_url="https://custom.example.com/upload"
    )

Default upload endpoint: https://llm.oomol.com/api/tasks/files/remote-cache
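
In deployments it is often convenient to keep the API key and any endpoint override out of the source code. Below is a minimal sketch that builds the constructor arguments from environment variables. The variable names PDF_CRAFT_API_KEY and PDF_CRAFT_UPLOAD_BASE_URL are hypothetical (nothing here suggests the SDK reads them itself), as is the helper name.

```python
import os
from typing import Dict, Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for PDFCraftClient from environment variables.

    The variable names are illustrative, not defined by the SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("PDF_CRAFT_API_KEY")
    if not api_key:
        raise RuntimeError("PDF_CRAFT_API_KEY is not set")
    kwargs = {"api_key": api_key}
    # Pass upload_base_url only when overridden, so the default endpoint applies otherwise.
    upload_base_url = env.get("PDF_CRAFT_UPLOAD_BASE_URL")
    if upload_base_url:
        kwargs["upload_base_url"] = upload_base_url
    return kwargs
```

The result can be unpacked into the constructor, for example `client = PDFCraftClient(**client_kwargs_from_env())`.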

Process multiple files:

    from pdf_craft_sdk import PDFCraftClient

    client = PDFCraftClient(api_key="YOUR_API_KEY")

    pdf_files = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]
    for pdf_file in pdf_files:
        try:
            print(f"Processing {pdf_file}...")
            # With wait=False the call returns a task ID, not a download URL
            task_id = client.convert_local_pdf(pdf_file, wait=False)
            print(f"Task submitted: {task_id}")
        except Exception as e:
            print(f"Error processing {pdf_file}: {e}")

This project is licensed under the MIT License.
For issues, questions, or contributions, please visit our GitHub repository.
See examples.py for complete, runnable examples including:
- ✅ Basic local PDF conversion
- 📊 Upload with progress tracking
- 📖 EPUB format conversion
- 🔧 Manual step-by-step upload and conversion
- 🌐 Remote PDF conversion
- ⚙️ Custom polling strategies
- 🛡️ Proper error handling
- 📦 Batch processing multiple files
- 🔌 Custom upload endpoint
- ⏱️ Async workflow (submit now, check later)

Run examples:

    # Get your API key from https://console.oomol.com/api-key
    # Then edit examples.py and replace 'your_api_key_here' with your actual API key

    # Run examples
    python examples.py

    # Choose a specific example (1-10) or 'all' to run all examples

- ✨ Added local file upload functionality
- ✨ Added convert_local_pdf() convenience method
- ✨ Added upload progress tracking with callbacks
- 🐛 Fixed null uploaded_parts handling in upload response
- 📝 Improved documentation and examples
- Initial public release
- Basic PDF to Markdown/EPUB conversion
- Configurable polling strategies