A Cloudflare Worker API for retrieving random transcripts with speaker data from the VCTK corpus, with support for excluding specific speaker/transcript combinations. Built with the Hono framework for fast, type-safe API development.
- 🎯 Random Transcript Retrieval: Get random transcripts with speaker metadata
- 🚫 Flexible Exclusions: Exclude specific speaker/transcript combinations with the excluded query parameter
- 📊 Rich Speaker Data: Returns transcript text, sequence numbers, and speaker demographics (age, gender, accent, region)
- ⚡ Fast Performance: Built on Cloudflare Workers with D1 database
- 🛡️ Robust Validation: Comprehensive parameter validation and error handling using Hono validators
- 🌐 CORS Support: Configurable CORS with environment-based origin control
- 🧪 Comprehensive Testing: Extensive test suite covering all functionality
https://englitune-worker.silvioprog.dev
Or your custom domain configured in Cloudflare.
Retrieve random transcripts with speaker data.
Query Parameters:
- limit (optional): Number of transcripts to return (1-100, default: 1)
- excluded (optional): Speaker/transcript combinations to exclude
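As a rough illustration of the limit rules above (range 1-100, default 1), validation could look like the sketch below. The helper name parseLimit and the wording of the range error are assumptions made for this example. Only the "Limit must be a number" message appears in this README, and the worker's actual validator in src/validators.ts may be structured differently.

```typescript
// Hypothetical helper, not the worker's actual validator.
function parseLimit(raw: string | undefined): number {
  // A missing or empty value falls back to the documented default of 1.
  if (raw === undefined || raw === "") return 1;

  const value = Number(raw);
  // Non-numeric and fractional input are both rejected here (an assumption).
  if (!Number.isInteger(value)) {
    throw new Error(`Limit must be a number: ${raw}`);
  }
  // Assumed wording: the README does not show the range error message.
  if (value < 1 || value > 100) {
    throw new Error(`Limit must be between 1 and 100: ${raw}`);
  }
  return value;
}
```

With this sketch, parseLimit(undefined) yields the default, parseLimit("25") yields 25, and parseLimit("abc") throws the error shown in the validation examples.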
Responses:
[
{
"transcript": "You feel the pride.",
"sequence": "322",
"speaker": "p257",
"age": 24,
"gender": "F",
"accent": "English",
"region": "Southern"
}
]
[
{
"transcript": "But it was to no avail.",
"sequence": "070",
"speaker": "p302",
"age": 20,
"gender": "M",
"accent": "Canadian",
"region": "Montreal"
}
]
Favicon requests return a 204 response with caching headers.
The excluded parameter supports a flexible format for excluding specific speaker/transcript combinations:
excluded=p225=001,002,003;p226=004,005;p227=006
Format Breakdown:
- Multiple speakers separated by semicolons (;)
- Each speaker followed by an equals sign (=) and comma-separated transcript sequences
- Automatic deduplication (duplicate sequences are ignored)
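The format rules above can be sketched as a small parser that collects sequences into a Map<string, Set<string>>, which is where the deduplication comes from. This is a minimal sketch, not the code in src/validators.ts: the function name parseExcluded and the choice to reject empty groups (such as a trailing semicolon) are assumptions.

```typescript
type Exclusions = Map<string, Set<string>>;

// Hypothetical parser for values like "p225=001,002;p226=004".
function parseExcluded(raw: string): Exclusions {
  const exclusions: Exclusions = new Map();

  for (const group of raw.split(";")) {
    const [speaker, sequences, ...rest] = group.split("=");
    // Each group needs exactly one "=" with text on both sides.
    if (!speaker || !sequences || rest.length > 0) {
      throw new Error(
        `Excluded must be in format id=sequence1,sequence2;id2=sequence3,sequence4: ${raw}`
      );
    }

    // Reuse the set if the same speaker appears in more than one group.
    const set = exclusions.get(speaker) ?? new Set<string>();
    for (const sequence of sequences.split(",")) {
      // A Set ignores repeated sequences, so duplicates collapse here.
      if (sequence) set.add(sequence);
    }
    exclusions.set(speaker, set);
  }

  return exclusions;
}
```

For example, parseExcluded("p225=001,002,001,003,002") produces a single entry for p225 holding the three sequences 001, 002, and 003.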
Examples:
# Exclude specific transcripts for speaker p225
GET /?limit=5&excluded=p225=001,002,003
# Exclude transcripts for multiple speakers
GET /?limit=10&excluded=p225=001,002;p226=003,004;p227=005
# Complex exclusions with duplicates (automatically deduplicated)
GET /?excluded=p225=001,002,001,003,002
Parameter validation errors:
{
"error": "Limit must be a number: abc"
}
{
"error": "Excluded must be in format id=sequence1,sequence2;id2=sequence3,sequence4: invalid"
}
Database or server errors:
{
"error": "Internal server error"
}
- Node.js 18.17.1 or higher (required by Wrangler 4.x)
- Cloudflare account with Workers and D1 access
Note: The project includes a .nvmrc file specifying Node.js 24.12.0 for development consistency, but the project is compatible with Node.js 18.17.1+.
# Clone the repository
git clone https://github.com/silvioprog/englitune-worker.git
cd englitune-worker
# Use the Node version specified in .nvmrc (optional, but recommended)
nvm use
# Install dependencies
npm install
# Set up D1 database (update wrangler.jsonc with your database details)
wrangler d1 create vctk-corpus
# Development server (with CORS set to * for local testing)
npm run dev
# Run tests
npm test
# Deploy to Cloudflare
npm run deploy
# Generate TypeScript types from Wrangler config
npm run cf-typegen
Update wrangler.jsonc with your D1 database configuration and CORS settings.
CORS Configuration:
- Development: Uses --var CORS_ORIGIN:"*" in the dev script to allow all origins locally
- Production: Set CORS_ORIGIN in wrangler.jsonc to restrict to your domain
Tests are co-located with the source files they cover:
- Complete test suite: Tests located alongside source files in src/
- Parameter validation tests: HTTP methods, paths, limits, exclusions (src/validators.test.ts)
- Query generation tests: SQL generation and parameter binding (src/queries.test.ts)
- Integration tests: End-to-end worker functionality including CORS (src/index.test.ts)
- Error handling tests: Database errors and malformed requests
# Run all tests
npm test
# Run specific test file
npm test src/validators.test.ts
npm test src/queries.test.ts
npm test src/index.test.ts
- src/index.ts: Main Hono app entry point with CORS middleware, routing, and error handling
- src/validators.ts: Parameter validation and parsing using Hono validators
- src/queries.ts: Database query generation and execution
- Hono Framework: Fast, lightweight web framework optimized for Cloudflare Workers
- Type Safety: Full TypeScript support with generated types from Wrangler config
- CORS Middleware: Environment-based CORS configuration (dev: *, production: specific domain)
- Efficient Exclusions: Uses Map<string, Set<string>> for automatic deduplication
- SQL Generation: Dynamic WHERE clause generation for complex exclusions
- Error Handling: Comprehensive error handling with appropriate HTTP status codes via Hono
- Performance: Optimized SQL queries with proper indexing support
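To make the SQL Generation point concrete, here is one way a dynamic WHERE clause could be derived from a Map<string, Set<string>> of exclusions. It is a sketch under stated assumptions: the function name buildExclusionClause and the column names speaker and sequence are hypothetical, and the real query in src/queries.ts may be shaped differently. Values are passed as bound parameters and are never interpolated into the SQL string.

```typescript
interface ExclusionClause {
  sql: string;      // WHERE fragment, or "" when nothing is excluded
  params: string[]; // values to bind, in placeholder order
}

// Hypothetical builder; column names "speaker" and "sequence" are assumed.
function buildExclusionClause(
  exclusions: Map<string, Set<string>>
): ExclusionClause {
  const conditions: string[] = [];
  const params: string[] = [];

  exclusions.forEach((sequences, speaker) => {
    if (sequences.size === 0) return; // nothing to exclude for this speaker
    const values = Array.from(sequences);
    const placeholders = values.map(() => "?").join(", ");
    // Exclude only the listed sequences of this particular speaker.
    conditions.push(`NOT (speaker = ? AND sequence IN (${placeholders}))`);
    params.push(speaker, ...values);
  });

  const sql = conditions.length > 0 ? `WHERE ${conditions.join(" AND ")}` : "";
  return { sql, params };
}
```

The resulting fragment and parameter list could then be handed to a D1 prepared statement, for example through prepare(...).bind(...params), along with the limit value.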
This project is licensed under the MIT License. See the LICENSE file for details.
Example wrangler.jsonc settings for the D1 binding and the production CORS origin:
{
  "vars": {
    "CORS_ORIGIN": "https://your-domain.com"
  },
  "d1_databases": [
    {
      "binding": "DB",
      "database_name": "your-database-name",
      "database_id": "your-database-id"
    }
  ]
}