baggie11/word2voice


Paper to Podcast — Transform Research Papers into Audio Podcasts

Python 3.10+ | Streamlit | ElevenLabs | LLaMA 3

Author: Bagavati Narayanan

Convert research papers into structured audio podcasts with glossary generation, enabling easier consumption of academic content.



Overview

Paper to Podcast transforms academic papers into engaging audio content.
It extracts the text of a PDF, rewrites it as podcast-style narration, and generates a glossary of key terms for easier understanding.
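The flow described above can be sketched as three stages chained together. This is only an illustrative outline, not the project's actual code: the `Episode` class, the `paper_to_podcast` function, and the three callables are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Episode:
    """Hypothetical container for the three outputs of one run."""
    script: str
    glossary: dict[str, str]
    audio: bytes


def paper_to_podcast(
    paper_text: str,
    narrate: Callable[[str], str],
    build_glossary: Callable[[str], dict[str, str]],
    synthesize: Callable[[str], bytes],
) -> Episode:
    """Chain the stages: narration script, glossary, then audio."""
    script = narrate(paper_text)           # e.g. an LLM rewrite of the paper
    glossary = build_glossary(paper_text)  # e.g. key terms with definitions
    audio = synthesize(script)             # e.g. a text-to-speech call
    return Episode(script=script, glossary=glossary, audio=audio)
```

Passing the stages in as callables keeps the outline independent of any particular LLM or TTS backend, so each stage can be swapped or stubbed out while testing.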

Key Highlights:

  • Automated conversion from PDF → structured podcast
  • Glossary generation for advanced terminology
  • High-quality TTS via ElevenLabs or gTTS
  • Interactive Streamlit UI for ease of use
  • Works with direct PDF URLs or local files
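As a rough illustration of accepting either a direct PDF URL or a local file, the sketch below decides which kind of source it was given, loads the raw bytes, and extracts text page by page. The function names are hypothetical, and it assumes PyPDF2 2.0 or later (which provides `PdfReader`) and the `requests` package, both listed in the tech stack; the project's own loader may differ.

```python
from io import BytesIO
from pathlib import Path
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True for http(s) URLs, False for anything else (treated as a path)."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def load_pdf_bytes(source: str, timeout: float = 30.0) -> bytes:
    """Fetch the PDF over HTTP if `source` is a URL, otherwise read it from disk."""
    if is_url(source):
        import requests  # third-party; imported lazily so local files work without it

        response = requests.get(source, timeout=timeout)
        response.raise_for_status()
        return response.content
    return Path(source).read_bytes()


def extract_text(pdf_bytes: bytes) -> str:
    """Concatenate the text of every page (assumes PyPDF2 >= 2.0)."""
    from PyPDF2 import PdfReader  # third-party

    reader = PdfReader(BytesIO(pdf_bytes))
    # extract_text() can return an empty result for image-only pages.
    return "\n".join((page.extract_text() or "") for page in reader.pages)
```

A call such as `extract_text(load_pdf_bytes("paper.pdf"))` would then yield the plain text handed to the later stages. Scanned PDFs without a text layer would come back empty and need OCR, which this sketch does not cover.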

Features

| Feature | Description |
| --- | --- |
| PDF Import | Provide a direct URL or upload a research paper in PDF format |
| AI Structuring | LLaMA 3 (via Ollama) converts the paper into podcast-style narration |
| Glossary Extraction | Extracts the top advanced terms with one-line definitions |
| Audio Generation | Generates speech using ElevenLabs, falling back to gTTS |
| Streamlit App | Clean, interactive interface to navigate, listen, and export podcasts |
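To make the glossary step more concrete, here is a minimal sketch that asks a locally running Ollama server for terms and parses the reply. It assumes Ollama's default endpoint (`http://localhost:11434/api/generate`) and a pulled `llama3` model; the prompt wording, the `Term: definition` reply format, and all function names are assumptions made for this example. It uses the standard library's `urllib` to stay self-contained, whereas the project lists Requests for networking.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def ask_llama(prompt: str, model: str = "llama3", url: str = OLLAMA_URL) -> str:
    """Send one non-streaming generation request and return the reply text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]


def glossary_prompt(paper_text: str, max_terms: int = 10) -> str:
    """Hypothetical prompt asking for one 'Term: definition' pair per line."""
    return (
        f"List the {max_terms} most advanced technical terms in the paper below. "
        "Reply with one line per term in the form 'Term: one-line definition'.\n\n"
        + paper_text
    )


def parse_glossary(reply: str, max_terms: int = 10) -> dict[str, str]:
    """Keep lines shaped like 'Term: definition'; skip everything else."""
    glossary: dict[str, str] = {}
    for line in reply.splitlines():
        line = line.strip().lstrip("-*• ").strip()  # drop bullet markers
        if ":" not in line:
            continue
        term, definition = (part.strip() for part in line.split(":", 1))
        if term and definition:
            glossary.setdefault(term, definition)  # first definition wins
        if len(glossary) >= max_terms:
            break
    return glossary
```

With a server running, `parse_glossary(ask_llama(glossary_prompt(text)))` would return a dictionary of terms. Parsing defensively matters here because model replies often include a preamble or stray bullets.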

Tech Stack

| Component | Technology |
| --- | --- |
| PDF Reader | PyPDF2 |
| LLM Processing | LLaMA 3 (Ollama) |
| Text-to-Speech | ElevenLabs / gTTS |
| Frontend UI | Streamlit |
| Networking | Requests |
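The ElevenLabs / gTTS pairing suggests a primary engine with a fallback. One possible shape for that logic is sketched below. The function names are hypothetical, and the ElevenLabs call is deliberately left as a caller-supplied function because its SDK interface varies between versions; only the gTTS part uses a concrete library call (`gTTS(...).write_to_fp`).

```python
from io import BytesIO
from typing import Callable, Optional


def gtts_synthesize(text: str, lang: str = "en") -> bytes:
    """Render speech as MP3 bytes with gTTS (needs the gTTS package and network access)."""
    from gtts import gTTS  # third-party; imported lazily

    buffer = BytesIO()
    gTTS(text=text, lang=lang).write_to_fp(buffer)
    return buffer.getvalue()


def synthesize_with_fallback(
    text: str,
    primary: Optional[Callable[[str], bytes]],
    fallback: Callable[[str], bytes] = gtts_synthesize,
) -> tuple[bytes, str]:
    """Try the primary engine first; on any failure, use the fallback.

    `primary` stands in for an ElevenLabs wrapper supplied by the caller
    (hypothetical here); pass None to go straight to the fallback.
    Returns the audio bytes and which engine produced them.
    """
    if primary is not None:
        try:
            return primary(text), "primary"
        except Exception as error:  # e.g. missing API key, quota, network error
            print(f"Primary TTS failed ({error}); falling back.")
    return fallback(text), "fallback"
```

Returning the engine label alongside the audio lets a UI tell the listener which voice was used, which is useful when the fallback sounds noticeably different.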

Setup Instructions

🔌 Installation

```bash
# Clone the repository
git clone https://github.com/your-username/paper-to-podcast.git
cd paper-to-podcast
# Install the Python dependencies
pip install -r requirements.txt
# Launch the Streamlit app
streamlit run app.py
```

License

This project is licensed under the Creative Commons Attribution–NonCommercial 4.0 International (CC BY-NC 4.0) License.

You are free to:

  • Share — copy and redistribute the material in any medium or format.
  • Adapt — remix, transform, and build upon the material.

Under the following terms:

  • Attribution — You must give appropriate credit and link back to this repository.
  • NonCommercial — You may not use this material for commercial purposes.

No additional restrictions — you may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

📄 Full License: Creative Commons BY-NC 4.0
© 2025 Bagavati Narayanan. All rights reserved.

