LLM Qualitative Coding

Training materials for learning to use large language models (LLMs) to code and analyze qualitative research data.

What is this?

A hands-on training repository that teaches:

How to make API calls to LLMs from Python
What embeddings are and how to use them for semantic search
How to apply LLM capabilities to qualitative coding workflows
Practical techniques: theme extraction, direct coding, inductive clustering
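
As a rough illustration of the semantic search idea listed above, the sketch below ranks passages by cosine similarity to a query vector. The three-dimensional vectors are made-up toy values standing in for real embeddings, and the function names are hypothetical; they are not taken from this repository's src/ modules.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, passages):
    """Sort (text, vector) pairs from most to least similar to the query."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in passages]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: invented vectors, NOT output from a real embedding model.
toy_passages = [
    ("We could not afford school fees this year.", [0.9, 0.1, 0.0]),
    ("The clinic is a two-hour walk away.", [0.1, 0.9, 0.2]),
    ("Prices at the market keep rising.", [0.8, 0.2, 0.1]),
]
toy_query = [1.0, 0.0, 0.0]  # stands in for an embedded query about money

for text, score in rank_by_similarity(toy_query, toy_passages):
    print(f"{score:.2f}  {text}")
```

With real embeddings the vectors would have hundreds or thousands of dimensions, but the ranking step would likely look much the same.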

Who is this for?

Research staff who want to:

Use LLMs programmatically for qualitative analysis
Work with interview transcripts and qualitative data
Build AI-augmented coding and analysis tools
Understand embeddings and semantic similarity

No prior LLM experience is required. Basic Python knowledge is helpful.

Repository structure

llm-quali-coding/
├── docs/                    # Session guides with step-by-step instructions
│   ├── session_01_setup.md
│   ├── session_02_embeddings_rag.md
│   └── session_03_quali_coding.md
├── examples/                # Runnable Python scripts
│   ├── 01_test_connection.py
│   ├── 02_translate_transcript.py
│   ├── 03_create_embeddings.py
│   ├── 04_relevance_filtering.py
│   ├── 05_theme_classification_embeddings.py
│   ├── 06_extract_themes_llm.py
│   ├── 07_direct_coding_llm.py
│   ├── 08_nonverbal_coding_llm.py
│   └── 09_inductive_clustering.py
├── src/                     # Reusable code modules
│   ├── chunking.py
│   ├── coding.py
│   ├── embeddings.py
│   ├── llm_tasks.py
│   ├── openai_client.py
│   └── similarity.py
└── data/                    # Sample data for exercises
    ├── sample_transcripts/
    └── themes/

Training sessions

Day 1: Foundations

Session 01 (1 hour): Local setup and your first LLM API call
Session 02 (1 hour): Introduction to embeddings and RAG

Day 2: Qualitative Coding Applications

Session 03 (2 hours): Qualitative coding techniques with LLMs
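
To give a feel for what direct coding with an LLM can involve, here is a minimal sketch that builds a prompt from a codebook and validates the model's reply. It makes no API call: the reply is a hard-coded string. The codebook, the prompt wording, and the function names are all assumptions made for illustration and may differ from what the session guide and examples/07_direct_coding_llm.py actually do.

```python
import json


def build_coding_prompt(passage, codebook):
    """Build a prompt asking a model to pick codes from a fixed codebook."""
    code_lines = "\n".join(f"- {name}: {desc}" for name, desc in codebook.items())
    return (
        "Assign zero or more of the following codes to the passage.\n"
        f"Codes:\n{code_lines}\n\n"
        f"Passage: {passage}\n\n"
        'Reply with JSON only, for example {"codes": ["code_name"]}.'
    )


def parse_codes(reply_text, codebook):
    """Keep only codes that exist in the codebook; return [] on a bad reply."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    codes = data.get("codes", [])
    if not isinstance(codes, list):
        return []
    # Drop anything the model invented that is not in the codebook.
    return [c for c in codes if isinstance(c, str) and c in codebook]


# Hypothetical codebook and a simulated model reply (no network access needed).
codebook = {
    "financial_stress": "Mentions of money worries or inability to pay",
    "access_to_services": "Distance, cost, or barriers to reaching services",
}
prompt = build_coding_prompt("We could not afford school fees this year.", codebook)
simulated_reply = '{"codes": ["financial_stress", "made_up_code"]}'

print(prompt)
print(parse_codes(simulated_reply, codebook))
```

Filtering the reply against the codebook is one simple guard against a model returning codes that were never defined.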

Getting started

Clone this repository
Follow the setup instructions in docs/session_01_setup.md
Complete sessions in order

Each session builds on the previous one.

Requirements

Python 3.11+
OpenAI API key
Code editor (VS Code or Positron recommended)
Basic command line familiarity
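
Before starting Session 01, a quick check along these lines can confirm the first two requirements. It is only a sketch: the function name is hypothetical, and it assumes the key is exposed as the OPENAI_API_KEY environment variable, which is the name the official OpenAI Python library reads by default. The setup guide may configure the key differently, for example through a .env file.

```python
import os
import sys


def check_environment(version_info=None, env=None):
    """Return a list of human-readable problems; an empty list means all good."""
    version_info = sys.version_info if version_info is None else version_info
    env = os.environ if env is None else env

    problems = []
    if tuple(version_info[:2]) < (3, 11):
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python 3.11+ is required, found {found}")
    # Assumption: the key lives in OPENAI_API_KEY.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


for problem in check_environment():
    print("Problem:", problem)
```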

Support

Session guides contain detailed instructions and troubleshooting
Example scripts include inline comments
Common issues are documented in each session guide

Learning outcomes

By the end of this training, you will:

Have a working local development environment for calling LLM APIs
Understand how to use LLM APIs programmatically
Know what embeddings are and when to use them for qualitative analysis
Be able to apply multiple LLM-based coding techniques to qualitative data
Have practical experience with theme extraction, classification, and clustering

Start here: docs/session_01_setup.md

PovertyAction/llm-quali-coding
