Podcast transcription to interactive learning module automation
You've just finished recording a two-hour podcast episode. Now you face the familiar grind: send it to a transcription service, wait for results, manually edit the transcript, then build some kind of learning module around it. If you do this weekly, you're looking at four to six hours of manual work per episode.
What if that entire chain ran automatically? You record the episode, upload the file, and within an hour you have a polished transcript sitting in a Google Doc, key concepts extracted, and an interactive learning module ready for your audience.
This is where combining Hour One and Whisper API becomes powerful. Hour One is typically known for AI video generation, but it also provides a robust API for creating interactive educational content. Whisper API handles the transcription accurately and affordably. The missing piece is orchestration; that's where tools like Zapier, n8n, Make, or Claude Code come in. Each approach has trade-offs around cost, complexity, and flexibility. In this guide, we'll build this workflow and show you which orchestration tool makes sense for your situation.
The Automated Workflow
How the workflow works
The overall logic is straightforward:
- Upload a podcast audio file to a cloud storage service (Google Drive or S3)
- Trigger a webhook when the file appears
- Send the audio to Whisper API for transcription
- Parse the transcript to identify key learning points
- Pass the structured data to Hour One's API to generate an interactive learning module
- Store the finished module in your database or learning management system
The challenge is connecting these steps without manual intervention. We'll walk through three different orchestration approaches, starting with the simplest.
Option 1: Zapier (Easiest, Medium Cost)
Zapier works well if you want to avoid writing code entirely. You'll build a multi-step Zap using Zapier's built-in actions and Webhooks by Zapier.
Step 1: Set up a trigger in Google Drive. When a new file appears in a specific folder with a .mp3 or .wav extension, Zapier detects it.
Step 2: Call Whisper API using Zapier's Webhooks action. You'll need your OpenAI API key here. The webhook request will look like this:
Method: POST
URL: https://api.openai.com/v1/audio/transcriptions
Headers:
  Authorization: Bearer YOUR_OPENAI_API_KEY
Body (multipart/form-data):
  file: [file from Google Drive]
  model: whisper-1
  language: en
  response_format: verbose_json
The verbose_json response format gives you timestamps and confidence scores, which are useful for building the learning module.
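If you later post-process the response outside Zapier (for example in a small script), those timestamps can be turned into chapter markers. A minimal sketch, assuming the documented verbose_json shape (a top-level segments list with start, end, and text); the sample data here is invented:

```python
# Invented sample in the verbose_json shape: top-level text/duration
# plus a segments list with per-segment timestamps.
response = {
    "text": "Welcome to the show. Today we cover API design.",
    "duration": 12.4,
    "segments": [
        {"start": 0.0, "end": 5.1, "text": "Welcome to the show."},
        {"start": 5.1, "end": 12.4, "text": "Today we cover API design."},
    ],
}

def timestamped_lines(resp):
    """Format each segment as [MM:SS] text, handy for chapter markers."""
    lines = []
    for seg in resp["segments"]:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return lines

print(timestamped_lines(response))
```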
Step 3: Parse the Whisper response using Zapier's built-in JSON tools. Extract the transcript text. If you need to identify key concepts (topics that appear most frequently or with the highest confidence), you can use a simple text processing step.
Step 4: Call Hour One's module creation API. First, format your data according to Hour One's specification:
{
  "title": "Podcast Episode Title",
  "description": "Brief description from podcast metadata",
  "content": {
    "transcript": "Full transcript from Whisper",
    "key_concepts": [
      "Concept 1",
      "Concept 2",
      "Concept 3"
    ],
    "learning_objectives": [
      "Learner will understand X",
      "Learner will be able to Y"
    ]
  },
  "interactive_elements": {
    "quiz_enabled": true,
    "discussion_prompts": true
  }
}
Then send this to Hour One's endpoint:
Method: POST
URL: https://api.hourone.ai/v1/learning-modules
Headers:
  Authorization: Bearer YOUR_HOURONE_API_KEY
  Content-Type: application/json
Body: [JSON structure above]
Step 5: Store the result. Hour One returns a module ID and a public URL. Use Zapier's Google Sheets or Airtable action to log the completed module. This gives you a permanent record and a way to distribute the link.
Trade-offs with Zapier: You're unlikely to hit Whisper rate limits at typical podcast volumes, but you will pay Zapier's premium-tier pricing. Zapier charges per task, so if you run this workflow 4 times per week, expect around £100–150 per month for Zapier alone, plus API costs for Whisper and Hour One. For occasional users (1–2 episodes per week) this is reasonable; for daily producers it gets expensive.
Option 2: n8n (More Control, Lower Cost)
n8n is open-source and self-hosted, so you only pay for server costs (often £5–20 per month on a small cloud instance). The trade-off is you need to set up and maintain the infrastructure.
Here's a complete n8n workflow. You'll create nodes in this order:
Node 1: Google Drive Trigger. Set it to poll a specific folder every 10 minutes for new .mp3 files.
Node 2: Read File from Google Drive. Use the file ID from Node 1 to download the actual audio data.
Node 3: Call Whisper API. Use n8n's HTTP Request node:
Method: POST
URL: https://api.openai.com/v1/audio/transcriptions
Headers:
  Authorization: Bearer YOUR_OPENAI_API_KEY
Body:
  file: [binary data from Node 2]
  model: whisper-1
  response_format: verbose_json
  prompt: "This is a podcast about [your topic]. Use industry terminology where appropriate."
The prompt parameter is optional but helpful; it improves accuracy when Whisper encounters domain-specific jargon.
Node 4: Extract Key Information. Use n8n's JavaScript node to process the transcript:
const transcript = $input.all()[0].json.text;

// Count occurrences of each long word
const words = transcript.split(/\s+/);
const wordFreq = {};
words.forEach(word => {
  const clean = word.toLowerCase().replace(/[^\w]/g, '');
  if (clean.length > 5) {
    wordFreq[clean] = (wordFreq[clean] || 0) + 1;
  }
});

// Take the five most frequent as rough "key concepts"
const topWords = Object.entries(wordFreq)
  .sort((a, b) => b[1] - a[1])
  .slice(0, 5)
  .map(entry => entry[0]);

// n8n Code nodes must return an array of items, each wrapped in { json: ... }
return [{
  json: {
    transcript: transcript,
    key_concepts: topWords,
    word_count: words.length,
    duration_seconds: $input.all()[0].json.duration || null
  }
}];
This extracts the 5 most frequent long words as "key concepts". For more sophisticated concept extraction, you'd integrate a second API call to Claude or GPT-4, but this works for a quick version.
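If you'd rather run the same heuristic outside n8n (for example in the Python script from Option 4), a sketch with a small stop-word filter so filler words don't crowd out real topics; the stop words shown are just a starting set, not an exhaustive list:

```python
import re
from collections import Counter

# Illustrative stop words: common long filler words in spoken English
STOP_WORDS = {"really", "actually", "because", "something", "people",
              "things", "going", "about", "there", "thats"}

def extract_key_concepts(transcript, top_n=5, min_length=6):
    """Return the top_n most frequent long words, excluding stop words."""
    words = re.findall(r"[a-z]+", transcript.lower())
    counts = Counter(w for w in words
                     if len(w) >= min_length and w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]

sample = ("Kubernetes scheduling is about Kubernetes pods. Scheduling "
          "decisions depend on resource requests and scheduling policies.")
print(extract_key_concepts(sample, top_n=2))
```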
Node 5: Call Hour One API:
Method: POST
URL: https://api.hourone.ai/v1/learning-modules
Headers:
  Authorization: Bearer YOUR_HOURONE_API_KEY
  Content-Type: application/json
Body:
  title: [from Google Drive metadata or podcast name]
  description: [generate from first 200 words of transcript]
  content:
    transcript: [from Node 4]
    key_concepts: [from Node 4]
    learning_objectives: [hardcode or generate]
  interactive_elements:
    quiz_enabled: true
    discussion_prompts: true
Node 6: Store in Airtable. Log the module ID, transcript, and Hour One URL:
Table: Podcast Modules
Fields:
  Episode Title
  Transcript
  Hour One Module URL
  Status: "Complete"
  Date Created: [now]
With n8n, you control everything. The workflow is more transparent, and costs are lower. However, you're responsible for keeping your n8n instance running and updated.
Option 3: Make (Formerly Integromat)
Make sits between Zapier and n8n in terms of complexity and cost. It charges per operation (a single action, such as one API call) rather than per task, and operations are pooled across your entire Make account: a workflow run that performs 50 actions consumes 50 operations from that pool. Because Zapier prices per task on monthly tiers, Make can be cheaper at scale.
A Make scenario is structured similarly to n8n, but the interface is more visual. You'd build modules:
- Google Drive trigger (new file)
- HTTP Request to Whisper API (Make's "HTTP - Make a request" module)
- Text Parser to extract concepts
- HTTP Request to Hour One API
- Airtable to store results
The Whisper API call in Make looks like:
Module Type: HTTP - Make a request
Method: POST
URL: https://api.openai.com/v1/audio/transcriptions
Headers:
  Authorization: Bearer [your key]
Body (multipart):
  file: [file from Google Drive module]
  model: whisper-1
  response_format: verbose_json
Make's pricing works out to roughly £0.10 per 1,000 operations, with entry plans starting around £10 per month. At 50 operations per run and 4 runs per week (roughly 900 operations per month), the cheapest tier covers you comfortably, so expect around £10–15 per month for Make, plus API costs. That's cheaper than Zapier, with slightly more setup, and less control than n8n.
Option 4: Claude Code (Most Flexible, Steepest Learning Curve)
If you're comfortable writing Python, a short script (which Claude Code, Anthropic's agentic coding tool, can help you write and maintain) combined with a simple scheduler gives you complete flexibility and low cost.
Here's a complete Python script that orchestrates the workflow, using the current (v1) OpenAI Python SDK:

import json
import os

import anthropic
import requests
from openai import OpenAI

openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
hourone_key = os.getenv("HOURONE_API_KEY")
claude_client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

def transcribe_audio(audio_file_path):
    """Call Whisper API to transcribe audio."""
    with open(audio_file_path, "rb") as audio:
        response = openai_client.audio.transcriptions.create(
            model="whisper-1",
            file=audio,
            response_format="verbose_json",
        )
    # verbose_json responses include the full text and the audio duration
    return response.text, getattr(response, "duration", None)

def extract_concepts(transcript):
    """Use Claude to intelligently extract key concepts."""
    response = claude_client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        messages=[{
            "role": "user",
            "content": (
                "Extract 5 key learning concepts from this podcast transcript. "
                "Return only a JSON array of strings.\n\n"
                f"Transcript: {transcript[:2000]}"
            ),
        }],
    )
    try:
        concepts = json.loads(response.content[0].text)
    except json.JSONDecodeError:
        # Fall back to placeholders if Claude's reply isn't valid JSON
        concepts = ["Topic 1", "Topic 2", "Topic 3"]
    return concepts

def create_learning_module(title, transcript, concepts):
    """Call Hour One API to create an interactive module."""
    payload = {
        "title": title,
        "description": "Interactive module based on podcast episode",
        "content": {
            "transcript": transcript,
            "key_concepts": concepts,
            "learning_objectives": [
                f"Understand key concept: {c}" for c in concepts
            ],
        },
        "interactive_elements": {
            "quiz_enabled": True,
            "discussion_prompts": True,
        },
    }
    response = requests.post(
        "https://api.hourone.ai/v1/learning-modules",
        headers={
            "Authorization": f"Bearer {hourone_key}",
            "Content-Type": "application/json",
        },
        json=payload,
    )
    if response.status_code == 201:
        return response.json()["module_url"]
    raise Exception(f"Hour One API error: {response.text}")

def main(audio_file_path, episode_title):
    """Main orchestration function."""
    print(f"Processing: {episode_title}")

    # Step 1: Transcribe
    print("Transcribing audio...")
    transcript, duration = transcribe_audio(audio_file_path)
    print(f"Transcript complete ({duration} seconds)")

    # Step 2: Extract concepts
    print("Extracting key concepts...")
    concepts = extract_concepts(transcript)
    print(f"Concepts identified: {concepts}")

    # Step 3: Create learning module
    print("Creating interactive module...")
    module_url = create_learning_module(episode_title, transcript, concepts)
    print(f"Module created: {module_url}")

    return {
        "title": episode_title,
        "transcript": transcript,
        "concepts": concepts,
        "module_url": module_url,
    }

if __name__ == "__main__":
    result = main("podcast_episode.mp3", "Episode 42: Advanced Topics")
    print(f"\nWorkflow complete! Module: {result['module_url']}")
Deploy this on a cloud function (AWS Lambda, Google Cloud Functions, or Railway.app) with a scheduled trigger (Cloud Scheduler, GitHub Actions, or similar) running once per hour to check for new files.
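If you'd rather not set up a cloud function, the same hourly check can run as a plain polling loop on any machine. A minimal sketch; the folder name and log file are illustrative choices, and the process callback stands in for the main() function above:

```python
import time
from pathlib import Path

WATCH_DIR = Path("incoming_audio")      # hypothetical drop folder
PROCESSED_LOG = Path("processed.txt")   # simple record of handled files

def pending_files(watch_dir=WATCH_DIR, processed_log=PROCESSED_LOG):
    """Return .mp3 files in watch_dir not yet listed in processed_log."""
    seen = set()
    if processed_log.exists():
        seen = set(processed_log.read_text().splitlines())
    return [p for p in sorted(watch_dir.glob("*.mp3")) if p.name not in seen]

def run_once(process, watch_dir=WATCH_DIR, processed_log=PROCESSED_LOG):
    """One polling pass: process each new file, then record it as done."""
    for audio in pending_files(watch_dir, processed_log):
        process(str(audio), audio.stem)  # e.g. pass the main() function above
        with processed_log.open("a") as log:
            log.write(audio.name + "\n")

def watch_forever(process, poll_seconds=3600):
    """Hourly polling loop; use this as your entry point."""
    while True:
        run_once(process)
        time.sleep(poll_seconds)
```

The processed-files log makes each pass idempotent, so a crash mid-run never reprocesses completed episodes.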
Cost with Claude Code: Storage and function execution are pennies per month. API costs are the same (Whisper + Hour One), but you avoid platform charges entirely. This approach wins on cost if you're running this regularly.
The Manual Alternative
You don't need to automate everything. Some teams prefer to:
- Upload the audio file to Whisper API manually via the web interface or a simple Python script
- Download the transcript and review it in a text editor
- Manually identify key concepts and write learning objectives
- Fill out an Hour One form to create the learning module
This gives you quality control at each step. If your podcast covers sensitive material or requires expert-level curation, the extra hour per episode might be worthwhile. The trade-off is hours of labour in exchange for peace of mind, and that cost grows quickly if you scale to multiple episodes per week.
Pro Tips
Error handling and retries. Whisper API rejects files larger than 25 MB, and long uploads can time out. If you're processing long-form audio, split it into chunks before uploading. In n8n, use the Split In Batches node. In your Python script, use the pydub library:
from pydub import AudioSegment

# Split the episode into 20-minute chunks that stay under Whisper's size limit
audio = AudioSegment.from_file("long_podcast.mp3")
chunk_length = 20 * 60 * 1000  # 20 minutes in milliseconds

for i in range(0, len(audio), chunk_length):
    chunk = audio[i:i + chunk_length]
    chunk.export(f"chunk_{i // chunk_length}.mp3", format="mp3")
    # Process each chunk through Whisper separately, then stitch transcripts
Rate limiting. Whisper's rate limits depend on your OpenAI account tier, so check the limits shown in your OpenAI dashboard before scaling up. If you're processing many episodes per day, stagger your workflow: most orchestration platforms (Zapier, n8n, Make) have delay nodes, and a 30-second delay between Whisper calls is a safe buffer.
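In a script, the same stagger-and-retry behaviour can be a small wrapper that doubles the wait after each failure. A sketch, not tied to any SDK's built-in retry support:

```python
import time

def with_retries(fn, max_attempts=4, base_delay=30):
    """Wrap fn so failures retry after base_delay, 2x, 4x... seconds."""
    def wrapped(*args, **kwargs):
        delay = base_delay
        for attempt in range(1, max_attempts + 1):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                if attempt == max_attempts:
                    raise  # out of attempts: surface the original error
                print(f"Attempt {attempt} failed ({exc}); retrying in {delay}s")
                time.sleep(delay)
                delay *= 2
    return wrapped

# Usage sketch: safe_transcribe = with_retries(transcribe_audio)
```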
Cost optimisation. Whisper charges per minute of audio, roughly £0.005 per minute (OpenAI lists $0.006). A 2-hour podcast costs well under £1 in transcription fees. Hour One's API pricing depends on your plan; interactive modules typically cost £2–5 each. At 4 episodes per week, expect roughly £50–100 per month in API costs, most of it Hour One module fees. If this seems high, consider limiting concept extraction to the first 30 minutes of audio by splitting the file and transcribing only the opening chunk.
Validation and quality checks. Add a step that checks transcript word count and alerts you if it's unusually low (which might indicate a failed transcription). In n8n or Make, add a conditional node that checks transcript.length > 5000 before proceeding to Hour One.
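In a script, that sanity check might compare word density against the audio length. The 100-words-per-minute floor below is a rough heuristic for spoken audio, not an established threshold:

```python
def transcript_looks_valid(transcript, duration_seconds, min_words_per_minute=100):
    """Return True if the word count is plausible for the audio length."""
    if not transcript or duration_seconds <= 0:
        return False
    words_per_minute = len(transcript.split()) / (duration_seconds / 60)
    return words_per_minute >= min_words_per_minute

# A 60-second clip with only five words should be flagged as suspicious
print(transcript_looks_valid("only five words were heard", 60))
```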
Storage and distribution. Don't rely solely on Hour One's public URLs for permanent storage. Always log transcripts and module data in a database you own (Airtable, Google Sheets, or a self-hosted database). This protects against API changes or service discontinuations.
Cost Breakdown
| Tool | Plan Needed | Monthly Cost | Notes |
|---|---|---|---|
| Orchestration | |||
| Zapier | Professional or higher | £70–200 | Per-task pricing adds up quickly. Need paid tier for multi-step workflows. |
| n8n | Self-hosted (cloud VM) | £5–20 | You manage infrastructure. Cheapest at scale. |
| Make | Pay-as-you-go operations | £10–30 | £0.10 per 1,000 operations. Scales with usage. |
| Claude Code | None (per-API-call) | £0 | Costs are only in the actual API calls. |
| APIs | |||
| Whisper (OpenAI) | Pay-as-you-go | £10–15 | £0.006 per minute of audio. 4 episodes/week at 2 hours each ≈ £12 per month. |
| Hour One | Depends on plan | £0–200+ | Pricing varies; basic module creation can be £2–5 per module, or included in higher tiers. |
| Google Drive / Storage | Free–Business | £0–12 | Free tier sufficient for most use cases. |
| Total (4 episodes/week) | — | £45–300 | Depends on orchestration choice. n8n + Claude Code is cheapest; Zapier is most expensive. |