Alchemy Recipe · Intermediate · Workflow

Podcast transcription to interactive learning module automation

24 March 2026

Introduction

You've just recorded a two-hour podcast episode, but transcribing it manually would take days, and the resulting transcript sits inert in a document. Converting that raw audio into something your audience can actually learn from, explore, and reference requires multiple manual steps: transcription, editing, structuring, formatting, and uploading to a learning platform. Each handoff is a friction point where the process stalls.

What if that entire journey happened automatically? You drop the finished episode into a folder, and within hours, an interactive learning module with searchable transcripts, highlighted key concepts, and comprehension checks is ready for your course platform. No copy-pasting, no re-formatting, no middle steps.

This workflow combines two powerful AI tools with an orchestration platform to turn raw audio into educational content in a single, hands-off process. It's practical, it's repeatable, and it eliminates the tedious middle work that kills productivity.

The Automated Workflow

The workflow has three main stages: capturing the podcast file, transcribing it accurately, and converting the transcript into a structured learning module. We'll use Whisper API for transcription and Hour One for generating an interactive learning module from the transcript. The orchestration platform ties everything together, handling file transfers, API calls, and conditional logic without you touching anything.

Which Orchestration Tool to Choose

For this workflow, I'd recommend n8n if you're self-hosting or want full control over your data; Make if you want the fastest setup with minimal configuration; or Zapier if you prefer a managed, no-code interface. Claude Code is excellent for prototyping the logic before implementing it in your chosen platform. All three can handle file uploads, API authentication, and the multi-step nature of this automation.

I'll show examples using n8n and Make, as they give you the most control over error handling and data transformation.

The Data Flow

  1. A podcast file arrives (via upload, email, or a cloud storage trigger).
  2. Whisper API transcribes the audio with segment-level timestamps.
  3. The transcript is formatted and enriched (extracting key phrases, adding timestamps).
  4. Hour One generates an interactive HTML module with embedded transcript, glossary, and mini-quizzes.
  5. The module is uploaded to your learning platform or cloud storage.

Setting Up the Trigger

Your workflow starts with a file arriving somewhere. That "somewhere" could be a Zapier-monitored email, a Google Drive folder, an n8n webhook, or a direct file upload. Let's assume the file lands in a Google Drive folder or is uploaded via a form.

If using Make, add a "Google Drive" module set to trigger on file creation in a specified folder:


Trigger: Google Drive > New file in folder
Watched folder: /Podcasts
File type: Audio (mp3, m4a, wav, etc.)

If using n8n, create a webhook or use Google Drive polling:


Trigger: Google Drive (or Webhook)
Event: New file created
Folder ID: [your-folder-id]

Step 1: Download and Prepare the Audio File

Before sending to Whisper, your orchestration platform needs to download the file and pass it along. In Make, use the Google Drive module's output directly. In n8n, you'll fetch the file content:


Node: HTTP Request
Method: GET
URL: https://www.googleapis.com/drive/v3/files/{{ $json.id }}?alt=media
Authentication: OAuth 2.0 (Google Drive)

The output is the raw audio file, which we'll pass along as binary data (multipart form data) in the Whisper API call.

Step 2: Transcribe with Whisper API

OpenAI's Whisper API accepts audio files up to 25 MB. It returns a JSON object with the full transcript, timestamps, and optional language detection. Here's how to call it:


POST https://api.openai.com/v1/audio/transcriptions
Headers:
  Authorization: Bearer sk-[your-api-key]
Body:
  file: [binary-audio-file]
  model: whisper-1
  response_format: verbose_json
  language: en
  timestamp_granularities[]: segment

In Make, create an "OpenAI" module or use a generic HTTP Request:


Module: HTTP Request
Method: POST
URL: https://api.openai.com/v1/audio/transcriptions
Authentication: API Key (Authorization: Bearer)
Content type: Multipart Form Data
Fields:
  file: [audio-file-from-previous-step]
  model: whisper-1
  response_format: verbose_json

In n8n, use a similar HTTP Request node:


Node: HTTP Request
Method: POST
URL: https://api.openai.com/v1/audio/transcriptions
Authentication: Header Auth (Authorization: Bearer)
Headers:
  Content-Type: multipart/form-data
Body:
  file: [binary-data]
  model: whisper-1
  response_format: verbose_json

Whisper's response looks like this:

{
  "text": "Welcome to the podcast. Today we're discussing artificial intelligence...",
  "language": "en",
  "duration": 7215.5,
  "segments": [
    {
      "id": 0,
      "seek": 0,
      "start": 0.0,
      "end": 5.2,
      "text": "Welcome to the podcast.",
      "avg_logprob": -0.25,
      "compression_ratio": 1.2,
      "no_speech_prob": 0.001
    }
  ]
}

Step 3: Extract and Structure the Transcript

The Whisper output is functional but needs formatting for Hour One and for readability. Use a script node or code step to extract key information:

{
  "full_transcript": "[from Whisper]",
  "duration_seconds": 7215,
  "segments": "[array of segments with timings]",
  "key_phrases": ["AI", "machine learning", "automation"],
  "language": "en"
}

In n8n, add a "Code" node (the newer replacement for the "Function" node):


// Whisper's verbose_json response exposes the transcript as `text`.
const transcript = $json.text;
const segments = $json.segments;
const duration = $json.duration;

// Naive key-phrase extraction: the first ten unique long words.
// Swap in a proper keyword extractor (or an LLM call) for real use.
const keyPhrases = [...new Set(
  transcript
    .toLowerCase()
    .split(/\s+/)
    .map(word => word.replace(/[^a-z0-9-]/g, ''))
    .filter(word => word.length > 6)
)].slice(0, 10);

// n8n Code nodes return an array of items, each wrapped in { json: ... }.
return [{
  json: {
    full_transcript: transcript,
    segments,
    duration_seconds: Math.round(duration),
    key_phrases: keyPhrases,
    title: "Podcast Episode - " + new Date().toLocaleDateString()
  }
}];

In Make, use the "Text Parser" module or a "Set Multiple Variables" module to structure the data.
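For readability inside the module, each segment can also be given a human-friendly timestamp label. Here is a small sketch in plain JavaScript (the language n8n's Code nodes run); `formatTimestamp` and `segmentsToLines` are hypothetical helper names, operating on the `segments` array shape from Whisper's verbose_json response:

```javascript
// Convert seconds into an H:MM:SS label for display alongside each segment.
function formatTimestamp(seconds) {
  const h = Math.floor(seconds / 3600);
  const m = Math.floor((seconds % 3600) / 60);
  const s = Math.floor(seconds % 60);
  return `${h}:${String(m).padStart(2, "0")}:${String(s).padStart(2, "0")}`;
}

// Turn Whisper's segments array into timestamped transcript lines,
// ready to embed in the learning module.
function segmentsToLines(segments) {
  return segments.map(seg => `[${formatTimestamp(seg.start)}] ${seg.text.trim()}`);
}
```

Feeding these lines into the module body gives readers clickable reference points without any manual timestamp work.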

Step 4: Generate the Interactive Module with Hour One

Hour One creates branded, interactive video content from text, but it also accepts transcripts to generate learning modules. You'll need to call their API to create a module object:


POST https://api.hourone.com/api/v1/modules
Headers:
  Authorization: Bearer [your-hour-one-api-key]
  Content-Type: application/json
Body:
  {
    "title": "Podcast Episode: AI Fundamentals",
    "transcript": "[full-transcript-from-whisper]",
    "format": "learning_module",
    "interactive_elements": {
      "include_quiz": true,
      "include_glossary": true,
      "include_timestamps": true
    },
    "branding": {
      "accent_colour": "#2563EB",
      "logo_url": "https://yoursite.com/logo.png"
    }
  }

Note: Hour One's exact API endpoints may vary; check their documentation for the most current module creation endpoint. The principle remains the same: send structured transcript data and receive back a module ID or URL.

In Make, add an HTTP module:


Module: HTTP Request
Method: POST
URL: https://api.hourone.com/api/v1/modules
Headers:
  Authorization: Bearer [key]
  Content-Type: application/json
Body:
  title: [from-previous-step]
  transcript: [from-whisper]
  format: learning_module
  interactive_elements:
    include_quiz: true
    include_glossary: true
    include_timestamps: true

Hour One returns a response with the module ID and URL:

{
  "id": "mod_12345abcde",
  "url": "https://learn.hourone.com/mod_12345abcde",
  "status": "processing",
  "created_at": "2024-01-15T10:30:00Z"
}

Step 5: Upload to Your Learning Platform

Once the module is created, store the module URL and metadata in your learning management system or database. If you're using a simple approach, upload the URL to a Google Sheet or Airtable:


POST https://api.airtable.com/v0/[base-id]/[table-name]
Headers:
  Authorization: Bearer [your-airtable-pat]
  Content-Type: application/json
Body:
  {
    "records": [
      {
        "fields": {
          "Episode Title": "AI Fundamentals",
          "Module URL": "https://learn.hourone.com/mod_12345abcde",
          "Transcript": "[full-transcript]",
          "Duration": 7215,
          "Date Created": "2024-01-15T10:30:00Z"
        }
      }
    ]
  }

Or if your platform has a REST API (Teachable, Kajabi, etc.), post the module directly there.
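The Airtable request body above can be assembled in a small code step before the HTTP call. This is an illustrative sketch; `buildAirtableBody` and the `episode` object shape are assumptions, but the field names match the table layout shown:

```javascript
// Build the Airtable create-records payload from the module and transcript data.
// Field names must match your Airtable table exactly; adjust to your own base.
function buildAirtableBody(episode) {
  return {
    records: [
      {
        fields: {
          "Episode Title": episode.title,
          "Module URL": episode.moduleUrl,
          "Transcript": episode.transcript,
          "Duration": episode.durationSeconds,
          "Date Created": episode.createdAt
        }
      }
    ]
  };
}
```

Building the payload in code (rather than inline field mapping) makes it easier to add fields later without editing the HTTP node itself.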

The Complete n8n Workflow Structure

A complete n8n workflow looks like this:

  1. Webhook or Google Drive trigger (file uploaded).
  2. HTTP Request node: download audio file.
  3. HTTP Request node: call Whisper API.
  4. Function node: structure the transcript data.
  5. HTTP Request node: create Hour One module.
  6. HTTP Request node: post to Airtable or your LMS.
  7. Slack notification (optional): alert you when complete.

Error Handling and Retry Logic

Both Make and n8n allow you to add error handling. For example, if Whisper fails due to audio quality, you might want to retry or send a notification:


Node: Error Handler
If previous node fails:
  1. Send email to admin
  2. Mark record as "needs-review" in database
  3. Stop workflow (do not proceed to Hour One)

In n8n, there's no dedicated "Catch" node; instead, enable "Continue On Fail" on the Whisper HTTP Request and branch on its error output, or link a separate error workflow that starts with an Error Trigger node:


Node: Error Trigger (in a linked error workflow)
Fires when: any node in the main workflow fails
On error:
  - Send Slack message: "Transcription failed for [filename]"
  - Update Airtable: set Status = "error"
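For transient failures (rate limits, flaky audio uploads), a retry with exponential backoff often succeeds without human intervention. A minimal sketch in generic JavaScript, adaptable inside an n8n Code node; `retryWithBackoff` is a hypothetical helper, not a built-in:

```javascript
// Generic retry helper with exponential backoff. `fn` is any async
// operation, e.g. the Whisper API call. Waits baseDelayMs, 2x, 4x, ...
// between attempts, and rethrows the last error if all attempts fail.
async function retryWithBackoff(fn, { retries = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise(resolve => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Reserve retries for errors that can plausibly clear on their own; a 401 or a malformed request should fail fast to the notification path instead.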

The Manual Alternative

If you prefer more control over each step, or if your workflow requires custom edits before publishing, you can run this semi-manually:

  1. Manually download the podcast audio.
  2. Upload to Whisper API or use OpenAI's platform directly to transcribe.
  3. Download the transcript, edit it in a text editor or Google Docs, add context or corrections.
  4. Create a module in Hour One's interface, pasting the edited transcript.
  5. Review the generated module, customise the quiz questions, adjust branding.
  6. Upload the module link to your learning platform.

This approach takes roughly 30 to 45 minutes per podcast, whereas the automated workflow requires zero ongoing effort once set up. The tradeoff is flexibility versus speed.

Pro Tips

Rate Limits and Cost Optimisation

Whisper API charges per minute of audio processed ($0.006 per minute at the time of writing), so a two-hour episode costs roughly $0.72. If you're processing multiple podcasts, batch them in a single workflow run to avoid repeated authentication calls. Make and n8n both support bulk operations: query all new files in a folder, then loop through them in a single workflow execution.

Hour One's pricing typically depends on module generation volume. Check with their sales team about bulk discounts if you're automating more than 20 modules per month.

File Naming and Metadata

Include the episode title, date, and speaker names in the file name or as metadata in your trigger. For example, name your files as: 2024-01-15_AI-Fundamentals_John-Doe.mp3. Your workflow can then extract this information programmatically and populate it in the module metadata without additional manual input.
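Extracting that metadata is a one-line regex in a code step. A sketch assuming exactly the naming convention above; `parseEpisodeFilename` is a hypothetical helper name:

```javascript
// Parse "YYYY-MM-DD_Episode-Title_Speaker-Name.ext" into structured metadata.
// Returns null if the filename doesn't follow the convention, so the workflow
// can fall back to default values instead of failing.
function parseEpisodeFilename(filename) {
  const match = filename.match(/^(\d{4}-\d{2}-\d{2})_([^_]+)_([^_.]+)\.\w+$/);
  if (!match) return null;
  const dehyphenate = s => s.replace(/-/g, " ");
  return {
    date: match[1],
    title: dehyphenate(match[2]),
    speaker: dehyphenate(match[3])
  };
}
```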

Handling Long Episodes

If your podcast audio exceeds Whisper's 25 MB file limit, split it before uploading. Most orchestration platforms can trigger a splitting step when the file size exceeds a threshold:


If file size > 25 MB:
  Split audio into 20-minute segments
  Run Whisper transcription on each segment
  Concatenate transcripts
Else:
  Upload entire file to Whisper
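The branching above can be sketched as a planning function. `planSegments` is illustrative; in practice the actual audio splitting would be handed off to a tool such as ffmpeg, with these boundaries as its cut points:

```javascript
// Whisper's hard limit is 25 MB per request.
const WHISPER_LIMIT_BYTES = 25 * 1024 * 1024;

// Plan fixed-length segments (default 20 minutes) when the file is too large;
// otherwise return a single segment covering the whole episode.
function planSegments(fileSizeBytes, durationSeconds, segmentSeconds = 20 * 60) {
  if (fileSizeBytes <= WHISPER_LIMIT_BYTES) {
    return [{ start: 0, end: durationSeconds }];
  }
  const segments = [];
  for (let start = 0; start < durationSeconds; start += segmentSeconds) {
    segments.push({ start, end: Math.min(start + segmentSeconds, durationSeconds) });
  }
  return segments;
}
```

When concatenating the per-segment transcripts afterwards, remember to offset each segment's timestamps by its start time so the merged transcript stays aligned with the original audio.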

Testing and Monitoring

Before running this on production episodes, test it end-to-end with a short 10-minute practice podcast. Verify that Whisper produces clean, accurate transcripts; that Hour One generates usable modules; and that your learning platform receives the data correctly. Add logging at each step so you can debug failures.

In Make, use the "Run once" or "Trigger now" feature for testing. In n8n, use the "Execute workflow" button and inspect the output of each node.

Language and Localisation

Whisper supports multiple languages. If you produce podcasts in more than one language, specify the language code in the API request, or let Whisper auto-detect. For multilingual workflows, you might want Hour One to generate modules in different languages as well, which requires passing a language parameter to their API.

Cost Breakdown

Tool | Plan needed | Monthly cost | Notes
Whisper API | Pay-as-you-go | $0.006 per minute of audio | Two 2-hour episodes ≈ $1.44. No minimum.
Hour One | Starter or Pro | £50–£200+ | Depends on module volume and branding customisation. Contact sales for exact pricing.
n8n (self-hosted) | Open source + server | £0–£50 | Free software; you pay for hosting. Roughly £10–£50 per month on shared hosting.
Make | Pro or higher | £20–£50 | Pro plan includes 10,000 operations per month. Sufficient for 10–20 podcast automations.
Zapier | Pro or Team | £25–£50 | Similar operation limits; slower than Make for large file transfers.
Google Drive | Free or Workspace | £0–£10 | Free for personal; £10–£20 per user if part of Google Workspace.
Airtable | Free or Pro | £0–£20 | Free tier supports basic storage; Pro adds more records and automation.
Total (estimated) | All tools | £80–£160 per month | Assumes two podcasts per week and shared hosting. Scales with volume.

For a small podcast operation with one episode per week, total cost is roughly £50–£100 per month, including all tools and hosting. For a larger operation with daily content, costs scale with Whisper API usage but remain modest compared to hiring a transcription or instructional design service.
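To put a number on the variable part of that budget, here is a rough estimator for the Whisper line item only. The 4.33 weeks-per-month factor and the $0.006-per-minute rate are assumptions; check current OpenAI pricing before relying on the figure:

```javascript
// Estimate monthly Whisper transcription cost in USD.
// episodesPerWeek and hoursPerEpisode describe your publishing cadence;
// ratePerMinute defaults to Whisper's per-minute price at time of writing.
function whisperMonthlyCostUSD(episodesPerWeek, hoursPerEpisode, ratePerMinute = 0.006) {
  const minutesPerMonth = episodesPerWeek * 4.33 * hoursPerEpisode * 60;
  return minutesPerMonth * ratePerMinute;
}
```

Two 2-hour episodes a week works out to a few dollars a month, which is why the fixed subscriptions (Hour One, Make or hosting) dominate the budget, not transcription.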

This workflow is practical because it eliminates handoffs, reduces transcription costs, and transforms raw audio into structured learning content without manual intervention. Once configured, it requires only that you upload your podcast file to a folder. Everything else happens automatically.