
Synthesia training videos from your Notion docs in 30 minutes

Extract your internal docs from the Notion API, convert them into video scripts with Claude, and generate presenter-led training videos in Synthesia. The whole pipeline takes about 30 minutes of setup and costs under £60/month.

Time saved: 3-5 hrs per training video
Monthly cost: ~£50-60 / $60-75

Your company has 40 internal docs in Notion covering onboarding, product features, compliance procedures, and tool guides. They are well written, up to date, and nobody reads them. New hires skim the first two pages and then ask a colleague. The docs are not the problem. The format is.

Short training videos with a presenter walking through each topic tend to get watched to the end far more often than text documents get read. The problem is that producing training videos traditionally means booking a room, setting up a camera, writing a script, recording, editing, and doing it again every time the content changes. That is a week of work per batch.

This workflow automates the entire chain. It pulls your docs from Notion's API, converts each one into a video script using Claude, and generates presenter-led training videos in Synthesia. When a doc changes, you re-run the script and get an updated video.

What you'll build

A Python script that:

  • Fetches pages from a Notion database via the API
  • Sends each page's content to Claude with a prompt that converts documentation into a concise presenter script
  • Creates a Synthesia video for each script, with an AI avatar, branded background, and chapter markers
  • Outputs a list of video URLs ready to embed in your LMS or internal wiki

Prerequisites

  • A Notion workspace with a database of docs you want to convert. You need a Notion integration token with read access to the database. Create this at notion.so/my-integrations.
  • A Synthesia account. Synthesia's pricing is by request, but the Starter tier typically runs around £22/month and includes 10 minutes of video per month. For a batch of 10 short training videos (2-3 minutes each), you will need the Creator tier or higher.
  • An Anthropic API key for Claude. The script calls Claude to convert each doc into a script. At roughly 2000 input tokens and 800 output tokens per doc, a batch of 40 docs costs well under $1 in total.
  • Python 3.10+ with requests, anthropic, and python-dotenv.
  • About 30 minutes for initial setup. Subsequent runs are unattended.
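The script reads its credentials from environment variables, which python-dotenv loads from a .env file. A minimal sketch of a start-up check follows. It assumes the variable names used in the steps below plus ANTHROPIC_API_KEY, which the anthropic client reads when no key is passed explicitly. The helper name missing_settings is hypothetical.

```python
import os

# Names follow the later steps of this recipe. ANTHROPIC_API_KEY is the
# variable the anthropic client falls back to when no key is passed in.
REQUIRED_SETTINGS = (
    "NOTION_TOKEN",
    "NOTION_DATABASE_ID",
    "ANTHROPIC_API_KEY",
    "SYNTHESIA_API_KEY",
)


def missing_settings(env=None, required=REQUIRED_SETTINGS):
    """Return the required names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]
```

Calling missing_settings() after load_dotenv() and stopping when the list is non-empty gives a clearer failure than a 401 halfway through a batch.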

How to build it

Step 1: Set up the Notion integration

Create a new integration at notion.so/my-integrations. Give it a name like "Video Generator", select your workspace, and grant it "Read content" capability. Copy the integration token.

Go to your Notion database, click the three-dot menu, and choose Add connections > Your integration name. Without this step, the integration cannot see the database, and queries fail or come back empty even though the token is valid.

Note the database ID from the URL. It is the 32-character hex string between the last slash and the question mark.
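If you would rather not copy the ID by hand, the rule above can be sketched as a small helper. The function name database_id_from_url is hypothetical, and the example URL in the usage note uses a made-up ID.

```python
import re
from urllib.parse import urlparse


def database_id_from_url(url):
    """Return the 32-character hex ID from a Notion database URL."""
    path = urlparse(url).path  # drops the ?v=... query string
    last_segment = path.rstrip("/").split("/")[-1]
    # Some URLs prefix the ID with the page title and hyphens, so strip
    # hyphens and take the final 32 hex characters.
    match = re.search(r"[0-9a-fA-F]{32}$", last_segment.replace("-", ""))
    if match is None:
        raise ValueError(f"No 32-character hex ID found in {url!r}")
    return match.group(0).lower()
```

For example, database_id_from_url("https://www.notion.so/acme/0123456789abcdef0123456789abcdef?v=1") returns the hex string before the question mark, ready to paste into NOTION_DATABASE_ID.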

Step 2: Fetch docs from Notion

import os
import requests
from dotenv import load_dotenv

load_dotenv()

NOTION_TOKEN = os.getenv("NOTION_TOKEN")
DATABASE_ID = os.getenv("NOTION_DATABASE_ID")

headers = {
    "Authorization": f"Bearer {NOTION_TOKEN}",
    "Notion-Version": "2022-06-28",
    "Content-Type": "application/json",
}

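def get_pages():
    # Query the database. Notion returns at most 100 pages per request.
    url = f"https://api.notion.com/v1/databases/{DATABASE_ID}/query"
    response = requests.post(url, headers=headers, json={}, timeout=30)
    response.raise_for_status()  # surface auth or sharing errors immediately
    return response.json()["results"]

# Block types whose text is worth carrying into the script.
TEXT_BLOCK_TYPES = (
    "paragraph", "heading_1", "heading_2", "heading_3",
    "bulleted_list_item", "numbered_list_item",
)

def get_page_content(page_id):
    # Read the page's top-level blocks and join their plain text.
    url = f"https://api.notion.com/v1/blocks/{page_id}/children"
    response = requests.get(url, headers=headers, params={"page_size": 100}, timeout=30)
    response.raise_for_status()
    blocks = response.json()["results"]
    text_parts = []
    for block in blocks:
        block_type = block["type"]
        if block_type in TEXT_BLOCK_TYPES:
            rich_texts = block[block_type].get("rich_text", [])
            text = "".join(rt["plain_text"] for rt in rich_texts)
            if text:
                text_parts.append(text)
    return "\n\n".join(text_parts)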

This fetches the pages in the database (up to 100 per request) and extracts the plain text from each page's top-level blocks. The Notion API returns content as blocks, not as a single text field, so nested content such as indented bullets or toggles is only reached by walking the block tree.
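For databases with more than 100 pages, or docs with nested blocks, the fetch needs cursor pagination and a recursive walk. The sketch below relies on the has_more, next_cursor, and has_children fields in Notion's responses. The helper names paginate and walk_blocks are hypothetical, and the HTTP call is passed in as a function so the logic can be tried without network access.

```python
from typing import Callable, Iterator, Optional

# A fetcher takes an optional cursor and returns one decoded list response.
Fetcher = Callable[[Optional[str]], dict]


def paginate(fetch: Fetcher) -> Iterator[dict]:
    """Yield every item from a cursor-paginated Notion list endpoint."""
    cursor = None
    while True:
        data = fetch(cursor)
        yield from data.get("results", [])
        cursor = data.get("next_cursor")
        # Stop when Notion reports no further pages (or gives no cursor).
        if not data.get("has_more") or not cursor:
            break


def walk_blocks(
    list_children: Callable[[str, Optional[str]], dict],
    block_id: str,
    depth: int = 0,
) -> Iterator[tuple[int, dict]]:
    """Depth-first walk yielding (depth, block) for a page or block."""
    for block in paginate(lambda cursor: list_children(block_id, cursor)):
        yield depth, block
        # Nested bullets, toggles, and similar blocks carry has_children.
        if block.get("has_children"):
            yield from walk_blocks(list_children, block["id"], depth + 1)
```

To wire this up, list_children would issue the same GET as get_page_content and pass the cursor as the start_cursor query parameter when it is set. For the database query, the fetcher would POST to the query URL with start_cursor in the JSON body.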

Step 3: Convert docs to video scripts with Claude

import anthropic

client = anthropic.Anthropic()

def doc_to_script(title, content):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # check Anthropic's model list for the current Sonnet ID
        max_tokens=1200,
        messages=[{
            "role": "user",
            "content": f"""Convert this internal documentation into a training video
script for an AI presenter. The script should:
- Be 200-400 words (roughly 2 minutes of spoken content)
- Use a conversational, direct tone
- Start with a one-sentence summary of what the viewer will learn
- Break complex procedures into numbered steps
- End with one key takeaway
- Use British English spelling

Document title: {title}

Document content:
{content}

Return only the script text, no stage directions or formatting."""
        }]
    )
    return response.content[0].text

Claude does well at compressing a 1500-word doc into a 300-word script without losing the essential steps. Review the first few scripts before generating the full batch. If the tone is too formal or too casual, adjust the prompt.
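A quick length check helps with that review pass. This is a minimal sketch with hypothetical helper names. The 200-400 word window comes from the prompt above, and the 150 words-per-minute speaking rate is an assumption derived from the prompt's "roughly 2 minutes" for a script of about 300 words.

```python
def script_stats(script, words_per_minute=150):
    """Return the word count and estimated spoken length of a script."""
    words = len(script.split())
    return {"words": words, "minutes": round(words / words_per_minute, 1)}


def needs_review(script, min_words=200, max_words=400):
    """Flag scripts that fall outside the length the prompt asked for."""
    words = len(script.split())
    return not (min_words <= words <= max_words)
```

Running needs_review on each script before calling Synthesia lets you hold back the outliers instead of spending video minutes on them.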

Step 4: Generate Synthesia videos

SYNTHESIA_API_KEY = os.getenv("SYNTHESIA_API_KEY")

def create_video(title, script):
    url = "https://api.synthesia.io/v2/videos"
    payload = {
        "title": title,
        "input": [{
            "avatarSettings": {
                "horizontalAlign": "center",
                "scale": 1.0,
                "style": "rectangular",
                "background_fit": False
            },
            "avatar": "anna_costume1_cameraA",
            "background": "white_studio",
            "scriptText": script,
        }],
        "visibility": "private",
    }
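    # Use a separate name so the Notion headers defined earlier are not shadowed.
    synthesia_headers = {
        "Authorization": SYNTHESIA_API_KEY,
        "Content-Type": "application/json",
    }
    response = requests.post(url, json=payload, headers=synthesia_headers, timeout=30)
    response.raise_for_status()  # fail loudly on auth, quota, or validation errors
    return response.json()["id"]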

Synthesia queues the video for rendering. A 2-minute video typically takes 5-10 minutes to render. Poll the video status endpoint until it returns "complete", then grab the download URL.
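The polling loop can be sketched as below. Treat the details as assumptions to confirm against Synthesia's API reference: the "download" field name, the set of failure statuses, and the helper name wait_for_video are not taken from the steps above. Only the "complete" status is. The HTTP call is passed in as fetch_video so the loop can be exercised without rendering anything.

```python
import time

# Statuses treated as terminal failures. This set is an assumption.
FAILED_STATUSES = {"error", "rejected", "deleted"}


def wait_for_video(video_id, fetch_video, interval_seconds=60,
                   timeout_seconds=1800, sleep=time.sleep,
                   clock=time.monotonic):
    """Poll fetch_video(video_id) until the video completes.

    fetch_video is any callable returning the decoded JSON for one video.
    Returns the value of its "download" field (assumed field name).
    """
    deadline = clock() + timeout_seconds
    while True:
        video = fetch_video(video_id)
        status = video.get("status")
        if status == "complete":
            return video.get("download")
        if status in FAILED_STATUSES:
            raise RuntimeError(f"Video {video_id} ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"Video {video_id} not ready after {timeout_seconds}s")
        sleep(interval_seconds)
```

In practice fetch_video would be a small function that sends a GET for the video ID with the same Authorization header as create_video and returns response.json().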

Step 5: Tie it together

def main():
    pages = get_pages()
    print(f"Found {len(pages)} docs to convert")

    for page in pages:
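        # Join all title fragments; assumes the title property is called "Name".
        title_parts = page["properties"]["Name"]["title"]
        title = "".join(part["plain_text"] for part in title_parts) or "Untitled"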
        content = get_page_content(page["id"])

        if len(content) < 100:
            print(f"  Skipping {title} (too short)")
            continue

        print(f"  Converting: {title}")
        script = doc_to_script(title, content)
        video_id = create_video(title, script)
        print(f"    Video queued: {video_id}")

if __name__ == "__main__":
    main()

Run this once to queue all videos. Then write a separate polling script that checks video status every 60 seconds and collects the download URLs as they complete.
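Re-running the script as written regenerates every video, which burns through video minutes. One way to limit a re-run to changed docs is sketched below, using the last_edited_time field that Notion includes on each page object. The state-file approach and the helper names are assumptions, not part of the script above.

```python
import json
from pathlib import Path


def load_state(path):
    """Load the page ID -> last_edited_time map from a previous run."""
    state_file = Path(path)
    return json.loads(state_file.read_text()) if state_file.exists() else {}


def pages_needing_update(pages, state):
    """Return pages that are new or edited since they were last recorded."""
    return [p for p in pages if state.get(p["id"]) != p["last_edited_time"]]


def record_done(state, page, path):
    """Remember the page's edit time once its video has been queued."""
    state[page["id"]] = page["last_edited_time"]
    Path(path).write_text(json.dumps(state, indent=2))
```

In main(), this would mean looping over pages_needing_update(get_pages(), state) and calling record_done after each successful create_video.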

Cost breakdown

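  • Synthesia Creator: roughly £45/month (includes 30 minutes of video, enough for about 10-15 videos of 2-3 minutes each, so a 40-doc backlog takes several months or a larger plan)
  • Claude API: well under £1 per batch of 40 docs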
  • Notion: free (API access included on all plans)
  • Total: about £45-50/month for ongoing use
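The arithmetic behind these figures can be sketched as follows. The token counts come from the prerequisites. The per-million-token prices passed in the usage note ($3 input, $15 output) are illustrative assumptions, so substitute the current rates for the model you use. The 300-word script at 150 words per minute reflects the prompt's target of roughly 2 minutes.

```python
def claude_batch_cost_usd(docs, input_tokens_per_doc, output_tokens_per_doc,
                          input_usd_per_mtok, output_usd_per_mtok):
    """Estimate the Claude API cost of one batch, in US dollars."""
    input_cost = docs * input_tokens_per_doc * input_usd_per_mtok / 1_000_000
    output_cost = docs * output_tokens_per_doc * output_usd_per_mtok / 1_000_000
    return input_cost + output_cost


def video_minutes(docs, words_per_script=300, words_per_minute=150):
    """Estimate the total video minutes a batch will consume."""
    return docs * words_per_script / words_per_minute
```

With those assumed prices, claude_batch_cost_usd(40, 2000, 800, 3.0, 15.0) comes to roughly $0.72, and video_minutes(40) gives 80.0, which is the number to compare against your plan's monthly allowance.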

Where the AI avatar falls flat

Synthesia's avatars are good enough for internal training. They are not good enough for customer-facing content at most companies. The lip sync is slightly off, the gestures are repetitive, and anyone who watches more than three videos in a row will notice the avatar's mannerisms repeat. For internal training this is a non-issue because employees care about the content, not the presenter. For external marketing or client-facing training, record yourself or hire a presenter. The uncanny valley is real and your clients will notice.

The other limitation is that Synthesia avatars cannot demonstrate screen interactions. If your training doc includes "click the Settings gear icon, then select Team Permissions", the avatar will say those words but cannot show the screen. For docs that are primarily procedural click-through guides, a screen recording with voiceover (using a tool like Descript) is a better format than an avatar-based video.
