Alchemy Recipe · Beginner · Workflow

Onboarding video creation with talking avatars

24 March 2026

Introduction

Creating onboarding videos manually is a grind. Your team records themselves on camera, edits footage, exports files, uploads them somewhere, and spends hours tweaking audio sync. By the time a new hire watches it, the information feels stale and the process feels outdated.

The good news: you can automate the entire thing. Generate a script from your documentation, convert it to realistic speech, animate a talking avatar, and deliver a polished video—all without touching a camera or editing software. The result is faster onboarding, consistent messaging, and a system that scales as your company grows.

This workflow uses three specialist tools: ElevenLabs for voice generation, HeyGen for avatar animation, and Hour One as a fallback video generation platform. We'll wire them together using an orchestration tool so that triggering the workflow happens once, and everything else runs automatically.

The Automated Workflow

The workflow operates in four stages: script generation, voice synthesis, avatar animation, and video delivery. Data flows from step to step without manual intervention.

Step 1: Trigger and Script Generation

Start with a webhook or scheduled trigger. You could trigger this manually via a button click, or on a schedule (e.g., whenever new documentation is published). For this example, we'll use a manual trigger that accepts input: the topic of the onboarding video, the company name, and the target audience.

If you're using n8n or Make, you set up an HTTP webhook node that listens for a POST request. The payload looks like this:

{
  "topic": "How to submit an expense report",
  "company_name": "Acme Corp",
  "audience": "New finance team members",
  "avatar_preference": "professional_woman"
}
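Whatever trigger you choose, it is worth validating the payload before spending API credits downstream. A minimal sketch in Python (the field names match the example payload above; avatar_preference is treated as optional):

```python
REQUIRED_FIELDS = ("topic", "company_name", "audience")

def validate_payload(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the payload is usable."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"missing or empty field: {field}")
    return errors
```

Run this immediately after the webhook node and route any non-empty result to a rejection response, so bad requests never reach the paid APIs.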

Next, call the Claude API to generate a script, using a system prompt that ensures the output is conversational, clear, and approximately 90 seconds long (roughly 225 words).

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500,
    "system": "You are a professional video scriptwriter. Create a clear, friendly script for an onboarding video. The script should be 85-95 seconds when spoken at a normal pace (approximately 220-240 words). Use conversational language. Avoid jargon. Include a warm greeting and a clear call to action at the end.",
    "messages": [
      {
        "role": "user",
        "content": "Write an onboarding video script for: Topic: How to submit an expense report. Company: Acme Corp. Audience: New finance team members."
      }
    ]
  }'

The response includes the generated script in the content[0].text field. Store this for the next step.
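If you call the API from code rather than curl, extracting and sanity-checking the script takes a few lines. A sketch, assuming the response body has already been parsed into a dict (the 220-240 word target comes from the system prompt above):

```python
def extract_script(response: dict) -> str:
    """Pull the generated script out of a Messages API response body (content[0].text)."""
    return response["content"][0]["text"]

def within_target_length(script: str, lo: int = 220, hi: int = 240) -> bool:
    """Check the script lands in the 85-95 second range at a normal speaking pace."""
    return lo <= len(script.split()) <= hi
```

If the length check fails, a simple retry with a "shorten this to N words" follow-up message usually fixes it.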

Step 2: Text-to-Speech with ElevenLabs

ElevenLabs converts your script into realistic, natural-sounding speech. You need an ElevenLabs API key and a chosen voice ID. ElevenLabs provides preset voices, or you can clone a voice from a sample audio file.

Call the ElevenLabs text-to-speech endpoint:

curl https://api.elevenlabs.io/v1/text-to-speech/{voice_id} \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "text": "Welcome to Acme Corp. In this video, you will learn how to submit an expense report...",
    "model_id": "eleven_monolingual_v1",
    "voice_settings": {
      "stability": 0.5,
      "similarity_boost": 0.75
    }
  }' \
  --output audio.mp3

Replace {voice_id} with an actual voice ID from ElevenLabs (for example, 21m00Tcm4TlvDq8ikWAM is a common preset). The response body is the MP3 audio itself; save it to disk (as above) or hold the base64-encoded content for the next step.

Key parameters:

  • stability controls how consistent the voice sounds; 0.5 is neutral.

  • similarity_boost makes the voice match the selected voice more closely; higher values (up to 1.0) increase similarity but can reduce naturalness.
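The same request can be assembled from code. This sketch only builds the URL and JSON body, mirroring the curl call above; it does not touch the network, so you can plug it into whatever HTTP client your orchestration tool uses:

```python
ELEVENLABS_BASE = "https://api.elevenlabs.io/v1/text-to-speech"

def build_tts_request(voice_id: str, text: str,
                      stability: float = 0.5,
                      similarity_boost: float = 0.75) -> tuple[str, dict]:
    """Return (url, json_body) for the ElevenLabs text-to-speech endpoint."""
    url = f"{ELEVENLABS_BASE}/{voice_id}"
    body = {
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return url, body
```

POST the body to the URL with the xi-api-key header and stream the response to audio.mp3. Keeping the builder separate from the HTTP call makes it trivial to tweak stability and similarity_boost without editing the workflow.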

Step 3: Avatar Animation with HeyGen

HeyGen takes your audio and creates a video of a talking avatar. You first upload the audio, then request video generation with avatar customisation options.

Start by uploading the audio file:

curl https://api.heygen.com/v1/video/upload_audio \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -F "audio=@audio.mp3"

The response includes an audio ID in the data.id field:

{
  "data": {
    "id": "aud_abc123xyz",
    "status": "ready"
  }
}

Now create the video. HeyGen provides a library of avatars; you select one by ID. Here's an example using a professional female avatar:

curl https://api.heygen.com/v1/video/generate \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "audio_id": "aud_abc123xyz",
    "avatar_id": "wayne-public",
    "avatar_style": "circle",
    "background": {
      "type": "color",
      "color": "#ffffff"
    }
  }'

The API returns a video_id. HeyGen processes videos asynchronously, so you need to poll the status endpoint until the video is ready:

curl https://api.heygen.com/v1/video/get_result \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "content-type: application/json" \
  -d '{"video_id": "vid_xyz789"}'

Check the status field. Once it reaches completed, the video_url field contains the finished video.
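The poll-until-ready logic is worth factoring out. A sketch in Python that takes the status check as a callable, so the HTTP request stays outside the loop and the whole thing is easy to test; the 10-second interval and 5-minute timeout are the values suggested in the Pro Tips below:

```python
import time
from typing import Callable

def poll_until_complete(fetch_status: Callable[[], dict],
                        interval: float = 10.0,
                        timeout: float = 300.0,
                        sleep=time.sleep,
                        clock=time.monotonic) -> dict:
    """Call fetch_status until status is 'completed' or 'failed', or raise on timeout.

    fetch_status should return the parsed JSON body from the get_result endpoint.
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if result.get("status") in ("completed", "failed"):
            return result
        if clock() >= deadline:
            raise TimeoutError("video generation did not finish in time")
        sleep(interval)
```

Injecting sleep and clock as parameters lets you exercise the loop in tests without waiting real seconds.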

Step 4: Storage and Delivery

Download the video from HeyGen and store it in your chosen location: AWS S3, Google Drive, Vimeo, or a dedicated video hosting service. If you use Zapier or Make, both platforms have native integrations with cloud storage services.

Example using AWS S3:

aws s3 cp heygen_video.mp4 s3://your-bucket/onboarding-videos/expense-report-$(date +%s).mp4

Generate a shareable link and store metadata (title, topic, date created, video URL) in a database or spreadsheet for tracking and reuse.

Full Workflow in n8n

Here's how the complete workflow looks in n8n, which is free and self-hosted:

  1. Webhook node (listens for POST request)
  2. Claude node (generates script from topic)
  3. ElevenLabs node (converts script to audio)
  4. HeyGen node (creates avatar video from audio)
  5. Polling node (checks video generation status)
  6. AWS S3 node (uploads final video)
  7. Airtable or Sheet node (logs video metadata)

The workflow is linear; each node waits for the previous one to complete. Error handling is critical: if ElevenLabs rejects the script (e.g., due to content policy), the workflow should email you rather than failing silently.

n8n handles this with a dedicated error workflow rather than an inline node: create a second workflow that starts with an Error Trigger node and ends in an email or Slack notification, then point the main workflow at it via Settings → Error Workflow. Whenever any node in the main workflow fails, n8n runs the error workflow with details of the failed execution, so failures surface as alerts instead of dying unnoticed.

The Manual Alternative

If you prefer to keep human hands in the process, you can execute each step individually. Use ElevenLabs' web interface to upload a script and generate audio. Use HeyGen's no-code dashboard to upload audio, select an avatar, and generate a video. Download the result and upload it to your hosting platform.

This approach works well if you want to review and approve the script before audio generation, or if you want to iterate on avatar selection. The trade-off is time: expect 15 to 20 minutes per video instead of under two minutes with automation.

Pro Tips

Rate limiting and quotas. Both ElevenLabs and HeyGen impose rate limits and monthly quotas on API usage: ElevenLabs caps characters of generated speech per month, and HeyGen caps minutes of generated video, with free tiers offering only small allowances (check each provider's current pricing page, as quotas change frequently). If you run this workflow for multiple employees, upgrade to a paid plan. Monitor usage through each provider's API or dashboard, and set alerts in your orchestration tool to pause the workflow if you approach monthly limits.

Audio quality and voice selection. ElevenLabs' eleven_multilingual_v2 model sounds more natural than v1 but consumes more characters. For onboarding videos aimed at international teams, use multilingual voices. Test voice samples before running the full workflow; save the voice ID in a variable so you can swap voices without editing the workflow.

Avatar selection. HeyGen offers roughly 100 avatars, but not all are suitable for professional content. Use avatars labelled "professional" or "business" for formal onboarding. Avoid avatars marked "entertainment." Diversity matters; rotate between avatars so your onboarding library doesn't feel repetitive.

Polling and timeout handling. HeyGen's video generation can take 30 seconds to 2 minutes. In your orchestration tool, set up a polling loop that checks the video status every 10 seconds, but give up after 5 minutes. This prevents the workflow from hanging indefinitely if HeyGen encounters an issue.

Cost optimisation. If you're generating many videos, cache successful audio files. If multiple onboarding topics share the same script, reuse the same audio file instead of calling ElevenLabs multiple times. Store audio file IDs in a lookup table keyed by script content hash.
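The lookup table described above can be as simple as a mapping keyed by a hash of the script text. A sketch, where the in-memory dict stands in for whatever store your orchestration tool provides (Airtable, a sheet, or a database table):

```python
import hashlib

audio_cache: dict[str, str] = {}  # script content hash -> stored audio file ID

def script_key(script: str) -> str:
    """Normalise whitespace before hashing so trivial edits don't defeat the cache."""
    normalised = " ".join(script.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def get_or_generate_audio(script: str, generate) -> str:
    """Return a cached audio ID, calling generate(script) only on a cache miss."""
    key = script_key(script)
    if key not in audio_cache:
        audio_cache[key] = generate(script)
    return audio_cache[key]
```

With this in place, re-running the workflow for an unchanged script costs nothing beyond the hash lookup.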

Cost Breakdown

| Tool | Plan needed | Monthly cost | Notes |
| --- | --- | --- | --- |
| ElevenLabs | Starter | £11 | 100,000 characters/month; covers roughly 70 videos at 250 words (~1,400 characters) each. |
| HeyGen | Standard | £29 | 120 minutes of video/month; covers roughly 60 videos at 2 minutes each. |
| Claude API | Pay-as-you-go | £0–5 | ~£0.003 per script (assuming 1,000 tokens). Covers script generation. |
| n8n | Self-hosted (free) or Cloud Pro | £0–20 | Self-hosted n8n is free if you run it on your own server; the cloud version is £20/month. |
| AWS S3 | Standard | £0.50–5 | Storage costs depend on video file size and request volume; budget about £1 per 100 videos. |
| Total | | £41–65 | Assumes modest volume (50–100 videos/month). Prices are per month. |

If you use Zapier or Make instead of n8n, orchestration costs rise to £20–100/month depending on task volume.

End result: a fully automated onboarding video production line. New hires get consistent, professional videos without your team touching a camera. Run the workflow once at the start of each week, and by week's end, your video library has grown by five or ten fresh pieces of content. Scale it further by triggering videos from updated documentation: whenever a process guide changes, a new onboarding video is queued for generation.