A patient arrives at your clinic with diabetes. The intake form gets printed, scanned, filed. Later, a staff member manually writes a video script about diabetes management, sends it to a video production tool, records narration separately, and stitches it all together. Three separate processes, three opportunities for error, three hours of labour per patient. What if that entire chain triggered automatically the moment the intake PDF was uploaded? The clinic could generate a personalised educational video before the patient leaves the waiting room, tailored to their specific condition and risk factors.

This workflow does exactly that. It reads patient intake forms, extracts the relevant medical information, generates a narrated video script, synthesises natural-sounding voiceover, and produces a finished video. No manual handoffs. No copy-pasting between systems. Just data moving through a pipeline that runs while you do other work.
The Automated Workflow
The orchestration backbone here is n8n, which handles HTTP requests natively and offers better healthcare integration options than Zapier. Make (Integromat) would also work, but n8n's node-based visual editor is clearer for medical workflows where audit trails matter. The flow follows this sequence:

1. Intake PDF arrives in a monitored folder or email inbox.
2. Chat With PDF by Copilot.us extracts patient condition, symptoms, and risk factors.
3. GPT-4.1 generates a personalised education script based on that extraction.
4. ElevenLabs Turbo v2.5 synthesises the voiceover.
5. Hour One produces the final video with a virtual presenter reading the script.
6. The completed video is saved to your clinic's file storage and linked to the patient record.

Here's how to configure this in n8n:
Step 1: Monitor for Incoming PDFs
Set up an n8n trigger that watches for new files. If using email, use the Gmail trigger. If using cloud storage, use Google Drive or Dropbox.
Trigger: Google Drive "New File"
Filter: name matches "*.pdf"
Folder: Clinic Intake Forms
Step 2: Extract Information with Chat With PDF
Chat With PDF by Copilot.us doesn't have native n8n integration, so you'll call it via HTTP. First, upload your PDF to their service and get the document ID, then use their API:
POST https://api.copilot.us/chat-with-pdf/query
Headers:
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Body:

```json
{
  "document_id": "{{ $json.document_id }}",
  "query": "Extract the patient's primary condition, symptoms, medication history, and any allergies mentioned in this intake form. Return as structured JSON.",
  "response_format": "json"
}
```
The response will look like:
```json
{
  "condition": "Type 2 Diabetes",
  "symptoms": ["increased thirst", "fatigue", "blurred vision"],
  "medications": ["Metformin 500mg"],
  "allergies": ["Penicillin"],
  "risk_factors": ["obesity", "sedentary lifestyle"]
}
```
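Before feeding the extraction into the script prompt, it helps to guard against missing fields. A minimal Python sketch; field names follow the sample response above, and `normalise_extraction` is an illustrative helper, not part of the Copilot.us API:

```python
import json

# Validate the Chat With PDF extraction before it reaches the script prompt.
REQUIRED_FIELDS = ["condition", "symptoms", "medications", "allergies"]

def normalise_extraction(raw: str) -> dict:
    """Parse the response and guarantee every required field exists."""
    data = json.loads(raw)
    for field in REQUIRED_FIELDS:
        data.setdefault(field, "unknown" if field == "condition" else [])
    # Pre-join list fields so prompt templating stays simple
    data["symptoms_text"] = ", ".join(data["symptoms"])
    return data

sample = ('{"condition": "Type 2 Diabetes", '
          '"symptoms": ["increased thirst", "fatigue"], '
          '"medications": ["Metformin 500mg"]}')
result = normalise_extraction(sample)
```

With this in place, a form that omits the allergies section yields an empty list rather than a template error downstream.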
Step 3: Generate Personalised Script with GPT-4.1
Pass that extracted data to OpenAI's API to generate a video script tailored to the patient's specific situation:
POST https://api.openai.com/v1/chat/completions
Headers:
Authorization: Bearer YOUR_OPENAI_KEY
Content-Type: application/json

Body:

```json
{
  "model": "gpt-4.1",
  "messages": [
    {
      "role": "system",
      "content": "You are a friendly healthcare educator creating a 90-second educational video script for a patient. Use clear, non-technical language. Address the patient directly. Include one piece of actionable advice."
    },
    {
      "role": "user",
      "content": "Create a video script for a patient with {{ $json.condition }}. Key symptoms: {{ $json.symptoms }}. Current medications: {{ $json.medications }}. Allergies: {{ $json.allergies }}. The script should be exactly 90 seconds when read aloud at a natural pace."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 300
}
```
Store the returned script in a variable for the next step.
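GPT-4.1 will not always hit the 90-second target exactly, so a quick length check before TTS saves a wasted ElevenLabs call. A sketch assuming roughly 150 spoken words per minute; the pace figure and tolerance are assumptions to tune, not values from any API above:

```python
# Rough sanity check that a script is near the 90-second target before TTS.
# Assumes a natural narration pace of about 150 words per minute.
WORDS_PER_MINUTE = 150

def estimated_seconds(script: str) -> float:
    return len(script.split()) / WORDS_PER_MINUTE * 60

def within_target(script: str, target: float = 90.0, tolerance: float = 15.0) -> bool:
    return abs(estimated_seconds(script) - target) <= tolerance

script = "word " * 225  # 225 words reads as ~90 seconds at 150 wpm
```

If a script falls outside the tolerance, loop back to the OpenAI call with an instruction to lengthen or shorten it.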
Step 4: Synthesise Voiceover with ElevenLabs
Convert the script to audio using ElevenLabs Turbo v2.5:
POST https://api.elevenlabs.io/v1/text-to-speech/{{ voice_id }}
Headers:
xi-api-key: YOUR_ELEVENLABS_KEY
Content-Type: application/json

Body:

```json
{
  "text": "{{ $json.script }}",
  "model_id": "eleven_turbo_v2_5",
  "voice_settings": { "stability": 0.5, "similarity_boost": 0.75 }
}
```
This returns a binary MP3 file. Save it temporarily to a variable or n8n's internal storage.
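n8n carries this as binary data on the item, but if you handle it in your own code, persisting the bytes to a temporary file looks roughly like this (the helper name and sample payload are illustrative):

```python
import os
import tempfile

# Sketch: write the raw MP3 body to a temp file so later steps can
# reference a path instead of carrying bytes around.
def save_audio(audio_bytes: bytes, suffix: str = ".mp3") -> str:
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as f:
        f.write(audio_bytes)
    return path

audio_path = save_audio(b"ID3 fake mp3 payload")
```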
Step 5: Generate Video with Hour One
Create the final video using Hour One's API, combining the script and voiceover:
POST https://api.hourone.com/api/video
Headers:
Authorization: Bearer YOUR_HOUR_ONE_KEY
Content-Type: application/json

Body:

```json
{
  "script": "{{ $json.script }}",
  "presenter": "female-presenter-1",
  "voice_url": "{{ $json.audio_url }}",
  "video_format": "1080p",
  "background": "clinic_neutral"
}
```
Hour One returns a video ID and a processing status. Use n8n's wait node to poll the status endpoint every 30 seconds until the video is ready:
GET https://api.hourone.com/api/video/{{ video_id }}/status
Headers:
Authorization: Bearer YOUR_HOUR_ONE_KEY
When status is "completed", the response includes a download URL.
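Written out in code, the poll-until-ready logic the Wait node implements looks roughly like this; `check_status` is a stand-in callable for the real status GET:

```python
import time

# Sketch of poll-until-ready with a hard timeout.
def poll_until_complete(check_status, interval=30.0, timeout=1200.0):
    """Return the final status, or 'timeout' if processing never finishes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = check_status()
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
    return "timeout"

# Fake status source that completes on the third poll:
responses = iter(["processing", "processing", "completed"])
final = poll_until_complete(lambda: next(responses), interval=0.0)
stuck = poll_until_complete(lambda: "processing", interval=0.0, timeout=0.0)
```

The hard timeout matters as much as the interval; the Pro Tips below cover what to do when it fires.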
Step 6: Store and Link the Video
Download the finished video and save it to your clinic's file storage (Google Drive, OneDrive, or S3). Then update your patient management system with the video URL:
POST https://your-clinic-system.com/api/patients/{{ patient_id }}/education_video
Headers:
Authorization: Bearer YOUR_CLINIC_API_KEY
Content-Type: application/json

Body:

```json
{
  "video_url": "{{ $json.hour_one_video_url }}",
  "condition": "{{ $json.condition }}",
  "created_at": "{{ now }}",
  "expires_at": null
}
```
At this point, the workflow is complete. A video exists, linked to the patient's record, ready to show during their appointment or send via patient portal.
The Manual Alternative
If you prefer finer control over each step, run this as a semi-automated sequence in n8n with manual approval nodes. After extraction, send the script to a staff member for review before generating audio. After the video is created, have a clinician verify it matches the patient's actual needs before linking it to the record. This adds 10-15 minutes of human time per patient but catches edge cases where GPT-4.1 misinterprets a patient's condition or ElevenLabs produces awkward phrasing. Alternatively, generate multiple script versions using GPT-4.1 mini (faster, cheaper) and let staff choose which one to use before video production.
Pro Tips
Monitor API rate limits aggressively.
ElevenLabs allows 10,000 characters per minute on Turbo v2.5.
If your clinic processes 20 patients in an hour with average scripts of 300 characters each, you're well within limits, but longer, more detailed scripts will hit the throttle quickly. Set n8n's retry-on-rate-limit behaviour to exponential backoff with a 60-second maximum wait.
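The retry schedule that setting produces can be sketched directly; the helper is illustrative (n8n computes this internally):

```python
# Exponential backoff capped at a 60-second maximum wait, as described above.
def backoff_delays(attempts, base=2.0, cap=60.0):
    return [min(base ** n, cap) for n in range(1, attempts + 1)]

delays = backoff_delays(7)  # doubles until it hits the 60-second cap
```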
Store intermediate outputs in case of failure.
If Hour One's API goes down mid-video, your script and audio are already generated. Save those to a database so you can retry video generation without re-running the entire pipeline. In n8n, persist them with a database node or the Google Sheets node.
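If you checkpoint to files rather than Sheets, a minimal sketch (the function names and record shape are illustrative):

```python
import json
import os
import tempfile

# Sketch: checkpoint script and audio path per patient so a Hour One
# failure only requires retrying video generation, not the whole pipeline.
def save_checkpoint(patient_id, script, audio_path, directory):
    record = {"patient_id": patient_id, "script": script, "audio_path": audio_path}
    path = os.path.join(directory, f"{patient_id}.json")
    with open(path, "w") as f:
        json.dump(record, f)
    return path

def load_checkpoint(path):
    with open(path) as f:
        return json.load(f)

tmp_dir = tempfile.mkdtemp()
ckpt = save_checkpoint("p001", "Welcome back, let's talk about...", "/tmp/p001.mp3", tmp_dir)
restored = load_checkpoint(ckpt)
```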
Validate the extracted condition before script generation.
Add a "Code" node that checks whether the extracted condition matches a whitelist of conditions your clinic treats. If it doesn't, pause the workflow and notify staff. This prevents videos being generated for patients with conditions outside your scope.

Benchmark alternative models for script generation.

Claude is a reasonable substitute for GPT-4.1 here; test both on your actual patient data for accuracy and tone. GPT-4.1 is faster and cheaper. Note that HIPAA compliance is a matter of contracts rather than model choice: whichever provider receives patient data needs to sign a Business Associate Agreement with your clinic, so confirm BAA availability before routing intake information through any API.
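The whitelist check described above is only a few lines in a Code node (n8n Code nodes run JavaScript, and newer versions also support Python). A Python sketch with an illustrative condition list:

```python
# Illustrative whitelist of conditions the clinic treats; replace with
# your own list. Matching is case-insensitive.
TREATED_CONDITIONS = {"type 2 diabetes", "hypertension", "asthma"}

def condition_in_scope(condition: str) -> bool:
    return condition.strip().lower() in TREATED_CONDITIONS

in_scope = condition_in_scope("Type 2 Diabetes")
out_of_scope = condition_in_scope("Rare Condition X")
```

Route the out-of-scope branch to a staff notification (Slack or email node) rather than silently dropping the patient.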
Set a timeout for Hour One video processing.
Videos occasionally fail silently. Set n8n to wait a maximum of 20 minutes for processing. If still pending after that, trigger a fallback: either queue the patient for manual video creation or send an alert to staff.
Cost Breakdown
| Tool | Plan Needed | Monthly Cost | Notes |
|---|---|---|---|
| n8n | Cloud Professional | £300 | 50 workflows, 200,000 executions/month; sufficient for small clinic |
| Chat With PDF by Copilot.us | Pro | £80 | Processes 500 documents/month; covers roughly 50 patients at 10 documents per patient |
| OpenAI (GPT-4.1) | Pay-as-you-go | £120 | ~300 script generations at 300 tokens per script, $0.015 per 1K input tokens |
| ElevenLabs Turbo v2.5 | Creator | £88 | 100,000 characters per month; covers ~330 voiceovers at 300 characters each |
| Hour One | Professional | £400 | 20 videos/month included; additional videos £15 each |
| Total | | £988 | Covers 50 patient workflows monthly with headroom |