Introduction
Generating medical imaging reports is one of those tasks that feels ripe for automation but remains stubbornly manual in most clinical settings. A radiologist reviews images, dictates findings, a transcriptionist types it up, then someone crafts a patient-friendly letter explaining what it all means. Each handoff introduces delays, transcription errors, and inconsistency in patient communication. The whole process can take days.
What if you could go from raw imaging data and a radiologist's voice dictation directly to a polished patient communication letter, with zero manual steps in between? That is precisely what this workflow does. You will combine voice-to-text AI, intelligent writing tools, and voice synthesis to create a fully automated pipeline that takes a dictation recording and produces both a clinical report and a personalised patient letter.
This workflow sits at the advanced end of complexity because it involves orchestrating multiple APIs, handling audio files, managing conditional logic, and maintaining quality across different output formats. But once it is running, it eliminates repetitive work and creates consistency that is difficult to achieve manually.
The Automated Workflow
The core idea is straightforward: capture the radiologist's voice dictation, convert it to structured report text, generate a patient-friendly summary letter, then produce an audio version for patients who prefer listening. Let us walk through how to build this with n8n, which gives you the most granular control over the workflow.
Architecture Overview
The workflow follows this sequence:
- A dictation audio file arrives (via email, API upload, or cloud storage).
- Resemble AI transcribes and analyses the audio.
- Hyperwrite takes the transcription and generates a properly formatted clinical report.
- Hyperwrite then creates a patient-friendly letter from the clinical report.
- ElevenLabs converts the patient letter to high-quality speech.
- All outputs are saved to a patient record system or sent via email.
This assumes you already have a mechanism to trigger the workflow (like an email webhook or a scheduled check of an upload folder). We will focus on the core orchestration.
Setting Up n8n
Start by creating a new workflow in n8n. You will need accounts with each service and their respective API keys stored as n8n credentials.
First, set up your credentials in n8n:
- Resemble AI: API key from your Resemble dashboard.
- Hyperwrite: API key (obtained during signup).
- ElevenLabs: API key from your account settings.
Store these keys as n8n credentials (or as environment variables on a self-hosted instance) rather than pasting them into node parameters, then reference them in each node.
Step 1: Trigger and Audio Input
Your workflow needs a trigger. Common options are:
- Webhook: A radiologist or technician uploads a file to a web form.
- Email: Dictation arrives as an email attachment.
- Cloud Storage: A new file appears in Google Drive or S3.
For this example, assume the workflow starts via a webhook that receives a JSON payload with a URL to the audio file:
{
  "audio_url": "https://storage.example.com/recording-12345.wav",
  "patient_id": "PAT-9876",
  "patient_name": "Jane Smith",
  "imaging_type": "Chest X-ray"
}
In n8n, create a "Webhook" trigger node that listens for POST requests.
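Before the payload goes any further, it is worth checking that the fields the later steps rely on are actually present. The following is a minimal sketch of such a check, assuming the four field names from the example payload above; `validate_payload` and `REQUIRED_FIELDS` are illustrative names, and the HTTPS requirement is an assumption rather than something the webhook enforces for you.

```python
from urllib.parse import urlparse

# Field names taken from the example payload; adjust to your own schema.
REQUIRED_FIELDS = ("audio_url", "patient_id", "patient_name", "imaging_type")


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload is usable."""
    problems = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if not str(payload.get(name) or "").strip()
    ]
    audio_url = str(payload.get("audio_url") or "")
    # Assumption: dictation audio should only ever be fetched over HTTPS.
    if audio_url and urlparse(audio_url).scheme != "https":
        problems.append("audio_url must use https")
    return problems
```

If the list is non-empty, the workflow can respond with an error instead of starting a transcription job it cannot finish.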
Step 2: Transcription with Resemble AI
Add a Resemble AI node to transcribe and analyse the audio. Resemble AI offers both speech-to-text and voice analysis capabilities; you will primarily use the transcription endpoint here.
The Resemble API endpoint for async transcription is:
POST https://api.resemble.ai/v2/transcriptions
In n8n, configure an HTTP request node (or use a custom node if available) with the following:
Method: POST
URL: https://api.resemble.ai/v2/transcriptions
Headers:
  Authorization: Bearer YOUR_RESEMBLE_API_KEY
  Content-Type: application/json
Body (JSON):
{
  "audio_url": "{{ $json.audio_url }}",
  "language_code": "en-GB",
  "transcription_format": "detailed"
}
Resemble will return a job ID. Store this and then poll for completion, or wait for a webhook callback if Resemble supports it (check your API docs). For simplicity, add a "Wait" node set to 10-15 seconds, then retrieve the transcription result:
GET https://api.resemble.ai/v2/transcriptions/{job_id}
Once the transcription is complete, n8n will have the full text output. Extract the transcription text and pass it downstream.
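A fixed wait works for short dictations, but a polling loop with a time budget is more forgiving. The sketch below shows the general pattern, assuming the job response carries a `status` field that eventually reads `"completed"` (the `"failed"` value is a guess; check the API documentation). The `fetch_status` callable is a placeholder for whatever makes the GET request.

```python
import time
from typing import Callable


def poll_until_complete(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,
    max_wait_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until it reports completion or the time budget runs out."""
    waited = 0.0
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":  # assumed status value
            raise RuntimeError("transcription job failed")
        if waited >= max_wait_s:
            raise TimeoutError(f"transcription not ready after {max_wait_s:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.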
Step 3: Generate Clinical Report with Hyperwrite
Now you have the raw transcription. Hyperwrite is designed for intelligent text generation; you will use it to structure the transcription into a proper medical report with appropriate sections (Findings, Impression, Clinical Recommendation, etc.).
Set up an HTTP node to call Hyperwrite's API:
POST https://api.hyperwrite.com/v1/generate
Configure the request as follows:
{
  "method": "POST",
  "url": "https://api.hyperwrite.com/v1/generate",
  "headers": {
    "Authorization": "Bearer YOUR_HYPERWRITE_API_KEY",
    "Content-Type": "application/json"
  },
  "body": {
    "prompt": "You are a medical report formatter. Convert the following radiologist dictation into a structured medical imaging report with clear sections for Technique, Findings, Impression, and Recommendations. Maintain clinical accuracy and use proper medical terminology.\n\nDictation:\n{{ $json.transcription }}\n\nImaging Type: {{ $json.imaging_type }}\n\nFormatted Report:",
    "max_tokens": 1000,
    "temperature": 0.3
  }
}
The low temperature (0.3) keeps the output more consistent and less prone to creative rewording, which suits a clinical document. Hyperwrite returns the generated report text. Save this as a variable for the next step and as a database record.
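If you assemble the request body in code rather than through expression fields, dictation text containing quotes or line breaks has to be escaped before it goes into JSON. A small sketch of one way to do that, reusing the prompt wording and parameters from the request above (`build_report_body` is an illustrative helper, not part of any vendor SDK):

```python
import json

REPORT_INSTRUCTIONS = (
    "You are a medical report formatter. Convert the following radiologist "
    "dictation into a structured medical imaging report with clear sections "
    "for Technique, Findings, Impression, and Recommendations. Maintain "
    "clinical accuracy and use proper medical terminology."
)


def build_report_body(transcription: str, imaging_type: str) -> str:
    """Return the JSON request body as a string, with the dictation safely escaped."""
    prompt = (
        f"{REPORT_INSTRUCTIONS}\n\nDictation:\n{transcription}\n\n"
        f"Imaging Type: {imaging_type}\n\nFormatted Report:"
    )
    # json.dumps handles quotes, newlines, and non-ASCII characters in the dictation.
    return json.dumps({"prompt": prompt, "max_tokens": 1000, "temperature": 0.3})
```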
Step 4: Generate Patient-Friendly Letter
With the clinical report now structured, generate a patient-friendly summary. This is crucial because patients should understand their results without medical jargon, but the content must remain accurate.
Add another Hyperwrite node:
{
  "method": "POST",
  "url": "https://api.hyperwrite.com/v1/generate",
  "headers": {
    "Authorization": "Bearer YOUR_HYPERWRITE_API_KEY",
    "Content-Type": "application/json"
  },
  "body": {
    "prompt": "You are a healthcare communication specialist. Convert the following medical imaging report into a clear, compassionate letter for the patient. Use simple language, avoid jargon, and reassure the patient while being honest. Include next steps or follow-up recommendations if applicable.\n\nClinical Report:\n{{ $json.clinical_report }}\n\nPatient Name: {{ $json.patient_name }}\n\nDear {{ $json.patient_name }},\n\nThank you for coming in for your {{ $json.imaging_type }}. Here is what we found:",
    "max_tokens": 800,
    "temperature": 0.4
  }
}
Store the resulting patient letter as a variable.
Step 5: Voice Synthesis with ElevenLabs
ElevenLabs provides high-quality text-to-speech. Convert the patient letter into an MP3 file that can be sent to the patient or made available for download.
Add an HTTP node for ElevenLabs:
POST https://api.elevenlabs.io/v1/text-to-speech/{voice_id}
First, determine which voice to use. You can list available voices:
GET https://api.elevenlabs.io/v1/voices
Then call the text-to-speech endpoint:
{
  "method": "POST",
  "url": "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM",
  "headers": {
    "xi-api-key": "YOUR_ELEVENLABS_API_KEY",
    "Content-Type": "application/json"
  },
  "body": {
    "text": "{{ $json.patient_letter }}",
    "model_id": "eleven_monolingual_v1",
    "voice_settings": {
      "stability": 0.5,
      "similarity_boost": 0.75
    }
  }
}
The response will be audio data. Save this to cloud storage (AWS S3, Google Cloud Storage, or similar) and generate a signed URL that expires in 7 days so the patient can download it.
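In practice your storage provider's SDK generates signed URLs for you, so you would not write this yourself. Purely to illustrate what "signed and expiring" means, here is a self-contained sketch using an HMAC over the object key and expiry time. The URL layout, parameter names, and both function names are invented for this example and do not match any provider's real scheme.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SEVEN_DAYS_S = 7 * 24 * 60 * 60


def _signature(object_key: str, expires: int, secret: bytes) -> str:
    message = f"{object_key}:{expires}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_download_url(base_url, object_key, secret, now=None, ttl_s=SEVEN_DAYS_S):
    """Build a link that carries its own expiry time and a tamper-evident signature."""
    expires = int(time.time() if now is None else now) + ttl_s
    query = urlencode({"expires": expires, "signature": _signature(object_key, expires, secret)})
    return f"{base_url}/{object_key}?{query}"


def is_link_valid(object_key, expires, signature, secret, now=None):
    """Accept the link only if it has not expired and the signature matches."""
    current = int(time.time() if now is None else now)
    expected = _signature(object_key, expires, secret)
    return current <= expires and hmac.compare_digest(expected, signature)
```

The point of the design is that the server can verify a link without storing it: changing the file name or the expiry invalidates the signature.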
Step 6: Save Outputs and Notify
Finally, save both the clinical report and patient letter to your patient record system. This might involve:
- Database update: Insert or update a record with the patient ID, clinical report, and letter content.
- Email notification: Send the patient letter via email, optionally attaching the audio file or including a link to download it.
- Logging: Store metadata about the workflow execution (timestamps, API response times, any errors encountered).
Add an n8n node to send email:
{
  "method": "POST",
  "headers": {
    "Authorization": "Bearer YOUR_EMAIL_SERVICE_API_KEY"
  },
  "body": {
    "to": "{{ $json.patient_email }}",
    "subject": "Your {{ $json.imaging_type }} Results",
    "html": "<h2>{{ $json.patient_name }}, Your Test Results</h2>\n{{ $json.patient_letter }}\n<p><a href='{{ $json.audio_download_url }}'>Listen to your results</a></p>",
    "attachments": []
  }
}
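Because the letter is generated text dropped into an HTML body, it helps to escape it and turn blank-line breaks into paragraphs before sending. A minimal sketch, assuming the letter is plain text with paragraphs separated by blank lines (`build_email_html` is an illustrative name); it also covers the case where no audio link is available.

```python
from html import escape


def build_email_html(patient_name: str, patient_letter: str, audio_download_url: str = "") -> str:
    """Render the letter as escaped HTML paragraphs, with an optional audio link."""
    paragraphs = [p.strip() for p in patient_letter.split("\n\n") if p.strip()]
    parts = [f"<h2>{escape(patient_name)}, Your Test Results</h2>"]
    parts.extend(f"<p>{escape(p)}</p>" for p in paragraphs)
    if audio_download_url:
        link = escape(audio_download_url, quote=True)
        parts.append(f'<p><a href="{link}">Listen to your results</a></p>')
    else:
        # Fallback wording is an assumption; use whatever your clinic prefers.
        parts.append("<p>An audio version is not available for this letter.</p>")
    return "\n".join(parts)
```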
Conditional Logic and Error Handling
In reality, this workflow needs branching:
- If the transcription is too short or unclear, flag it for manual review rather than proceeding.
- If Hyperwrite detects that the generated report is missing critical sections, prompt the radiologist to re-record.
- If ElevenLabs fails, still send the written letter but note that audio is unavailable.
In n8n, use "Switch" nodes to implement these conditions:
IF transcription_length < 50 words
  THEN send alert to radiologist
  ELSE continue to Step 3

IF clinical_report contains all required sections (Findings, Impression)
  THEN continue
  ELSE create manual review task
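The same branching can be expressed as small pure functions, which makes the thresholds easy to test before wiring them into nodes. This is a sketch under two assumptions: section headings start at the beginning of a line in the generated report, and the route names (`"continue"`, `"alert_radiologist"`, `"manual_review"`) are placeholders for whatever your branches are called.

```python
import re

REQUIRED_SECTIONS = ("Findings", "Impression")
MIN_WORDS = 50


def route_transcription(transcription: str) -> str:
    """Short transcriptions go back to the radiologist instead of on to Step 3."""
    return "continue" if len(transcription.split()) >= MIN_WORDS else "alert_radiologist"


def missing_sections(clinical_report: str) -> list[str]:
    """List required headings that do not appear at the start of any line."""
    flags = re.IGNORECASE | re.MULTILINE
    return [
        section
        for section in REQUIRED_SECTIONS
        if not re.search(rf"^\s*{re.escape(section)}\b", clinical_report, flags)
    ]


def route_report(clinical_report: str) -> str:
    return "continue" if not missing_sections(clinical_report) else "manual_review"
```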
The Manual Alternative
If you prefer not to build this in n8n, you can use Make (Integromat) or Zapier instead, though they have more limited support for complex audio handling. Alternatively, you could orchestrate this using Claude Code, which allows you to write Python scripts that call these APIs sequentially.
A Claude Code approach might look like this:
import os
import time

import requests

# API keys come from environment variables (the variable names are assumptions).
RESEMBLE_API_KEY = os.environ["RESEMBLE_API_KEY"]
HYPERWRITE_API_KEY = os.environ["HYPERWRITE_API_KEY"]
ELEVENLABS_API_KEY = os.environ["ELEVENLABS_API_KEY"]


def orchestrate_medical_report(audio_url, patient_id, patient_name, imaging_type):
    # Step 1: Transcribe
    resemble_headers = {"Authorization": f"Bearer {RESEMBLE_API_KEY}"}
    resemble_response = requests.post(
        "https://api.resemble.ai/v2/transcriptions",
        headers=resemble_headers,
        json={"audio_url": audio_url, "language_code": "en-GB"},
        timeout=30,
    )
    resemble_response.raise_for_status()
    job_id = resemble_response.json()["uuid"]

    # Poll for completion: up to 30 attempts, 5 seconds apart
    for _ in range(30):
        result = requests.get(
            f"https://api.resemble.ai/v2/transcriptions/{job_id}",
            headers=resemble_headers,
            timeout=30,
        )
        result.raise_for_status()
        job = result.json()
        if job["status"] == "completed":
            transcription = job["text"]
            break
        time.sleep(5)
    else:
        # No completed job after all attempts: stop rather than continue without text
        raise TimeoutError(f"Transcription {job_id} did not complete in time")

    # Step 2: Generate clinical report
    hyperwrite_headers = {"Authorization": f"Bearer {HYPERWRITE_API_KEY}"}
    report_response = requests.post(
        "https://api.hyperwrite.com/v1/generate",
        headers=hyperwrite_headers,
        json={
            "prompt": f"Format this medical dictation into a report:\n{transcription}",
            "max_tokens": 1000,
        },
        timeout=60,
    )
    report_response.raise_for_status()
    clinical_report = report_response.json()["text"]

    # Step 3: Generate patient letter
    letter_response = requests.post(
        "https://api.hyperwrite.com/v1/generate",
        headers=hyperwrite_headers,
        json={
            "prompt": f"Convert to patient-friendly letter:\n{clinical_report}",
            "max_tokens": 800,
        },
        timeout=60,
    )
    letter_response.raise_for_status()
    patient_letter = letter_response.json()["text"]

    # Step 4: Synthesise voice (the response body is the raw audio)
    voice_response = requests.post(
        "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM",
        headers={"xi-api-key": ELEVENLABS_API_KEY},
        json={"text": patient_letter, "model_id": "eleven_monolingual_v1"},
        timeout=120,
    )
    voice_response.raise_for_status()
    audio_file = voice_response.content

    return {
        "clinical_report": clinical_report,
        "patient_letter": patient_letter,
        "audio": audio_file,
    }
This approach gives you direct control but requires a server to run on.
Pro Tips
1. Handle Transcription Quality Issues Early
Resemble AI is reliable, but background noise or poor audio quality will degrade transcription. Before passing the transcription to Hyperwrite, validate it: check that it is longer than 50 words, contains medical terminology consistent with the imaging type, and does not have obvious garbling. If any check fails, trigger a manual review step instead of generating a report.
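These three checks can be sketched as one function that returns a list of issues. The keyword list per imaging type is entirely hypothetical and would need curating by clinical staff, and the "no vowels" test is only a crude stand-in for real garbling detection.

```python
import re

# Hypothetical keyword lists; a real deployment would curate these per modality.
EXPECTED_TERMS = {
    "Chest X-ray": {"lung", "lungs", "cardiac", "pleural", "mediastinum", "effusion"},
}


def transcription_issues(transcription: str, imaging_type: str, min_words: int = 50) -> list[str]:
    """Return the names of any failed checks; an empty list means proceed."""
    words = re.findall(r"[A-Za-z']+", transcription.lower())
    issues = []
    if len(words) < min_words:
        issues.append("too short")
    expected = EXPECTED_TERMS.get(imaging_type, set())
    if expected and not expected.intersection(words):
        issues.append("no expected terminology")
    # Crude garbling signal: more than 10% of tokens are longer words with no vowels.
    vowelless = [w for w in words if len(w) > 3 and not re.search(r"[aeiouy]", w)]
    if words and len(vowelless) / len(words) > 0.1:
        issues.append("possible garbling")
    return issues
```

Any non-empty result would route the dictation to manual review rather than to report generation.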
2. Version Your Prompts
The Hyperwrite prompts you use for report generation and patient communication matter enormously. Store them as n8n variables so you can iterate without modifying the workflow itself. Track which prompt version generated which report so you can correlate output quality with prompt changes.
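One lightweight way to track prompt versions is to attach a version label and a content fingerprint to every generated output. The sketch below assumes prompts are stored as Python format strings; `PromptVersion` and `tag_output` are illustrative names, not n8n features.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    name: str
    version: str
    template: str

    @property
    def fingerprint(self) -> str:
        """Short hash of the template, so silent edits to a version are detectable."""
        return hashlib.sha256(self.template.encode("utf-8")).hexdigest()[:12]

    def render(self, **fields: str) -> str:
        return self.template.format(**fields)


def tag_output(prompt: PromptVersion, output_text: str) -> dict:
    """Bundle a generated text with the prompt metadata that produced it."""
    return {
        "prompt_name": prompt.name,
        "prompt_version": prompt.version,
        "prompt_fingerprint": prompt.fingerprint,
        "output": output_text,
    }
```

Storing the tagged record alongside each report makes it possible to compare review outcomes across prompt versions later.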
3. Rate Limiting and Costs
ElevenLabs charges per character synthesised. A typical patient letter is 300-400 words; at ElevenLabs' standard pricing, that is roughly 1-2 cents per letter. Across thousands of patients, this adds up. Consider synthesising audio only on request (i.e., when a patient opts in) rather than automatically for every report.
Resemble AI charges per minute of audio. A typical dictation is 2-5 minutes; factor this into your monthly budget.
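A quick way to budget is to plug your own volumes and your vendors' current rates into a small calculator. The function below is a sketch of that arithmetic; every rate in the usage example is a made-up placeholder, not actual ElevenLabs or Resemble AI pricing.

```python
def estimate_monthly_cost(
    reports_per_month: int,
    avg_letter_chars: int,
    avg_dictation_minutes: float,
    tts_rate_per_1k_chars: float,
    transcription_rate_per_minute: float,
    audio_opt_in_rate: float = 1.0,
) -> dict:
    """Estimate speech costs; audio_opt_in_rate is the share of patients who want audio."""
    tts = reports_per_month * audio_opt_in_rate * (avg_letter_chars / 1000) * tts_rate_per_1k_chars
    transcription = reports_per_month * avg_dictation_minutes * transcription_rate_per_minute
    return {
        "tts": round(tts, 2),
        "transcription": round(transcription, 2),
        "total": round(tts + transcription, 2),
    }


# Placeholder rates for illustration only; substitute figures from your own invoices.
example = estimate_monthly_cost(
    reports_per_month=100,
    avg_letter_chars=2000,
    avg_dictation_minutes=4,
    tts_rate_per_1k_chars=0.10,
    transcription_rate_per_minute=0.05,
    audio_opt_in_rate=0.5,
)
```

Lowering `audio_opt_in_rate` shows directly how much on-request synthesis would save compared with generating audio for every report.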
4. Implement Retry Logic
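All three APIs can occasionally fail or time out. In n8n, enable the "Retry On Fail" setting on each HTTP Request node and set it to retry up to 3 times with a wait between tries; if you want true exponential backoff, you will likely need to build the loop yourself, since the built-in setting uses a fixed wait. For Resemble's async transcription, implement a maximum wait time (e.g., 60 seconds) before giving up and flagging the workflow as failed.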
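For reference, this is roughly what retrying with exponential backoff looks like in code. It is a generic sketch rather than anything n8n provides: `call` stands in for any of the API requests, and the delays double after each failed attempt (1 s, 2 s, 4 s with the defaults).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on failure wait base_delay_s * 2**attempt and try again."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error to the caller
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

In a real workflow you would narrow the `except` clause to timeouts and 5xx-style errors, so that a bad request or an invalid API key fails immediately instead of being retried.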
5. Audit and Compliance
Medical data is sensitive. Ensure that:
- All API calls happen over HTTPS.
- API keys are never logged or exposed in error messages.
- Patient letters are encrypted at rest in your database.
- Your n8n instance is deployed on a secure, HIPAA-compliant server (not the free cloud version).
- You maintain logs of who generated which reports and when, for compliance audits.
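As one concrete example from the list above, keeping API keys out of logs can be handled by scrubbing messages before they are written. The patterns below are a sketch that only covers the two header styles used in this workflow (`Authorization: Bearer ...` and `xi-api-key`); other secret formats would need their own patterns.

```python
import re

_SECRET_PATTERNS = [
    # "Bearer <token>" as used by the Authorization header
    re.compile(r"(Bearer\s+)[A-Za-z0-9._\-]+", re.IGNORECASE),
    # xi-api-key header, whether logged as a dict entry or as "key: value"
    re.compile(r"(xi-api-key['\"]?\s*[:=]\s*['\"]?)[A-Za-z0-9._\-]+", re.IGNORECASE),
]


def redact_secrets(message: str) -> str:
    """Replace anything that looks like an API key with a fixed marker."""
    for pattern in _SECRET_PATTERNS:
        message = pattern.sub(r"\1[REDACTED]", message)
    return message
```

Running every error message through a function like this before logging means a failed request cannot leak the key that was used to make it.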
6. Monitor Output Quality
Hyperwrite is intelligent but not infallible. Occasionally, it may generate a report missing a key section or a patient letter with confusing phrasing. Build in a quality gate: have a human (perhaps a senior radiologist) manually review a sample of outputs weekly. Track error rates and adjust prompts accordingly.
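The weekly review can be made repeatable by drawing the sample with a fixed seed and recording the outcome as a simple error rate. A minimal sketch, where the report IDs, the seed convention, and both function names are assumptions for illustration:

```python
import random


def pick_review_sample(report_ids: list[str], sample_size: int, seed: int) -> list[str]:
    """Draw a reproducible random sample of reports for human review."""
    rng = random.Random(seed)  # a fixed seed lets an auditor re-derive the same sample
    return rng.sample(report_ids, min(sample_size, len(report_ids)))


def error_rate(review_results: dict[str, bool]) -> float:
    """review_results maps report ID to True when the reviewer found a problem."""
    if not review_results:
        return 0.0
    return sum(review_results.values()) / len(review_results)
```

Using something like the ISO week number as the seed gives a different sample each week while keeping each week's selection reproducible.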
Cost Breakdown
| Tool | Plan Needed | Monthly Cost | Notes |
|---|---|---|---|
| Resemble AI | Standard | £50–150 | Based on transcription minutes; assume 50–150 dictations/month at 3–5 min each. |
| Hyperwrite | API Access | £30–80 | Pricing varies by token usage; estimate 500k–1.2M tokens/month for report + letter generation. |
| ElevenLabs | Starter or Pro | £20–100 | At ~1 cent per letter, costs scale with volume. Pro plan ($99/month) covers ~10k letters. |
| n8n | Self-hosted or Cloud | £0–50 | Self-hosted on your server is free; cloud "Team" plan is £40/month for enterprise features. |
| Storage (AWS S3 or similar) | Standard | £5–15 | For archiving audio files and PDFs; minimal if you store only audio downloads temporarily. |
| Total | | £105–395 | Assumes 50–150 workflows/month with moderate token and audio usage. Scales linearly. |
For a high-volume clinic running 500+ reports monthly, costs will trend toward the upper range; for a small practice doing 20–30 reports monthly, expect the lower end. Calculate your own scenario based on actual API usage from pilot runs.