Introduction
Medical researchers face a genuine problem: keeping pace with published literature whilst extracting actionable clinical insights. A single research group might need to digest dozens of papers weekly, yet manually reading, summarizing, and cross-referencing findings is labour-intensive and error-prone. You end up with PDFs scattered across folders, highlights that fade from memory, and insights that never make it into your clinical practice or grant applications.
What if you could ingest a medical research paper, automatically extract its key findings, identify clinical implications, and have structured data ready for your EHR or research database, all without opening a single PDF yourself? This workflow combines three specialised AI tools with an orchestration layer to create a fully automated pipeline. A paper lands in your inbox or a shared folder; minutes later, you have a structured summary, concept analysis, and clinical recommendations.
The challenge here is not finding tools that can do each step individually. The challenge is connecting them so data flows smoothly from summarisation to interpretation to storage. That is where this Alchemy workflow excels. You will need some familiarity with API calls and webhook configuration, but the result justifies the technical lift: a medical research intelligence system that runs itself.
The Automated Workflow
We will use n8n as our orchestration engine because it offers visual workflow building with strong support for HTTP requests, file handling, and conditional logic. Make and Zapier could work here too, but n8n gives you more granular control over data transformation between tools without hitting API call limits as quickly.
Architecture Overview
The workflow follows this sequence:
1. Trigger: A PDF lands in a folder (or email attachment arrives).
2. Summarisation: ai-pdf-summarizer-by-pdf-guru-ai extracts a high-level summary and key points.
3. Deep Analysis: ExplainPaper breaks down methodology, results, and clinical relevance.
4. Clinical Extraction: Terrakotta AI identifies specific clinical insights, contraindications, and patient populations affected.
5. Storage: Structured data flows into a database or clinical notes system.
The critical design choice: we do not manually copy data between tools. Each tool's output becomes the next tool's input through API calls orchestrated by n8n.
Setting Up n8n
First, install n8n locally or use their cloud service. You will need API keys for each tool.
Step 1: Create your n8n workflow
Log in to n8n and create a new workflow. Name it something like "Medical Paper Intelligence Pipeline". You will see a canvas where you can add nodes.
Step 2: Trigger node
Add a Webhook node as your trigger. This allows PDFs or metadata to initiate the workflow:
POST /webhook/medical-papers
Configure it to accept file URLs or base64-encoded PDFs. The payload should look like this:
{
"paper_title": "Novel approach to NAFLD treatment",
"pdf_url": "https://example.com/papers/nafld-2024.pdf",
"source": "PubMed",
"received_date": "2024-01-15"
}
If you are using email triggers, use n8n's Gmail node to watch for new attachments with specific labels (e.g., "Medical Papers"). Extract the attachment and convert it to base64.
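If you want to test the trigger before wiring up any upstream source, you can post a sample payload to the webhook yourself. A minimal Python sketch using only the standard library; the helper names (`build_payload`, `send_paper`) and the localhost URL are illustrative assumptions, not part of n8n itself:

```python
import json
from urllib import request

# Assumption: n8n running locally on its default port, with the webhook path above
N8N_WEBHOOK_URL = "http://localhost:5678/webhook/medical-papers"

def build_payload(paper_title, pdf_url, source, received_date):
    """Assemble the JSON body the webhook trigger expects."""
    return {
        "paper_title": paper_title,
        "pdf_url": pdf_url,
        "source": source,
        "received_date": received_date,
    }

def send_paper(payload):
    """POST the payload to the webhook (needs a running n8n instance)."""
    req = request.Request(
        N8N_WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.status

payload = build_payload(
    "Novel approach to NAFLD treatment",
    "https://example.com/papers/nafld-2024.pdf",
    "PubMed",
    "2024-01-15",
)
# send_paper(payload)  # uncomment once the workflow is live
```

Posting this payload from the command line is a quick way to confirm the webhook node fires before you connect a folder watcher or Gmail trigger.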
Node 1: PDF Summarisation
Add an HTTP Request node to call the ai-pdf-summarizer-by-pdf-guru-ai API. This tool requires your PDF as input and returns bullet-point summaries and key findings.
API Endpoint:
POST https://api.pdfguru.ai/v1/summarize
Headers:
Authorization: Bearer YOUR_PDF_GURU_API_KEY
Content-Type: application/json
Request Body:
{
"pdf_url": "{{ $json.pdf_url }}",
"summary_length": "medium",
"focus": "clinical"
}
The variable {{ $json.pdf_url }} pulls the URL from your webhook trigger. Set the summary length to medium; this gives you detail without overwhelming the next step.
Response Handling:
The API returns something like this:
{
"summary": "This randomised controlled trial of 342 participants compared...",
"key_findings": [
"Primary outcome met with 68% response rate (p < 0.001)",
"Adverse events comparable to placebo",
"Effect sustained at 12-month follow-up"
],
"study_type": "RCT",
"sample_size": 342,
"duration": "12 months"
}
Store this output in a variable called summaryResult. In n8n, use the Set node to preserve this data for downstream steps.
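Downstream nodes break noisily if an optional field is missing from the response, so it is worth normalising the output first. A small defensive-parsing sketch (the function name `parse_summary_response` is my own; field names come from the example response above, and which fields are optional is an assumption):

```python
def parse_summary_response(resp):
    """Normalise the summariser output, tolerating optional fields
    that may be absent on some papers."""
    return {
        "summary": resp.get("summary", ""),
        "key_findings": resp.get("key_findings", []),
        "study_type": resp.get("study_type", "unknown"),
        "sample_size": resp.get("sample_size"),   # None if not reported
        "duration": resp.get("duration"),
    }
```

In n8n you would express the same defaults in the Set node's expressions; the point is that every downstream reference gets a predictable shape.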
Node 2: Methodology and Clinical Context Analysis
Now feed the summary into ExplainPaper's API. This tool is particularly good at parsing research methodology and translating statistical findings into clinical language.
API Endpoint:
POST https://api.explainpaper.com/v1/analyse
Headers:
Authorization: Bearer YOUR_EXPLAINPAPER_API_KEY
Content-Type: application/json
Request Body:
{
"paper_title": "{{ $json.paper_title }}",
"summary": "{{ $node.HTTPRequest1.json.summary }}",
"key_findings": "{{ $node.HTTPRequest1.json.key_findings }}",
"analysis_type": "clinical_interpretation"
}
Note that we are referencing the output from the first HTTP node. ExplainPaper will return:
{
"methodology_explanation": "The authors used a double-blind design with stratified randomisation...",
"statistical_significance": "The p-value of 0.001 indicates high confidence in the result.",
"clinical_implications": [
"May be suitable for patients with moderate-to-severe disease",
"Consider drug interactions with common comorbidity treatments",
"Cost-benefit analysis suggests broader adoption is warranted"
],
"limitations": [
"Single-centre study limits generalisability",
"Follow-up period of 12 months may be insufficient for long-term safety",
"Exclusion criteria may limit applicability to real-world populations"
]
}
Again, store this as analysisResult.
Node 3: Clinical Insight and Patient Population Extraction
This is where Terrakotta AI shines. It identifies specific patient phenotypes, contraindications, and actionable clinical recommendations.
API Endpoint:
POST https://api.terrakotta.ai/v1/clinical-extract
Headers:
Authorization: Bearer YOUR_TERRAKOTTA_API_KEY
Content-Type: application/json
Accept: application/json
Request Body:
{
"paper_summary": "{{ $node.HTTPRequest1.json.summary }}",
"clinical_implications": "{{ $node.HTTPRequest2.json.clinical_implications }}",
"limitations": "{{ $node.HTTPRequest2.json.limitations }}",
"extract_criteria": [
"patient_populations",
"contraindications",
"drug_interactions",
"monitoring_parameters",
"alternative_treatments"
]
}
Terrakotta's response structure looks like this:
{
"target_populations": [
{
"population": "Adults 18-75 with BMI > 25 and elevated transaminases",
"suitability": "high",
"confidence": 0.92
}
],
"contraindications": [
{
"condition": "Pregnancy",
"severity": "absolute"
},
{
"condition": "Severe renal impairment (eGFR < 30)",
"severity": "absolute"
}
],
"drug_interactions": [
{
"drug": "Warfarin",
"interaction": "May increase INR; monitor closely",
"management": "Check INR at baseline and 1 week"
}
],
"monitoring": [
"Liver function tests at weeks 4, 12, 24",
"Full blood count at baseline and 12 weeks"
],
"recommendations": "Consider for second-line therapy in non-responders to standard treatment"
}
Node 4: Data Consolidation and Formatting
Add a Set node to combine all three outputs into a single structured document. This is crucial; without it, you have isolated data rather than an integrated report.
{
"paper_metadata": {
"title": "{{ $json.paper_title }}",
"source": "{{ $json.source }}",
"received_date": "{{ $json.received_date }}"
},
"summary": "{{ $node.HTTPRequest1.json.summary }}",
"key_findings": "{{ $node.HTTPRequest1.json.key_findings }}",
"clinical_context": {
"methodology": "{{ $node.HTTPRequest2.json.methodology_explanation }}",
"implications": "{{ $node.HTTPRequest2.json.clinical_implications }}",
"limitations": "{{ $node.HTTPRequest2.json.limitations }}"
},
"clinical_extraction": {
"target_populations": "{{ $node.HTTPRequest3.json.target_populations }}",
"contraindications": "{{ $node.HTTPRequest3.json.contraindications }}",
"drug_interactions": "{{ $node.HTTPRequest3.json.drug_interactions }}",
"monitoring": "{{ $node.HTTPRequest3.json.monitoring }}",
"recommendations": "{{ $node.HTTPRequest3.json.recommendations }}"
},
"processing_timestamp": "{{ new Date().toISOString() }}"
}
Node 5: Storage and Distribution
Finally, save the consolidated result. You have several options depending on your infrastructure:
Option A: Save to Database
Add a SQL node to insert the structured JSON into a PostgreSQL database:
INSERT INTO medical_papers (
paper_title,
paper_source,
summary_json,
clinical_extraction_json,
created_at
) VALUES (
:title,
:source,
:summary_data,
:clinical_data,
NOW()
)
RETURNING id;
Map the n8n variables:
:title = {{ $json.paper_metadata.title }}
:source = {{ $json.paper_metadata.source }}
:summary_data = {{ JSON.stringify($json.summary) }}
:clinical_data = {{ JSON.stringify($json.clinical_extraction) }}
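If you want to test the insert logic locally before pointing n8n at a real database, a minimal sketch using Python's stdlib sqlite3 as a stand-in for PostgreSQL (column names from the query above; note SQLite uses `?` placeholders where a Postgres driver such as psycopg2 uses `%s`):

```python
import json
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL table
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE medical_papers (
        id INTEGER PRIMARY KEY,
        paper_title TEXT,
        paper_source TEXT,
        summary_json TEXT,
        clinical_extraction_json TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

doc = {
    "paper_metadata": {"title": "Novel approach to NAFLD treatment", "source": "PubMed"},
    "summary": "This randomised controlled trial of 342 participants compared...",
    "clinical_extraction": {"recommendations": "Consider for second-line therapy"},
}

cur = conn.execute(
    "INSERT INTO medical_papers "
    "(paper_title, paper_source, summary_json, clinical_extraction_json) "
    "VALUES (?, ?, ?, ?)",
    (
        doc["paper_metadata"]["title"],
        doc["paper_metadata"]["source"],
        json.dumps(doc["summary"]),            # serialise nested JSON for storage
        json.dumps(doc["clinical_extraction"]),
    ),
)
new_id = cur.lastrowid  # Postgres would use RETURNING id instead
```

Storing the nested structures as JSON columns keeps the schema stable even as the upstream tools add fields.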
Option B: Send to Webhook
If your EHR or research platform has an API, send the consolidated data via POST:
POST https://your-ehr-system.com/api/literature-review
With headers:
Authorization: Bearer YOUR_EHR_API_KEY
Content-Type: application/json
And body containing your consolidated JSON.
Option C: Store as Document
Use n8n's file write node to save as JSON, then push to cloud storage:
Path: /medical-papers/{{ $json.paper_metadata.title }}-{{ $json.processing_timestamp }}.json
Then add a Google Drive or S3 node to upload it to your preferred storage.
Error Handling and Conditional Logic
Add an IF node after each API call to check for failures:
// Illustrative condition for the IF node (using n8n's $node expression syntax)
if ($node["HTTPRequest1"].json.error === undefined) {
  // Continue to next step
} else {
  // Route to the failure branch and alert
  return {
    error: true,
    message: 'PDF summarisation failed',
    paper: $json.paper_title,
    api_error: $node["HTTPRequest1"].json.error
  };
}
When failures occur, route to a Notification node (Slack, email, or PagerDuty) so your team knows immediately.
The Manual Alternative
If you need finer control over outputs before they flow to the next tool, you can use n8n's built-in pause nodes. After the summarisation step, insert a "Wait for Webhook" node that sends you an email with the summary and asks for approval before proceeding to analysis.
This is slower, obviously, but useful when papers are particularly novel or touch on sensitive clinical areas where you want human review before the system draws conclusions. You could also manually edit the summarisation result before it feeds into the Terrakotta extraction step.
Alternatively, if you want to skip the orchestration layer entirely, you could run each tool in sequence yourself: paste the PDF into ai-pdf-summarizer, copy the summary into ExplainPaper, then feed those results into Terrakotta. This works, but you will spend several minutes per paper switching contexts and copying text. For a steady stream of papers, that friction multiplies quickly.
Pro Tips
1. Rate Limiting and Cost Control
Each API call costs money. The ai-pdf-summarizer typically charges per page (roughly £0.02-0.05 per page depending on plan). If you are processing many papers, batch them. Create a "collect papers" workflow that waits until 10 papers arrive, then processes them all in parallel using n8n's loop functionality. This reduces overhead API calls and saves around 15-20% on total processing cost.
// In a Code node, accumulate papers in workflow static data
// (Set nodes cannot run JavaScript; static data persists between executions)
const staticData = $getWorkflowStaticData('global');
staticData.pendingPapers = staticData.pendingPapers || [];
staticData.pendingPapers.push($json);

if (staticData.pendingPapers.length >= 10) {
  const batch = staticData.pendingPapers;
  staticData.pendingPapers = [];
  return batch.map(paper => ({ json: paper }));  // release the batch downstream
}
return [];  // fewer than 10 collected; keep waiting
2. Caching Summaries
If the same paper arrives from multiple sources (common with popular studies), cache the summary output in n8n's internal variable store. Before calling the PDF summariser, check if you have already processed this paper by matching DOI or title hash. Skip the API call if found.
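The deduplication key logic is simple enough to sketch in a few lines. Assuming (my assumption, not n8n's) a DOI-first key with a normalised title hash as fallback:

```python
import hashlib

_summary_cache = {}  # in n8n this would live in workflow static data or a DB table

def paper_key(doi, title):
    """Prefer the DOI; otherwise hash a case- and whitespace-normalised title."""
    if doi:
        return doi.strip().lower()
    normalised = " ".join(title.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def summarise_with_cache(doi, title, call_summariser):
    """call_summariser is whatever function actually hits the paid API;
    it only runs on a cache miss."""
    key = paper_key(doi, title)
    if key not in _summary_cache:
        _summary_cache[key] = call_summariser()
    return _summary_cache[key]
```

Normalising the title before hashing matters: the same study arriving from PubMed and a journal alert often differs only in casing or spacing.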
3. Monitor Token Usage
ExplainPaper and Terrakotta use language models internally. Watch your monthly token usage; it can creep up if you have nested prompts. Request longer analyses sparingly, and use the "concise" mode for preliminary screening.
4. Handle Malformed PDFs Gracefully
Some papers from older archives or OCR sources may fail summarisation. Wrap your first API call in a try-catch block. If summarisation fails, check the PDF's integrity using a separate file validation service before alerting your team.
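A cheap pre-flight check catches many bad files before you spend an API call. This heuristic (my own sketch, not a substitute for a full validation service) relies on two facts from the PDF file format: files start with the `%PDF-` magic bytes and well-formed files end with an `%%EOF` marker:

```python
def looks_like_pdf(data):
    """Heuristic integrity check on raw file bytes.
    Truncated downloads and mis-labelled files usually fail one of the two tests."""
    has_header = data[:1024].lstrip().startswith(b"%PDF-")
    has_trailer = b"%%EOF" in data[-1024:]
    return has_header and has_trailer
```

Run this on the downloaded bytes in a Code node; files that fail go straight to the alert branch instead of the summariser.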
5. Version Your Workflow
Before making changes, export your n8n workflow as JSON and commit it to version control (GitHub, GitLab, etc.). If an API change breaks something, you can revert quickly. This is especially important if multiple team members are relying on the pipeline.
Cost Breakdown
| Tool | Plan Needed | Monthly Cost | Notes |
|---|---|---|---|
| ai-pdf-summarizer-by-pdf-guru-ai | Pay-as-you-go or Pro (500 pages/month) | £15-50 | £0.02-0.05 per page; Pro plan recommended for steady use |
| ExplainPaper | Standard API | £20-40 | Usage-based; 1000 analyses/month typically £25 |
| Terrakotta AI | Clinical Extraction Plan | £30-60 | Per-analysis pricing; approximately £0.05-0.10 per extraction |
| n8n | Self-hosted (free) or Cloud Pro | £0-50 | Self-hosted is free; Cloud starts at £25/month for small workflows |
| Database (PostgreSQL) | Cloud-hosted or local | £10-100 | AWS RDS starts at £10/month; local setup is free |
| Total | — | £75-300 | Scales with paper volume; 50-100 papers/month at lower end |
The cost-benefit trade-off heavily favours automation. A researcher spending 30 minutes per paper reviewing, summarising, and extracting clinical points costs roughly £15-25 in labour (at typical research salaries). This workflow handles the same task in under 2 minutes of elapsed time, paying for itself after processing 10-15 papers.
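To make the payback arithmetic explicit, here is a quick calculation over the ranges quoted above; the 10-15 paper figure corresponds to mid-range assumptions, with the extremes bracketing it:

```python
# Break-even: monthly tooling cost divided by labour saved per paper
monthly_cost_low, monthly_cost_high = 75, 300   # £, total from the cost table
saving_low, saving_high = 15, 25                # £, labour saved per paper

best_case = monthly_cost_low / saving_high      # cheapest stack, £25 saved/paper
worst_case = monthly_cost_high / saving_low     # full stack, £15 saved/paper
# best_case = 3.0 papers, worst_case = 20.0 papers per month to break even
```

Even at the pessimistic end, a group digesting dozens of papers weekly clears the break-even point in the first few days of the month.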
If you process papers sporadically, use the pay-as-you-go tier. If your institution processes 100+ papers monthly across multiple researchers, the subscription plans offer better value.
This workflow scales well. Once configured, it runs identically whether you process 5 papers or 500. Your infrastructure cost stays roughly constant whilst labour cost drops toward zero.