Alchemy Recipe · Intermediate · Workflow

Academic paper research and literature review synthesis

24 March 2026

Introduction

Academic research has a depth problem. You find a promising paper, spend two hours reading it, extract the key findings, then realise the conclusions don't quite apply to your work. You move to the next paper. Repeat this cycle across fifty sources, and you've lost weeks to a process that feels more like manual labour than scholarship.

The real bottleneck isn't access to papers anymore; it's the extraction and synthesis phase. You need to read PDFs intelligently, understand what matters, summarise the core arguments, and cross-reference findings across multiple sources. Do this manually and you're copying paragraphs into a notes file. Do it without proper workflow design and you'll spend more time switching between tools than actually thinking about the research.

This is where combining three focused AI tools into an automated workflow becomes genuinely useful. Rather than bouncing between Chat with PDF, ExplainPaper, and Resoomer across multiple browser tabs, you can set up a pipeline that takes a research paper, extracts key concepts, generates structured summaries, and feeds those directly into your literature review document. No copy-pasting. No manual handoffs. Just structured data flowing from one tool to the next.

The Automated Workflow

Why These Three Tools

Chat with PDF by Copilotus excels at answering specific questions about PDF content; ExplainPaper is built for understanding dense academic writing and breaking down complex sections; Resoomer generates concise summaries in multiple formats. Together, they cover the full research extraction pipeline. Used manually, they require you to switch windows and repeat questions. Automated, they become a single system.

The orchestration challenge is real: these tools have different APIs, different response formats, and different rate limits. We'll use n8n as our primary example because it handles PDF workflows well, though I'll note where Zapier and Make differ.

Architecture Overview

The workflow has four stages:

  1. PDF input and metadata extraction.
  2. Section-level analysis through ExplainPaper.
  3. Full-document summarisation through Resoomer.
  4. Structured output to your literature review database.

Data flows as JSON at each stage. Each tool receives what it needs and outputs structured data for the next tool to consume.
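The stages above amount to one JSON payload accumulating fields as it passes through four transformations. A minimal sketch in plain Python, where the stage functions are hypothetical stand-ins for the three API calls (the hard-coded values mirror the example responses later in this recipe):

```python
def extract_metadata(payload):
    # Stage 1: the Chat with PDF metadata call would populate these fields.
    return {**payload, "pdf_id": "abc123def456", "page_count": 24}

def explain_sections(payload):
    # Stage 2: ExplainPaper breaks down the abstract and introduction.
    return {**payload, "abstract_explained": "...", "key_concepts": []}

def summarise(payload):
    # Stage 3: Resoomer condenses the full document.
    return {**payload, "summary": "...", "key_points": []}

def to_review_entry(payload):
    # Stage 4: reshape the accumulated payload into the review schema.
    return {
        "metadata": {k: payload[k] for k in ("paper_title", "authors")},
        "analysis": {k: payload[k] for k in ("abstract_explained", "summary")},
    }

payload = {
    "pdf_url": "https://arxiv.org/pdf/1706.03762.pdf",
    "paper_title": "Attention Is All You Need",
    "authors": "Vaswani et al.",
}
entry = to_review_entry(summarise(explain_sections(extract_metadata(payload))))
```

Each stage only adds keys, so any stage can be tested in isolation by feeding it a payload with the fields it expects.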

Step 1: Trigger and PDF Upload

Start with a webhook trigger in n8n. This lets you POST a JSON payload containing the PDF URL and metadata.


POST https://your-n8n-instance.com/webhook/research-pipeline

{
  "pdf_url": "https://arxiv.org/pdf/2401.00001.pdf",
  "paper_title": "Attention Is All You Need",
  "authors": "Vaswani et al.",
  "publication_year": 2017,
  "research_area": "Natural Language Processing"
}
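To kick the pipeline off programmatically, a short stdlib-only script can validate and POST that payload. The webhook URL and required field list come from the example above; the function names are mine, not part of any tool's API:

```python
import json
import urllib.request

# Hypothetical webhook endpoint; replace with your n8n instance URL.
WEBHOOK_URL = "https://your-n8n-instance.com/webhook/research-pipeline"

REQUIRED_FIELDS = ("pdf_url", "paper_title", "authors",
                   "publication_year", "research_area")

def build_payload(paper: dict) -> bytes:
    """Check the trigger payload has every field the workflow expects."""
    missing = [f for f in REQUIRED_FIELDS if f not in paper]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return json.dumps(paper).encode("utf-8")

def trigger_pipeline(paper: dict) -> int:
    """POST the payload to the webhook and return the HTTP status."""
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=build_payload(paper),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Loop over a list of fifty papers with `trigger_pipeline` and the rest of the workflow runs unattended.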

Add an HTTP Request node immediately after the webhook to fetch the PDF metadata. This is lightweight and helps you validate the URL before sending it to downstream tools.


GET https://api.copilotus.ai/v1/pdf/metadata

Headers:
Authorization: Bearer YOUR_COPILOTUS_API_KEY
Content-Type: application/json

Body:
{
  "pdf_url": "{{$json.pdf_url}}"
}

The response will look like this:

{
  "pdf_id": "abc123def456",
  "page_count": 24,
  "title": "Attention Is All You Need",
  "extraction_status": "ready"
}

Store this response in a variable. You'll reference the pdf_id in subsequent requests.

Step 2: Send to ExplainPaper for Section Analysis

ExplainPaper's API doesn't process entire PDFs at once; instead, you extract key sections and send them for analysis. In n8n, you'll need an intermediate step that identifies important sections.

The simplest approach: use Chat with PDF to extract an abstract and introduction. Create an HTTP Request node:


POST https://api.copilotus.ai/v1/chat

Headers:
Authorization: Bearer YOUR_COPILOTUS_API_KEY
Content-Type: application/json

Body:
{
  "pdf_id": "{{$json.pdf_id}}",
  "question": "Provide the abstract and the first two paragraphs of the introduction. Format as JSON with keys 'abstract' and 'introduction'."
}

This returns:

{
  "abstract": "...",
  "introduction": "...",
  "pages_analysed": 2
}

Now pass the abstract to ExplainPaper's API:


POST https://api.explainpaper.com/api/v2/explain

Headers:
Authorization: Bearer YOUR_EXPLAINPAPER_API_KEY
Content-Type: application/json

Body:
{
  "text": "{{$json.abstract}}",
  "explanation_depth": "intermediate"
}

ExplainPaper responds with a more readable breakdown:

{
  "original": "...",
  "explained": "...",
  "key_concepts": ["concept1", "concept2", "concept3"],
  "difficulty_level": 3
}

Add a Set node in n8n (name it "Merge") to combine these outputs:

{
  "pdf_id": "{{$json.pdf_id}}",
  "abstract_explained": "{{$('ExplainPaper').json.explained}}",
  "key_concepts": "{{$('ExplainPaper').json.key_concepts}}",
  "introduction_raw": "{{$('ChatWithPDF').json.introduction}}"
}

Step 3: Generate Structured Summary with Resoomer

Resoomer's strength is generating multi-format summaries. Send the full PDF URL:


POST https://api.resoomer.com/api/summarize

Headers:
Authorization: Bearer YOUR_RESOOMER_API_KEY
Content-Type: application/json

Body:
{
  "url": "{{$json.pdf_url}}",
  "output_format": "json",
  "summary_percentage": 30
}

Resoomer returns:

{
  "summary": "...",
  "key_points": [...],
  "sentences": [...],
  "original_word_count": 8500,
  "summary_word_count": 2550
}

Step 4: Build the Literature Review Entry

Create a final transformation node that combines all previous outputs into a structured literature review entry:

{
  "metadata": {
    "title": "{{$json.paper_title}}",
    "authors": "{{$json.authors}}",
    "year": "{{$json.publication_year}}",
    "research_area": "{{$json.research_area}}",
    "pdf_url": "{{$json.pdf_url}}"
  },
  "analysis": {
    "abstract": "{{$('Step3').json.abstract_explained}}",
    "key_concepts": "{{$('Step3').json.key_concepts}}",
    "summary": "{{$('Step4').json.summary}}",
    "key_points": "{{$('Step4').json.key_points}}"
  },
  "metadata_extracted": {
    "page_count": "{{$('MetadataExtraction').json.page_count}}",
    "summary_word_count": "{{$('Step4').json.summary_word_count}}"
  },
  "processed_timestamp": "{{$now.toIso()}}"
}

Send this to your database or document management system. If you use Airtable:


POST https://api.airtable.com/v0/YOUR_BASE_ID/Literature%20Review

Headers:
Authorization: Bearer YOUR_AIRTABLE_TOKEN
Content-Type: application/json

Body:
{
  "records": [
    {
      "fields": {
        "Title": "{{$json.metadata.title}}",
        "Authors": "{{$json.metadata.authors}}",
        "Year": "{{$json.metadata.year}}",
        "Research Area": "{{$json.metadata.research_area}}",
        "Abstract Explained": "{{$json.analysis.abstract}}",
        "Key Concepts": "{{$json.analysis.key_concepts.join(', ')}}",
        "Summary": "{{$json.analysis.summary}}",
        "Key Points": "{{$json.analysis.key_points.join('; ')}}",
        "PDF URL": "{{$json.metadata.pdf_url}}"
      }
    }
  ]
}
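Building that Airtable request body is a pure transformation of the Step 4 entry, so it is easy to test outside the workflow. A sketch using the same (hypothetical) column names, with `entry` structured as in Step 4:

```python
def airtable_record(entry: dict) -> dict:
    """Map a literature review entry onto the Airtable fields above."""
    meta, analysis = entry["metadata"], entry["analysis"]
    return {
        "records": [{
            "fields": {
                "Title": meta["title"],
                "Authors": meta["authors"],
                "Year": meta["year"],
                "Research Area": meta["research_area"],
                # List fields are flattened to strings for Airtable text columns.
                "Abstract Explained": analysis["abstract"],
                "Key Concepts": ", ".join(analysis["key_concepts"]),
                "Summary": analysis["summary"],
                "Key Points": "; ".join(analysis["key_points"]),
                "PDF URL": meta["pdf_url"],
            }
        }]
    }
```

If you later switch storage backends, only this mapping function changes; the rest of the pipeline is untouched.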

The entire workflow is now automated. Feed it a PDF URL and it handles extraction, explanation, summarisation, and storage without you touching any tool directly.

Using Zapier Instead

If you prefer Zapier, the webhook trigger works the same way, but you'll chain HTTP actions rather than nodes. Zapier's advantage is simplicity; its disadvantage is that multi-step transformations require more boilerplate. Create a Zap with:

  1. Webhook by Zapier trigger.
  2. HTTP POST to Chat with PDF.
  3. HTTP POST to ExplainPaper.
  4. HTTP POST to Resoomer.
  5. Airtable Create Record action.

Each step pulls output from the previous one using Zapier's variable syntax.

Using Make (Integromat)

Make's interface is more visual than n8n but similarly capable. The workflow structure is identical; the main difference is that Make calls them "modules" rather than "nodes", and the variable reference syntax uses curly braces: {{module_name.data.field_name}}.

Using Claude Code for Complex Transformations

If your transformation logic is too complex for built-in nodes, embed a custom code step (n8n's Code node; you can draft the script with Claude Code). It executes Python and can handle intricate JSON manipulations:

import json

def build_literature_entry(input_data):
    pdf_id = input_data['pdf_id']
    explained_abstract = input_data['abstract_explained']
    key_concepts = input_data['key_concepts']
    summary = input_data['summary']
    key_points = input_data['key_points']

    # Custom logic: flag key points that mention research gaps
    research_gaps = []
    gap_keywords = ['future work', 'open questions', 'limitations', 'further research']

    for point in key_points:
        if any(keyword in point.lower() for keyword in gap_keywords):
            research_gaps.append(point)

    # Build output
    literature_entry = {
        'pdf_id': pdf_id,
        'abstract_explained': explained_abstract,
        'key_concepts': key_concepts,
        'summary': summary,
        'identified_gaps': research_gaps,
        'gap_count': len(research_gaps)
    }

    return json.dumps(literature_entry)

This is useful when your literature review needs custom metadata extraction or when you want to flag papers by specific criteria.

The Manual Alternative

Nothing wrong with processing papers manually if your pipeline is small. Open the PDF in your browser, paste sections into Chat with PDF, ask specific questions, copy the answers into a notes file, then repeat with ExplainPaper and Resoomer.

This works fine for five to ten papers. Beyond that, the repetition becomes tedious and error-prone. You'll forget to extract certain details, miss connections between papers, or lose track of which summary belongs to which source.

If you need flexibility and don't mind slower throughput, this is acceptable. If you're processing a literature review of thirty or more papers, automation pays for itself within the first week.

Pro Tips

Rate Limit Handling

All three tools have rate limits. Chat with PDF allows roughly 100 requests per hour on the free tier; ExplainPaper is more generous at 500 per day; Resoomer limits to 50 summarisations per day on paid plans. Build in delays using n8n's Wait node or Zapier's Delay action:


Add a Wait node between each step.
Set duration to 2 seconds.
This prevents accidental rate limit breaches.

For large batches, queue papers in a spreadsheet and process them with a scheduled workflow that fires once per hour. This distributes requests naturally.
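The hourly-queue idea is just chunking: split the backlog so each scheduled run stays under the tightest per-hour limit. A sketch (the default of 100 is the Chat with PDF hourly figure quoted above; the function name is mine):

```python
def hourly_batches(papers: list, per_hour: int = 100) -> list:
    """Split a backlog of paper URLs into batches, one batch per
    scheduled run, so no single run exceeds the hourly rate limit."""
    return [papers[i:i + per_hour] for i in range(0, len(papers), per_hour)]
```

Feed each batch to the scheduled workflow in turn; a fifty-paper review at 100 requests/hour clears in a single run, while larger corpora spread naturally across the day.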

Error Handling and Fallbacks

PDFs sometimes fail to parse, especially scanned documents. Add conditional logic to catch extraction failures. In n8n, use an If node after the PDF metadata request:

$json.extraction_status === "ready"

If true, proceed to analysis. If false, log the error and skip to manual review. This prevents the workflow from crashing on bad input.
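The same branch can be unit-tested outside n8n. A one-line Python equivalent of the If node condition (the function name and routing labels are mine):

```python
def route_paper(metadata: dict) -> str:
    """Mirror the If node: only PDFs whose extraction_status is
    'ready' continue down the automated analysis path."""
    if metadata.get("extraction_status") == "ready":
        return "analyse"
    return "manual_review"
```

Using `.get()` means a missing status field also falls through to manual review rather than raising mid-workflow.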

Cost Optimisation

Resoomer's summarisation is the priciest step. If you're processing fifty papers, running Resoomer on all of them adds up quickly. Instead, only summarise papers that pass a relevance filter. After Chat with PDF extracts the abstract, score its relevance to your research question using Claude's API (cheap) before triggering Resoomer.
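As a sketch of that gating logic, with a crude keyword-overlap score standing in for the actual Claude API call, the filter might look like this (threshold and scoring are illustrative assumptions, not tuned values):

```python
def relevance_score(abstract: str, research_question: str) -> float:
    """Fraction of substantive research-question terms found in the
    abstract. A stand-in for an LLM-based relevance judgement."""
    terms = {w for w in research_question.lower().split() if len(w) > 3}
    found = sum(1 for t in terms if t in abstract.lower())
    return found / len(terms) if terms else 0.0

def should_summarise(abstract: str, research_question: str,
                     threshold: float = 0.4) -> bool:
    # Only trigger the pricier Resoomer step above this threshold.
    return relevance_score(abstract, research_question) >= threshold
```

Even this rough filter cuts obviously off-topic papers before they incur summarisation costs; swapping the scorer for a Claude call changes one function, not the workflow.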

Similarly, don't run ExplainPaper on every section. Use it selectively on methodology and results sections where clarity matters most.

Tracking and Audit Trails

Add a logging step at the end of your workflow. Store the entire JSON payload (before sending to your database) in a separate Airtable or Google Sheets log. This gives you an audit trail; if a paper's summary looks wrong, you can trace exactly what the APIs returned and debug accordingly.
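A log row only needs a timestamp, the payload itself, and ideally a content hash so a reprocessed paper can be diffed against its earlier run. A minimal sketch (the row schema is my own, not any tool's):

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_row(payload: dict) -> dict:
    """Flatten the final workflow payload into one audit-log row."""
    raw = json.dumps(payload, sort_keys=True)  # stable ordering for hashing
    return {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "payload_hash": hashlib.sha256(raw.encode("utf-8")).hexdigest(),
        "payload_json": raw,
    }
```

Append one such row per paper to the log sheet; identical hashes on two runs mean the APIs returned the same data both times.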

Handling Different PDF Types

Preprints from arXiv are clean and parse well. Institutional repositories and journal PDFs sometimes have headers, footers, and page numbers embedded in extracted text. Before sending to ExplainPaper, run a quick text cleaning step that removes common artefacts:

import re

def clean_extracted_text(text):
    # Remove standalone page numbers
    text = re.sub(r'\n\d+\n', '\n', text)
    # Remove common headers/footers
    text = re.sub(r'(Author manuscript|Final version)', '', text)
    # Collapse excessive whitespace
    text = re.sub(r'\n{3,}', '\n\n', text)
    return text

Cost Breakdown

Tool                    | Plan Needed | Monthly Cost | Notes
Chat with PDF           | Pro         | £7           | 100+ requests/hour; most frequent tool in workflow
ExplainPaper            | Basic       | £10          | 500 explanations per day; selective use recommended
Resoomer                | Premium     | £8           | 50 summarisations daily; rate-limited, plan usage carefully
n8n (self-hosted)       | Free        | £0           | Runs on your server; no per-execution fees
n8n (cloud)             | Starter     | £10          | If you prefer not to self-host
Zapier                  | Starter     | £29          | More expensive but easier to set up; ideal if you lack infrastructure
Make (Integromat)       | Core        | £9           | Mid-point between n8n and Zapier in cost and complexity
Airtable                | Plus        | £12          | Stores literature review entries; alternative is Google Sheets (free)
Total (n8n self-hosted) |             | £25-27       | Assumes selective use of rate-limited tools
Total (Zapier)          |             | £47-50       | Higher but all-in-one solution

For a typical academic doing a literature review on a budget, self-hosting n8n with free Google Sheets storage keeps costs under £30 per month. Running twenty papers through this pipeline costs roughly £1.50 in API calls.