Alchemy Recipe · Intermediate · Workflow

Academic paper research and literature review synthesis


Academic researchers spend weeks extracting information from dozens of papers, manually copying passages into spreadsheets, then synthesising findings across documents. You read a paper in PDF format, switch to another tool to understand complex concepts, then copy summaries into a third application. Each handoff is a friction point, a place where focus breaks and time drains away.

This workflow becomes worse at scale. When you're managing 50 papers for a literature review, the manual effort compounds. You might use ChatGPT's document upload feature for one paper, then switch to ExplainPaper for clarification on a specific concept, then paste findings into a summary document. The data never flows automatically between these steps, so you're constantly context-switching and re-entering information.

The solution is to automate the entire pipeline: ingest PDFs, extract key insights, clarify complex sections, and generate a structured literature review. With the right combination of AI tools and an orchestration platform, you can have a system that processes 20 papers overnight and delivers a synthesised summary ready for further analysis. This is the Alchemy approach, where we eliminate manual handoffs entirely.

The Automated Workflow

We'll use three complementary AI tools and an orchestration platform to build a zero-handoff literature review system. Here's how they fit together:

  • Chat with PDF (by Copilotus) extracts structured information from academic papers and answers specific questions about their content.

  • ExplainPaper breaks down dense technical sections into plain language.

  • Resoomer AI generates concise summaries and identifies key themes across multiple documents.

The orchestration layer (we'll show examples for both Zapier and n8n) ties these together, passing data forward without manual intervention.

Choosing Your Orchestration Tool

For this workflow, I recommend n8n if you need fine-grained control over data transformation, or Zapier if you want simplicity with minimal configuration. Make (formerly Integromat) works well too, but n8n's JSON editor gives you better flexibility when reshaping data between API calls.

Setting Up the Core Workflow

Step 1: Trigger and PDF Input

The workflow begins when you add a new PDF to a folder or submit one via webhook. For this example, we'll use a simple HTTP webhook trigger that receives the PDF URL and paper metadata (title, authors, publication year).

In n8n, create a webhook node:


POST /litreview-intake
Content-Type: application/json

{
  "pdf_url": "https://example.com/paper.pdf",
  "title": "Deep Learning in Medical Imaging",
  "authors": "Smith et al.",
  "year": 2023
}

In Zapier, use the Webhooks by Zapier trigger with the same JSON structure. When a PDF arrives via this webhook, the workflow begins automatically.
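If you want to trigger the workflow from a script rather than by hand, the intake call can be sketched in Python with the standard library. This is a minimal sketch: the webhook URL is a placeholder for your own n8n or Zapier endpoint, and `submit_paper` simply posts the same JSON shown above.

```python
import json
import urllib.request

def build_intake_payload(pdf_url, title, authors, year):
    """Assemble the intake JSON expected by the webhook trigger."""
    return {
        "pdf_url": pdf_url,
        "title": title,
        "authors": authors,
        "year": year,
    }

def submit_paper(webhook_url, payload):
    """POST the payload to the orchestrator's webhook (network call)."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

payload = build_intake_payload(
    "https://example.com/paper.pdf",
    "Deep Learning in Medical Imaging",
    "Smith et al.",
    2023,
)
# submit_paper("https://your-n8n-instance/webhook/litreview-intake", payload)
```

The `submit_paper` call is left commented out so the sketch runs without a live endpoint; a small loop over a bibliography file would let you queue a whole batch of papers in one go.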

Step 2: Extract Content with Chat with PDF

The Chat with PDF API accepts a document URL and processes it for querying. You'll send several structured prompts to extract different types of information: research questions, methodologies, key findings, and conclusions.

Here's the API call structure:


POST https://api.copilotus.com/v1/chat
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "document_url": "{{pdf_url}}",
  "query": "What is the main research question this paper addresses?"
}

In n8n, create an HTTP request node with this configuration. Then chain additional HTTP nodes for follow-up queries:


POST https://api.copilotus.com/v1/chat
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "document_url": "{{pdf_url}}",
  "query": "Summarise the methodology in 100 words or less"
}

And another for key findings:


POST https://api.copilotus.com/v1/chat
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "document_url": "{{pdf_url}}",
  "query": "List the main findings and conclusions"
}

The Chat with PDF API returns responses in this format:

{
  "answer": "The paper investigates how transformer-based models improve diagnostic accuracy in chest X-ray analysis compared to convolutional neural networks.",
  "confidence": 0.94,
  "source_pages": [1, 2]
}

Capture each response and store it in an intermediate object. In n8n's Set node, combine these into a structured record:

{
  "paper_id": "{{title}}_{{year}}",
  "title": "{{title}}",
  "authors": "{{authors}}",
  "year": {{year}},
  "research_question": "{{step2_response.answer}}",
  "methodology": "{{step3_response.answer}}",
  "findings": "{{step4_response.answer}}"
}

Step 3: Clarify Complex Concepts with ExplainPaper

Academic papers often contain dense technical passages. Rather than relying on a single extraction, we can send particularly challenging sections to ExplainPaper for detailed explanation. This step is conditional: it triggers only when the Chat with PDF confidence score is below 0.85, indicating uncertainty.

ExplainPaper's API accepts text passages and returns plain-language breakdowns:


POST https://api.explainpaper.com/v1/explain
Authorization: Bearer YOUR_EXPLAINPAPER_KEY
Content-Type: application/json

{
  "text": "{{complex_passage}}",
  "context": "academic paper on medical imaging",
  "detail_level": "intermediate"
}

The response structure looks like this:

{
  "original": "The utilisation of graph convolutional networks with attention mechanisms...",
  "explanation": "The model uses a type of neural network that works with connected data, combined with a focus system that weights important connections.",
  "key_concepts": [
    {"term": "graph convolutional networks", "definition": "..."},
    {"term": "attention mechanisms", "definition": "..."}
  ]
}

In n8n, add a conditional node (IF statement) that checks the Chat with PDF confidence:


{{step2_response.confidence}} < 0.85

If true, route to the ExplainPaper node. Merge the explanation back into your paper record:

{
  "...previous fields...",
  "clarification_needed": true,
  "clarification": "{{explainpaper_response.explanation}}"
}

Step 4: Synthesise Across Multiple Papers with Resoomer

Once you've processed individual papers, Resoomer AI combines multiple summaries and identifies common themes. This is where you move from individual paper analysis to literature review synthesis.

Resoomer's API accepts a batch of documents and returns thematic groupings:


POST https://api.resoomer.com/v1/synthesise
Authorization: Bearer YOUR_RESOOMER_KEY
Content-Type: application/json

{
  "documents": [
    {
      "paper_id": "smith_2023",
      "title": "Deep Learning in Medical Imaging",
      "content": "{{findings_from_chat_with_pdf}}"
    },
    {
      "paper_id": "jones_2023",
      "title": "Transformer Models for Healthcare",
      "content": "{{findings_from_second_paper}}"
    }
  ],
  "focus_areas": ["methodology", "findings", "applications"],
  "output_format": "thematic_summary"
}

The response provides grouped insights:

{
  "themes": [
    {
      "theme": "Deep Learning Architectures",
      "papers": ["smith_2023", "jones_2023"],
      "summary": "Both papers demonstrate improvements using modern neural network architectures, with transformer models showing particular promise.",
      "consensus_areas": ["accuracy improvements", "reduced training time"],
      "disputed_areas": ["computational cost trade-offs"]
    }
  ],
  "research_gaps": ["Limited discussion of real-world deployment costs"],
  "recommendations_for_further_study": [...]
}

Step 5: Store and Format Output

Finally, send the complete synthesis to a structured destination. Use Google Sheets for collaborative review, a database for archival, or email for immediate notification.

For Google Sheets via n8n:


{
  "title": "{{paper_title}}",
  "authors": "{{authors}}",
  "year": {{year}},
  "research_question": "{{research_question}}",
  "methodology": "{{methodology}}",
  "key_findings": "{{findings}}",
  "themes": "{{resoomer_themes}}",
  "clarifications_needed": "{{clarification}}",
  "processed_date": "{{now()}}"
}

Create a Google Sheets node in n8n that appends a new row with this data. If you're using Zapier, connect to Google Sheets via the built-in integration.

Alternatively, for a database like PostgreSQL or Airtable, use an Insert or Create Record node. Airtable example:


{
  "fields": {
    "Title": "{{paper_title}}",
    "Authors": "{{authors}}",
    "Year": {{year}},
    "Research Question": "{{research_question}}",
    "Methodology": "{{methodology}}",
    "Key Findings": "{{findings}}",
    "Themes Identified": "{{resoomer_themes}}",
    "Status": "Processed"
  }
}

Complete n8n Workflow Structure

Here's how the nodes connect in n8n:

  1. Webhook Trigger → receives PDF URL and metadata
  2. HTTP Request (Chat with PDF: Research Question)
  3. HTTP Request (Chat with PDF: Methodology)
  4. HTTP Request (Chat with PDF: Findings)
  5. Set Node → combines responses
  6. IF Condition → checks confidence score
  7. HTTP Request (ExplainPaper) → if confidence < 0.85
  8. Set Node → merges clarification
  9. Function Node → formats for Resoomer (only runs after processing 3+ papers)
  10. HTTP Request (Resoomer) → synthesises across papers
  11. Google Sheets Append → stores structured results

The entire workflow runs asynchronously. Once triggered, it completes in 30 seconds to 2 minutes depending on API response times.

The Manual Alternative

If you prefer more control at each step, you can run this workflow semi-automatically. Upload a PDF to Chat with PDF and review the extracted information. Copy sections you want clarified, paste them into ExplainPaper, and review the explanations. Once you've processed all papers, manually compile findings and send the list to Resoomer for synthesis.

This approach keeps you informed at each stage but requires your active input. It's useful when papers are highly specialised or when you need to make judgment calls about what constitutes a key finding. The trade-off is that you lose the "zero manual handoff" benefit and must manage the context-switching yourself.

For smaller literature reviews (5-10 papers), this semi-manual approach might actually be faster than setting up the full automation. For reviews exceeding 15 papers, the automated workflow pays for itself in time savings.

Pro Tips

Rate Limiting and Batching

Chat with PDF typically allows 30 requests per minute on free plans. If you're processing many papers, stagger requests using n8n's Rate Limit node or add delays between API calls. Set a 2-second pause between each Chat with PDF request to avoid hitting limits.


{
  "type": "rate_limit",
  "limit": 30,
  "window": "1 minute"
}

Handling Failed API Calls

Add error handling to gracefully manage timeouts and API unavailability. In n8n, attach a "Catch" node to your HTTP request nodes. If Chat with PDF fails, store the paper metadata and send a notification so you can retry manually.


{
  "error": true,
  "paper_id": "{{paper_id}}",
  "failed_step": "Chat with PDF extraction",
  "error_message": "{{error.message}}",
  "retry_url": "{{webhook_url}}"
}

Customising Extraction Prompts

The quality of Chat with PDF responses depends on your prompts. Rather than generic queries, specify exactly what information matters for your review. Instead of "Summarise the paper", ask "What statistical methods did the authors use, and what were the reported confidence intervals?"

Store prompt templates in n8n as environment variables:


PROMPT_METHODOLOGY: "Describe the methodology in detail. Include sample size, data sources, statistical tests, and any limitations mentioned by the authors."

PROMPT_FINDINGS: "List the main findings with effect sizes or key metrics. Highlight any surprising results that contradict prior research."

PROMPT_LIMITATIONS: "What limitations did the authors acknowledge? Were there methodological constraints or applicability limitations?"

Cost Optimisation

Chat with PDF charges per document processed, not per query. If you make five queries per document, the cost is the same as one query (typically $0.05-0.10 per document). However, ExplainPaper and Resoomer charge per unit of content analysed. Minimise calls by batching clarifications. Only send passages to ExplainPaper if the initial extraction has low confidence.

Incremental Synthesis

Rather than waiting until you've processed all papers, run Resoomer after every 5-10 papers. This gives you early visibility into emerging themes and helps you refine search terms for additional papers. Store intermediate synthesis results in a database so you can track how themes evolve as you add more papers.

Cost Breakdown

| Tool | Plan Needed | Monthly Cost | Notes |
| --- | --- | --- | --- |
| Chat with PDF by Copilotus | Pay-as-you-go | £3–15 | £0.05–0.10 per document; one-time charge regardless of query count |
| ExplainPaper | Free or Pro | £0–12 | Free for 10 documents/month; Pro at £9/month for 500 documents |
| Resoomer AI | Standard or Plus | £0–8 | Free tier includes 5 syntheses/month; Plus at £8/month for 100+ |
| n8n Cloud | Free or Starter | £0–10 | Free tier includes 1,000 executions/month; Starter at £10/month for 10,000 executions |
| Zapier | Free or Premium | £0–99 | Free tier adequate for < 50 papers/month; Premium plans for higher volume |
| Google Sheets/Airtable | Free | £0 | No additional cost if using free tiers; Airtable Standard at £6/month for larger bases |

For a typical academic doing one literature review per month across 30 papers, expect total monthly costs of £15–30 across all tools. Most of this is the per-document charge from Chat with PDF.
