Introduction
Academic researchers spend enormous amounts of time on a task that is fundamentally repetitive: reading papers, extracting key findings, organising citations, and synthesising information into coherent literature reviews. You might spend three hours reading a single dense PDF, another hour taking notes, then a further two hours cross-referencing citations and formatting them correctly. Multiply that across twenty or fifty papers, and you are looking at weeks of manual work that could be compressed into days.
The real problem is not any single tool's shortfall. Chat-with-PDF tools can answer questions about individual papers. Summarisation tools can condense lengthy abstracts. Citation managers can track references. But moving information between these tools requires manual copying, pasting, and reformatting. Every handoff introduces friction, inconsistency, and the risk of lost context.
This workflow eliminates those friction points entirely. By wiring together a PDF chat interface, a paper explanation service, and a summarisation engine within an orchestration platform, you create a system that takes a folder of research papers and outputs a structured, citation-ready literature review with zero manual handoff. You send the papers in; formatted findings come out.
The Automated Workflow
Choosing Your Orchestration Platform
For this workflow, n8n offers the best balance of ease-of-use and capability. It runs on your own infrastructure or through their hosted cloud service, integrates directly with webhook-based APIs, and includes built-in HTTP request nodes that let you authenticate and communicate with third-party services. Make (Integromat) is a viable alternative if you prefer a fully cloud-hosted solution, though its interface can feel less intuitive for complex multi-step workflows. Zapier works, but its reliance on pre-built integrations means you will hit limitations if these tools do not have direct Zapier connectors.
We will build this workflow in n8n. If you prefer Claude Code, you can achieve the same result in a single Python script; that approach works well for one-off runs or smaller batch jobs. If you use Zapier or Make, the principles remain identical; only the interface changes.
The Data Flow
Here is what happens from start to finish:
1. A new PDF arrives in a cloud folder (Google Drive, OneDrive, or Dropbox).
2. Your orchestration tool detects the new file and downloads it.
3. The PDF is sent to Chat-with-PDF (via Copilotus) with a structured prompt asking for key findings, methodology, and conclusions.
4. The same PDF goes to ExplainPaper, requesting a simplified explanation of the main claims and evidence.
5. A summarisation request is sent to SMMRY to condense the extracted information further.
6. All three responses are combined, formatted, and appended to a shared Google Doc or sent as a structured JSON file.
7. Citations are automatically extracted and formatted in a standard citation format (APA, Harvard, or Chicago).
8. A notification is sent to you indicating the paper has been processed and the review is ready.
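The steps above can be sketched as a simple per-paper pipeline. The function and parameter names here are illustrative only; each stage is a placeholder for the real service call configured later in this guide:

```python
# Sketch of the pipeline: each callable stands in for one service call
# (Chat-with-PDF, ExplainPaper, SMMRY respectively).
def process_paper(pdf_path, ask_pdf, explain, summarise):
    """Run one paper through all three services and merge the results."""
    return {
        'source_file': pdf_path,
        'key_findings': ask_pdf(pdf_path),   # Chat-with-PDF step
        'explanation': explain(pdf_path),    # ExplainPaper step
        'summary': summarise(pdf_path),      # SMMRY step
    }

# Example with stubbed services:
review = process_paper(
    'paper.pdf',
    ask_pdf=lambda p: 'findings',
    explain=lambda p: 'explanation',
    summarise=lambda p: 'summary',
)
```

The orchestration platform's job is simply to run this merge for every new file and deliver the result somewhere useful.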
Setting Up the Workflow in n8n
First, set up n8n: either self-host an instance (free) or create an account on their cloud service. You will need API keys from each service in the workflow.
Step 1: File Trigger
Add a Google Drive trigger node. Configure it to monitor a specific folder for new PDFs.
```json
{
  "resource": "file",
  "operation": "watch",
  "driveId": "your-google-drive-id",
  "folderId": "your-folder-id",
  "pollInterval": 300
}
```
When a new PDF appears, the workflow fires automatically.
Step 2: Download the PDF
Add a Google Drive node to download the file. This retrieves the binary content of the PDF, which you will pass to downstream services.
```json
{
  "resource": "file",
  "operation": "download",
  "fileId": "{{ $json.id }}"
}
```
Store the binary output in a variable; you will reference it multiple times.
Step 3: Extract Text and Send to Chat-with-PDF
Chat-with-PDF (Copilotus) does not have a standard REST API, but you can interact with it via a webhook or by embedding calls within an HTTP request node. The recommended approach is to use their API endpoint if available, or to configure a webhook that receives the PDF URL.
If Copilotus provides an API endpoint, your HTTP request node will look like this:
```
POST https://api.copilotus.chat/v1/documents/upload

{
  "file": "<binary PDF content>",
  "apiKey": "your-copilotus-api-key",
  "questions": [
    "What are the key findings of this paper?",
    "What methodology did the authors use?",
    "What are the main conclusions and limitations?"
  ]
}
```
The response will contain answers to your questions. Store this output in a variable called chatWithPdfResponse.
Step 4: Send to ExplainPaper
ExplainPaper provides an API that accepts PDF URLs or document IDs. If you have uploaded the PDF to a cloud storage service with a public URL, you can pass that directly. Otherwise, convert the binary to base64 and send it as a data URI.
```
POST https://api.explainpaper.com/v1/documents/explain

{
  "document_url": "{{ $json.pdf_url }}",
  "api_key": "your-explainpaper-key",
  "focus": "main_claims_and_evidence",
  "output_format": "structured"
}
```
Store the result in explainPaperResponse.
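If you go the base64 route mentioned above, the conversion is a one-liner in Python (the function name is ours; whether a given API accepts data URIs in place of URLs is service-dependent):

```python
import base64

def to_data_uri(pdf_bytes):
    """Encode raw PDF bytes as a data: URI for APIs that accept one in place of a URL."""
    encoded = base64.b64encode(pdf_bytes).decode('ascii')
    return f'data:application/pdf;base64,{encoded}'
```

Pass the result of `to_data_uri(open(path, 'rb').read())` wherever the service expects a document URL.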
Step 5: Summarise with SMMRY
SMMRY is a straightforward summarisation service. Extract the text from the PDF (you can do this with a dedicated PDF-to-text node or via a text extraction library), then send it to SMMRY.
```
POST https://api.smmry.com/SM_API

{
  "sm_api_input": "<extracted PDF text>",
  "SM_LENGTH": 5,
  "api_key": "your-smmry-key"
}
```
The SM_LENGTH parameter controls the number of sentences in the summary; 5 is a reasonable starting point for a single-paragraph summary. Store the response in smmryResponse.
Step 6: Combine and Format Responses
Add a function node to merge all three responses into a single structured document. This is where you create the citation-ready format.
```javascript
const chatResponse = $json.chatWithPdfResponse.content;
const explainResponse = $json.explainPaperResponse.explanation;
const summaryResponse = $json.smmryResponse.sm_api_output;

const combinedReview = {
  paper_title: $json.pdf_title,
  source_file: $json.pdf_name,
  extracted_date: new Date().toISOString(),
  summary: summaryResponse,
  key_findings: chatResponse,
  simplified_explanation: explainResponse,
  citation: {
    format: "Harvard",
    text: `${$json.pdf_author} (${$json.pdf_year}). ${$json.pdf_title}. ${$json.pdf_journal}.`
  }
};

return combinedReview;
```
This produces a clean JSON structure that can be appended to a spreadsheet, database, or document.
Step 7: Write to Google Docs or Sheets
Add a Google Sheets node to append the structured data. Each row represents one paper; columns contain the title, summary, key findings, explanation, and citation.
Alternatively, use a Google Docs node to append to a formatted document. This creates a more readable output suitable for sharing with colleagues.
```json
{
  "resource": "document",
  "operation": "append",
  "documentId": "your-google-doc-id",
  "content": {
    "text": "{{ $json.combinedReview.paper_title }}\n\n{{ $json.combinedReview.summary }}\n\nKey Findings:\n{{ $json.combinedReview.key_findings }}\n\nCitation:\n{{ $json.combinedReview.citation.text }}"
  }
}
```
Step 8: Send Notification
Add a Slack or email notification node to alert you when the workflow completes.
```
POST https://hooks.slack.com/services/YOUR/WEBHOOK/URL

{
  "text": "Literature review for {{ $json.pdf_title }} is ready",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "Paper: *{{ $json.pdf_title }}*\nSummary: {{ $json.combinedReview.summary }}"
      }
    }
  ]
}
```
Save the workflow, activate it, and test with a single PDF. Once it runs successfully, you can upload batches of papers and let the system process them overnight.
The Manual Alternative
If you prefer more control and do not want to set up an orchestration platform, you can run this workflow as a Python script locally or within Claude Code.
Create a script that iterates through PDFs in a folder, calls each API sequentially, and appends results to a CSV or JSON file. This approach is slower than automation but requires no ongoing infrastructure and gives you complete control over every step.
```python
import base64
import json
import os
from datetime import datetime

import requests
from pypdf import PdfReader  # pip install pypdf

# API keys are read from environment variables rather than hard-coded
COPILOTUS_KEY = os.environ['COPILOTUS_KEY']
EXPLAINPAPER_KEY = os.environ['EXPLAINPAPER_KEY']
SMMRY_KEY = os.environ['SMMRY_KEY']

papers = [f for f in os.listdir('./papers') if f.endswith('.pdf')]
results = []

for paper in papers:
    pdf_path = f'./papers/{paper}'

    # Chat-with-PDF: upload the PDF binary
    with open(pdf_path, 'rb') as pdf_file:
        chat_response = requests.post(
            'https://api.copilotus.chat/v1/documents/upload',
            headers={'Authorization': f'Bearer {COPILOTUS_KEY}'},
            files={'file': pdf_file}
        ).json()

    # ExplainPaper: the API cannot reach a local path, so send the file
    # as a base64 data URI (or host it at a public URL instead)
    with open(pdf_path, 'rb') as pdf_file:
        encoded = base64.b64encode(pdf_file.read()).decode('ascii')
    explain_response = requests.post(
        'https://api.explainpaper.com/v1/documents/explain',
        json={'document_url': f'data:application/pdf;base64,{encoded}'},
        headers={'Authorization': f'Bearer {EXPLAINPAPER_KEY}'}
    ).json()

    # SMMRY expects plain text, so extract it from the PDF first
    reader = PdfReader(pdf_path)
    text = '\n'.join(page.extract_text() or '' for page in reader.pages)
    smmry_response = requests.post(
        'https://api.smmry.com/SM_API',
        data={'sm_api_input': text, 'SM_LENGTH': 5, 'api_key': SMMRY_KEY}
    ).json()

    # Combine
    result = {
        'title': paper,
        'summary': smmry_response.get('sm_api_output'),
        'findings': chat_response.get('answers'),
        'explanation': explain_response.get('explanation'),
        'processed_at': datetime.now().isoformat()
    }
    results.append(result)

with open('literature_review.json', 'w') as f:
    json.dump(results, f, indent=2)

print(f"Processed {len(results)} papers")
```
Run this script whenever you have a batch of papers. It takes 2-5 minutes per paper depending on file size and API response times.
Pro Tips
Rate Limit Management
Each service has rate limits. Chat-with-PDF typically allows 10-20 requests per minute on free plans. SMMRY allows 200 per day. Add delays between API calls within your n8n workflow to avoid hitting limits.
In n8n, add a delay node between each API request:

```json
{
  "delayMs": 3000
}
```

This adds a three-second pause between calls.
If you are processing more than 50 papers regularly, consider upgrading to paid plans or staggering processing across multiple days.
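In the manual script, the same effect needs client-side throttling. A minimal sketch (the class and its interface are ours; the three-second interval matches the delay node above):

```python
import time

class RateLimiter:
    """Enforce a minimum gap between calls, e.g. 3 s between API requests."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self._last_call = None

    def wait(self, now=None, sleep=time.sleep):
        """Sleep just long enough to honour the interval; returns the delay used."""
        now = time.monotonic() if now is None else now
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval_s - (now - self._last_call))
        if delay:
            sleep(delay)
        self._last_call = now + delay
        return delay
```

Call `limiter.wait()` immediately before each `requests.post` and the batch will never exceed one request per interval, regardless of how fast the APIs respond.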
Error Handling
PDFs with embedded images, scanned documents, or unusual formatting may fail. Add try-catch logic to your orchestration to skip problematic files and log them separately.
In n8n, use a Try-Catch node around your Chat-with-PDF and ExplainPaper requests. When an error occurs, capture the file name and error message to a spreadsheet so you can review it manually later.
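In the manual script, the equivalent is a wrapper that skips failures and records them for later review. A sketch (the function names are illustrative; `process` is whatever per-paper function you use):

```python
def process_with_logging(files, process, error_log):
    """Run `process` on each file, skipping failures and recording them.

    Failures are appended to `error_log` as (filename, message) pairs so
    problem PDFs can be reviewed manually without stopping the batch.
    """
    results = []
    for name in files:
        try:
            results.append(process(name))
        except Exception as exc:
            error_log.append((name, str(exc)))
    return results

# Example with a stub that fails on one file:
def stub_process(name):
    if name == 'bad.pdf':
        raise ValueError('scanned image only')
    return name.upper()

errors = []
ok = process_with_logging(['a.pdf', 'bad.pdf'], stub_process, errors)
# ok == ['A.PDF'], errors == [('bad.pdf', 'scanned image only')]
```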
Cost Optimisation
SMMRY is free for up to 200 summaries per day. Chat-with-PDF by Copilotus charges per upload on the free tier; their Pro plan (£7.99/month) offers unlimited uploads. ExplainPaper is £7.99/month for unlimited access. n8n is free for self-hosted deployments but charges for cloud hosting (£20/month and up).
For most researchers, total monthly cost sits between roughly £16 (self-hosted) and £36 (cloud); see the cost breakdown below. Run the workflow during off-peak hours if your orchestration platform charges by execution time.
Citation Extraction
The workflow above includes a basic citation template. For greater accuracy, integrate a dedicated citation extraction service like CrossRef API, which can look up DOIs and return formatted citations automatically.
Add this step after extracting the paper's DOI:
```
GET https://api.crossref.org/works/{DOI}
```
The response includes author, title, publication year, and journal. Format these into any citation standard you need.
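As a sketch, here is how the `message` object from a CrossRef `works` response could be turned into a Harvard-style string (the function name is ours; the fields used — `author`, `issued`, `title`, `container-title` — are standard CrossRef response fields):

```python
def harvard_from_crossref(message):
    """Format a CrossRef works-API `message` object as a Harvard citation."""
    authors = ' and '.join(
        f"{a['family']}, {a['given'][0]}." for a in message.get('author', [])
    )
    year = message['issued']['date-parts'][0][0]
    title = message['title'][0]       # CrossRef returns titles as lists
    journal = message['container-title'][0]
    return f"{authors} ({year}). {title}. {journal}."

citation = harvard_from_crossref({
    'author': [{'given': 'Ada', 'family': 'Lovelace'}],
    'issued': {'date-parts': [[1843]]},
    'title': ['Notes'],
    'container-title': ['Scientific Memoirs'],
})
```

Swap the f-string for whichever citation standard you need; the CrossRef fields stay the same.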
Testing with Small Batches
Before running against your entire research library, test the workflow with 3-5 papers of different types: journal articles, conference papers, preprints, and older PDFs scanned at low resolution. This reveals which papers your setup cannot handle and lets you adjust prompts or settings before scaling up.
Cost Breakdown
| Tool | Plan Needed | Monthly Cost | Notes |
|---|---|---|---|
| Chat-with-PDF (Copilotus) | Pro | £7.99 | Unlimited uploads and API calls |
| ExplainPaper | Pro | £7.99 | Required for API access; free tier is web-only |
| SMMRY | Free | £0.00 | 200 summaries per day; paid tier for higher volume |
| n8n (Cloud) | Starter | £20.00 | Self-hosted option is free; cloud includes 1000 workflow executions/month |
| n8n (Self-Hosted) | N/A | £0.00 | Free but requires server infrastructure |
| Google Drive / Docs / Sheets | Free | £0.00 | Included with Google Workspace or free tier |
| Total (Cloud Setup) | — | £35.98 | Per month |
| Total (Self-Hosted Setup) | — | £15.98 | Per month |
If you process 50 papers per month, the cost per paper is less than £0.72 using the cloud setup or £0.32 using self-hosted n8n. Compare this to the 5-7 hours a researcher would spend manually, and the automation pays for itself within the first week.
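The per-paper arithmetic is easy to verify:

```python
papers_per_month = 50

# Monthly totals from the cost breakdown table above
cloud_per_paper = round(35.98 / papers_per_month, 2)        # cloud setup
self_hosted_per_paper = round(15.98 / papers_per_month, 2)  # self-hosted n8n
```

Adjust `papers_per_month` to your own volume; the fixed subscriptions mean the per-paper cost falls the more you process.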