Legal due diligence report generation from contract documents

In-house counsel teams spend weeks buried in contract stacks, extracting compliance risks, liability clauses, and payment terms into sprawling spreadsheets and Word documents. Each contract requires manual reading, annotation, and synthesis into an executive summary that stakeholders can actually understand. The work is necessary but crushes productivity, and mistakes slip through when attention wanes at document fifty.

This workflow automates the due diligence pipeline. You upload contracts (PDFs, images, or scanned documents) to a monitored cloud storage folder, and the system extracts key risks, terms, and obligations, then compiles them into structured reports. No copy-pasting between tools. No rereading the same clause in three different documents. The cycle runs unattended, leaving your team to focus on the judgment calls that require human expertise.

The workflow combines Chat With PDF for intelligent document interaction, Okara AI for secure drafting of sensitive summaries, and Smmry for condensing lengthy provisions into readable insights. n8n orchestrates the whole sequence, so you don't need to log into multiple tools.

The Automated Workflow

We'll use n8n as the orchestration layer because it offers the best balance of flexibility and native integrations for this particular stack. Zapier would work, but n8n gives you finer control over API payloads and error handling, which matters when dealing with sensitive contract data.

Step 1: Trigger and Document Upload

Start with a trigger in n8n that fires whenever a file is added to a designated folder in Google Drive or Dropbox (a storage trigger node, or a webhook that the storage provider calls). When a new contract lands there, the workflow starts automatically.

Webhook Event: file.created
Monitored folder: /Legal/Contracts/Incoming
File types: PDF, DOCX, JPEG, PNG

Configure the trigger to capture the file ID and name, then pass them to Chat With PDF via its API endpoint.
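The filtering this step implies can be sketched in a few lines of Python, for example inside an n8n Code node. This is a minimal sketch, not the provider's actual schema: the event keys "event", "path", and "file_id" are hypothetical, and ".jpg" is added alongside ".jpeg" as an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional

# Settings taken from the trigger configuration above.
MONITORED_FOLDER = "/Legal/Contracts/Incoming"
# ".jpg" is an assumed addition to the listed JPEG type.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".jpeg", ".jpg", ".png"}


def select_contract(event: dict) -> Optional[dict]:
    """Return the fields later steps need, or None if the event is ignored.

    The keys "event", "path", and "file_id" are hypothetical; map them to
    whatever your storage provider actually sends.
    """
    if event.get("event") != "file.created":
        return None
    path = PurePosixPath(event.get("path", ""))
    # Only accept files placed directly in the monitored folder.
    if str(path.parent) != MONITORED_FOLDER:
        return None
    if path.suffix.lower() not in ALLOWED_EXTENSIONS:
        return None
    return {"file_id": event["file_id"], "file_name": path.name}
```

Returning None for anything unexpected keeps stray uploads (notes, spreadsheets, files in subfolders) from consuming API calls further down the pipeline.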

Step 2: Extract Contract Data with Chat With PDF

Chat With PDF's API allows you to upload a document and ask structured questions about its contents. Rather than asking open-ended questions, you'll send specific prompts designed to extract compliance-critical information.

POST /api/v1/chat/document
{
  "document_id": "{{$node.trigger.data.file_id}}",
  "questions": [
    "List all liability clauses and caps mentioned in this contract",
    "What are the termination conditions and notice periods?",
    "Identify all indemnification obligations",
    "What are the payment terms and conditions?",
    "Are there any non-compete or confidentiality restrictions?",
    "What insurance requirements are specified?"
  ],
  "model": "gpt-4o"
}

Store the responses in n8n's internal storage or a temporary database field. You'll have structured text answers to each prompt, which become the raw material for synthesis.
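One way to keep the questions and their answers aligned is to build the request body from a single list and pair the answers back to it. The sketch below assumes the API returns one answer per question in the same order, which the source does not confirm; the function names are illustrative.

```python
import json

# The six extraction prompts from the request above.
QUESTIONS = [
    "List all liability clauses and caps mentioned in this contract",
    "What are the termination conditions and notice periods?",
    "Identify all indemnification obligations",
    "What are the payment terms and conditions?",
    "Are there any non-compete or confidentiality restrictions?",
    "What insurance requirements are specified?",
]


def build_extraction_request(document_id: str, model: str = "gpt-4o") -> str:
    """Serialise the request body; json.dumps handles quoting and escaping."""
    return json.dumps(
        {"document_id": document_id, "questions": QUESTIONS, "model": model}
    )


def pair_answers(answers: list[str]) -> dict[str, str]:
    """Map each question to its answer.

    Assumes answers arrive in question order (an assumption about the API).
    """
    if len(answers) != len(QUESTIONS):
        raise ValueError(
            f"expected {len(QUESTIONS)} answers, got {len(answers)}"
        )
    return dict(zip(QUESTIONS, answers))
```

Failing loudly on a count mismatch is deliberate: a silently missing answer would otherwise surface later as a gap in the risk summary.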

Step 3: Synthesise and Prioritise with Okara AI

Okara AI is designed for sensitive work with encryption and multi-model chat support. This step takes the extracted points and transforms them into a coherent narrative, prioritised by risk level. Okara's encryption is important here because contract summaries often contain confidential commercial terms. Use the API to send the extracted data as context, then ask Okara to draft a structured summary with a specific template:

POST /api/v1/chat/encrypted
{
  "conversation_id": "{{$node.ChatWithPDF.data.document_id}}",
  "model": "claude-opus-4.6",
  "message": "Based on the following extracted contract provisions, draft an executive summary using this structure:\n\n[HIGH RISK ITEMS]\n- List liability caps below $5M, broad indemnification, unusual termination clauses\n\n[MEDIUM RISK ITEMS]\n- List payment contingencies, insurance gaps, notice period mismatches\n\n[COMPLIANCE NOTES]\n- Any regulatory or data protection implications\n\n[KEY DATES]\n- List material dates for renewal, payment, or termination\n\nExtracted data:\n{{$node.ChatWithPDF.data.responses}}\n\nFormat as markdown with clear sections and bullet points.",
  "encrypt": true
}

Okara will return a clean, structured summary. Store this output in your workflow's context.
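Later steps refer to separate high-risk, medium-risk, compliance, and key-date fields, so the markdown summary has to be split by section somewhere. A minimal sketch of that split follows, assuming the model reproduces the four template headings verbatim (either in brackets or as markdown headings); real output may vary and would need a fallback.

```python
import re

# Template heading -> field name used in the report table.
SECTION_KEYS = {
    "HIGH RISK ITEMS": "high_risk",
    "MEDIUM RISK ITEMS": "medium_risk",
    "COMPLIANCE NOTES": "compliance_notes",
    "KEY DATES": "key_dates",
}

# Matches a heading line such as "[HIGH RISK ITEMS]" or "## HIGH RISK ITEMS".
HEADING = re.compile(r"^\s*(?:#+\s*)?\[?([A-Z ]+?)\]?\s*$")


def split_summary(markdown: str) -> dict[str, list[str]]:
    """Group bullet items under the template section they appear in."""
    sections: dict[str, list[str]] = {key: [] for key in SECTION_KEYS.values()}
    current = None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match and match.group(1).strip() in SECTION_KEYS:
            current = SECTION_KEYS[match.group(1).strip()]
            continue
        # Drop a leading "- " or "* " bullet marker, keep the rest as-is.
        item = re.sub(r"^[-*]\s+", "", line.strip())
        if current and item:
            sections[current].append(item)
    return sections
```

With this shape, the high-risk count used in the notification step is simply the length of the "high_risk" list.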

Step 4: Condense Key Sections with Smmry

For particularly long provisions (warranty clauses, payment terms, indemnification sections), Smmry can reduce them to their essential meaning. This is optional but useful if contracts are over 15 pages.

POST /api/v1/summarise
{
  "text": "{{$node.OkaraAI.data.compliance_notes}}",
  "num_sentences": 5,
  "return_format": "bullet_points"
}

This returns a five-sentence, bullet-point summary of the text you send. Run it once for each long section you want condensed, then append the results to the Okara output.

Step 5: Compile and Store the Final Report

Use n8n's Google Sheets or database node to write the final report to a structured table. Each row represents one contract, with columns for document name, high-risk items, medium-risk items, key dates, and compliance notes.

INSERT INTO contract_reports (
  contract_name,
  upload_date,
  high_risk_items,
  medium_risk_items,
  compliance_notes,
  key_dates,
  full_summary,
  status
) VALUES (
  '{{$node.trigger.data.file_name}}',
  NOW(),
  '{{$node.OkaraAI.data.high_risk}}',
  '{{$node.OkaraAI.data.medium_risk}}',
  '{{$node.Smmry.data.summary}}',
  '{{$node.OkaraAI.data.key_dates}}',
  '{{$node.OkaraAI.data.full_summary}}',
  'READY_FOR_REVIEW'
)
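Contract names and model-written summaries routinely contain apostrophes, which break a statement built by pasting values between quotes and open the door to SQL injection. If you write to the database from code, bound parameters avoid both problems. The sketch below uses Python's built-in sqlite3 as a stand-in for whatever database you run; the table definition is an assumption based on the column list above.

```python
import sqlite3
from datetime import datetime, timezone

TEXT_COLUMNS = (
    "contract_name", "high_risk_items", "medium_risk_items",
    "compliance_notes", "key_dates", "full_summary",
)


def save_report(conn: sqlite3.Connection, report: dict) -> int:
    """Insert one contract report using named placeholders; return its row id."""
    # Assumed schema, created on first use so the sketch is self-contained.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS contract_reports (
               id INTEGER PRIMARY KEY,
               contract_name TEXT NOT NULL,
               upload_date TEXT NOT NULL,
               high_risk_items TEXT,
               medium_risk_items TEXT,
               compliance_notes TEXT,
               key_dates TEXT,
               full_summary TEXT,
               status TEXT NOT NULL)"""
    )
    params = {name: report.get(name, "") for name in TEXT_COLUMNS}
    params["upload_date"] = datetime.now(timezone.utc).isoformat()
    params["status"] = "READY_FOR_REVIEW"
    # Values are bound by the driver, never spliced into the SQL text.
    cursor = conn.execute(
        """INSERT INTO contract_reports (
               contract_name, upload_date, high_risk_items, medium_risk_items,
               compliance_notes, key_dates, full_summary, status)
           VALUES (
               :contract_name, :upload_date, :high_risk_items,
               :medium_risk_items, :compliance_notes, :key_dates,
               :full_summary, :status)""",
        params,
    )
    conn.commit()
    return cursor.lastrowid
```

A file called O'Brien MSA.pdf, for instance, is stored unchanged here, whereas the quoted-template version would produce a syntax error.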

Step 6: Notification and Review Handoff

Send a Slack or email notification to your legal team with a direct link to the report in Google Sheets or your database viewer. Include the risk level summary so team members know whether to prioritise this contract.

POST /slack/webhook/legal-contracts
{
  "channel": "#contract-reviews",
  "text": "New contract report ready: {{$node.trigger.data.file_name}}",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "*{{$node.trigger.data.file_name}}*\n\nHigh-risk items: {{$node.OkaraAI.data.risk_count}}\n\n<{{$node.database.report_url}}|View full report>"
      }
    }
  ]
}
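If you assemble the notification in code rather than a template, two details are worth handling: JSON escaping of quotes in file names, and Slack's rule that the characters &, <, and > in message text be escaped as HTML entities. A minimal sketch follows; the function names are illustrative and the payload mirrors the one above.

```python
import json


def escape_mrkdwn(text: str) -> str:
    """Escape the three characters Slack treats as control characters."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_notification(file_name: str, risk_count: int, report_url: str) -> str:
    """Return the JSON body for the Slack webhook call."""
    safe_name = escape_mrkdwn(file_name)
    payload = {
        "channel": "#contract-reviews",
        "text": f"New contract report ready: {safe_name}",
        "blocks": [
            {
                "type": "section",
                "text": {
                    "type": "mrkdwn",
                    "text": (
                        f"*{safe_name}*\n\n"
                        f"High-risk items: {risk_count}\n\n"
                        f"<{report_url}|View full report>"
                    ),
                },
            }
        ],
    }
    # json.dumps escapes quotes and newlines so the body is always valid JSON.
    return json.dumps(payload)
```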

The entire sequence runs in under two minutes per contract. Your team receives a structured, machine-drafted summary with clear risk flags and no need to re-read the source document unless they want to dive into specific clauses.

The Manual Alternative

If you prefer human review at each stage, use Chat With PDF's web interface directly. Upload the PDF, ask it the six questions manually, copy the responses into a Google Doc, then pass them to Claude or ChatGPT for synthesis. This takes 15 to 20 minutes per contract and introduces copy-paste errors, but gives you full control over the prompts and questioning logic at each step. For smaller teams handling fewer than five contracts per week, this manual approach may be cheaper than setting up n8n orchestration. For more on this, see Legal contract review and client summary document generation.

Pro Tips

Error handling and document failures.

Chat With PDF can struggle with scanned documents that are heavily compressed or poorly OCR'd.

Add a pre-check step in n8n that validates the PDF's text layer before sending it to the API. If text extraction fails, flag the document for manual review rather than letting the workflow silently drop it.
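One possible form of that pre-check is a simple heuristic over the text extracted from each page (for example with a PDF library such as pypdf, run before this function). The thresholds below are assumptions to tune against your own documents, not recommended values.

```python
def has_usable_text_layer(
    page_texts: list[str],
    min_chars_per_page: int = 200,   # assumed threshold
    min_good_ratio: float = 0.8,     # assumed threshold
) -> bool:
    """Return True when most pages contain a plausible amount of text.

    page_texts holds the extracted text of each page, in order. Pages that
    are blank or yield only a few characters usually indicate an image-only
    scan or a failed OCR pass.
    """
    if not page_texts:
        return False
    good_pages = sum(
        # Count non-whitespace characters only.
        1 for text in page_texts
        if len("".join(text.split())) >= min_chars_per_page
    )
    return good_pages / len(page_texts) >= min_good_ratio
```

Documents that fail the check can be routed to a manual-review branch with the file name attached, so nothing disappears without a record.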

Rate limits and batch processing.

Chat With PDF has rate limits around 100 requests per minute. If you receive a large batch of contracts (say, 50 at once from a merger), stagger them through n8n using a delay node between each document. Okara and Smmry have generous limits, so they're rarely the bottleneck.
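The length of the delay follows from simple arithmetic. The sketch below assumes each document costs a fixed number of API requests and leaves headroom under the limit; the 0.8 safety factor and the six-requests-per-document figure used in the example are assumptions.

```python
def delay_between_documents(
    requests_per_document: int,
    limit_per_minute: int = 100,
    safety_factor: float = 0.8,  # assumed headroom below the stated limit
) -> float:
    """Seconds to wait between documents to stay under the rate limit."""
    budget_per_minute = limit_per_minute * safety_factor
    return requests_per_document * 60 / budget_per_minute


def start_offsets(document_count: int, delay_seconds: float) -> list[float]:
    """Start time of each document, in seconds from the start of the batch."""
    return [index * delay_seconds for index in range(document_count)]
```

If each contract triggers six requests (one per question), the delay works out to roughly 4.5 seconds, so a 50-contract batch is fully submitted in a little under four minutes.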

Model selection for cost and accuracy.

Use GPT-4o in Chat With PDF for standard contracts, but switch to o3 for particularly complex or atypical agreements (novel clauses, international law, sector-specific terms). o3 costs more but catches edge cases GPT-4o misses. Store a workflow variable that lets you toggle between models without rebuilding the entire automation.

Storing encrypted summaries.

Okara's built-in encryption covers the drafting step, but the stored report needs protection too. Google encrypts Sheets data at rest by default, yet anyone with access to the sheet can still read it, so lock sharing down tightly (or use Workspace client-side encryption if your edition includes it), or store reports in a dedicated legal document management system like NetDocuments or Relativity. Never dump contract summaries into unencrypted shared drives.

Feedback loop for model improvement.

After your legal team reviews the auto-generated summaries, capture their corrections and flagged items in a separate feedback sheet. Monthly, use this data to refine your Chat With PDF prompts. You'll discover which questions were ambiguous, which clauses the models consistently miss, and where you need to add new extraction points.
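The monthly review can start from a simple tally of which questions attract the most corrections. This sketch assumes each feedback row records the question text and a "corrected" flag; both column names are hypothetical.

```python
from collections import Counter


def rank_problem_questions(
    feedback_rows: list[dict], top_n: int = 3
) -> list[tuple[str, int]]:
    """Return the questions whose answers reviewers corrected most often.

    Each row is expected to have "question" and "corrected" keys
    (hypothetical column names from the feedback sheet).
    """
    counts = Counter(
        row["question"] for row in feedback_rows if row.get("corrected")
    )
    return counts.most_common(top_n)
```

The questions at the top of the list are the first candidates for rewording or for splitting into narrower prompts.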

Cost Breakdown

Tool	Plan Needed	Monthly Cost	Notes
Chat With PDF by Copilot.us	API access	£40–100	Varies by monthly API calls; ~£0.02–0.05 per document depending on length
Okara AI	Professional (encrypted multi-model)	£80–150	Includes Claude Opus 4.6 and Sonnet 4.6; encryption standard across all plans
Smmry	API tier	£20–40	Low-cost tier sufficient for bulk summarisation tasks
n8n	Self-hosted or cloud tier	£0–150	Self-hosted is free; cloud Professional starts at £20/month for small workloads
Google Sheets or Airtable (report storage)	Free or Premium	£0–20	Free tier usually sufficient; Premium for advanced sharing and collaboration controls
Slack (notifications)	Standard or Pro	£5–12	Notification posting included in all paid tiers
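To see what the table adds up to, the ranges can be summed and divided by monthly volume. The figures below are copied from the table; the function names and the 100-contracts-per-month example are illustrative assumptions.

```python
# (low, high) monthly cost in GBP, copied from the table above.
MONTHLY_COST_GBP = {
    "Chat With PDF": (40, 100),
    "Okara AI": (80, 150),
    "Smmry": (20, 40),
    "n8n": (0, 150),
    "Report storage": (0, 20),
    "Slack": (5, 12),
}


def monthly_total(costs: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the low and high ends of every range."""
    low = sum(low_end for low_end, _ in costs.values())
    high = sum(high_end for _, high_end in costs.values())
    return low, high


def cost_per_contract(
    total: tuple[int, int], contracts_per_month: int
) -> tuple[float, float]:
    """Spread the monthly total across the contracts processed that month."""
    low, high = total
    return low / contracts_per_month, high / contracts_per_month
```

The stack totals £145 to £472 per month, which at 100 contracts a month comes to between £1.45 and £4.72 per contract.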