
Manufacturing quality inspection report generation from shop floor photos

28 March 2026

Introduction

Factory floors generate hundreds of quality inspection photos each day, yet supervisors still spend their afternoons manually writing up findings into compliance reports.

A supervisor takes a photo of a faulty component, notes some observations aloud, then sits at a desk typing out standardised forms that repeat the same categories every time. The whole process wastes skilled labour on documentation that should be generated automatically.

The gap exists because quality inspection data lives in photos and voice notes, whilst compliance systems expect structured text. Getting from one to the other requires several steps: image analysis to spot defects, transcription of verbal notes, data extraction into a consistent format, and finally, report generation. Most factories either skip automation entirely or build brittle custom solutions that break when someone changes the form template.

This workflow solves that problem by connecting computer vision, transcription, and document generation in a single automated chain. A supervisor photographs a faulty part, records a quick voice memo describing what went wrong, and a completed compliance report appears in their system within seconds. No typing required.

The Automated Workflow

Choosing an orchestration tool

For this workflow, n8n is the best choice. It runs on your own infrastructure (or cloud hosting), handles image and audio files as binary data, and can call the OpenAI and Anthropic APIs directly. The providers' own rate limits still apply (see Pro Tips). Zapier would work but charges per task, and Make's file handling is slower for large batches of inspection photos. Self-hosted n8n has no licence fee and scales to hundreds of daily inspections without per-execution charges, although you still pay for the server it runs on.

Data flow overview

The workflow runs in this sequence:

1. A supervisor uploads a photo and voice memo to a shared folder (Google Drive, Dropbox, or local storage)
2. n8n detects the new files and downloads them
3. GPT-4o analyses the photo and extracts defect details
4. Whisper API transcribes the voice memo
5. Claude Opus 4.6 combines both the image analysis and transcription into a structured inspection record
6. Deepnote generates a formatted compliance report using that structured data
7. The finished report is saved to the inspection archive and a Slack notification alerts the supervisor

Setting up n8n

Install n8n on a server or use their cloud hosting. Create a new workflow with these nodes in order.

Node 1: File trigger

Start with a Google Drive trigger to watch for new files in your quality inspection folder.

Trigger Type: Google Drive
Event: File created
Folder: /Quality Inspections
File types: Images (.jpg, .png), Audio (.m4a, .wav)
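The trigger fires once per file, so something has to decide which photo belongs to which voice memo. The article does not say how this is done; the sketch below assumes a hypothetical naming convention in which a photo and its memo share the same base name (for example `part-17.jpg` and `part-17.m4a`). The function names are illustrative only, and the logic could live in an n8n Code node or a small helper service.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extensions taken from the trigger configuration above.
IMAGE_EXTS = {".jpg", ".png"}
AUDIO_EXTS = {".m4a", ".wav"}


def classify(filename: str) -> Optional[str]:
    """Return "image", "audio", or None for unsupported file types."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in AUDIO_EXTS:
        return "audio"
    return None


def pair_uploads(filenames: list) -> dict:
    """Group files by base name and keep only complete photo + memo pairs.

    Assumption (not from the article): both files share the same base name.
    """
    groups: dict = {}
    for name in filenames:
        kind = classify(name)
        if kind is None:
            continue  # ignore anything that is not a photo or a memo
        stem = PurePosixPath(name).stem
        groups.setdefault(stem, {})[kind] = name
    # An inspection is ready only when both an image and an audio file exist.
    return {stem: files for stem, files in groups.items() if len(files) == 2}
```

Waiting for a complete pair before starting the rest of the chain avoids generating a report from a photo whose memo has not finished uploading.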

Node 2: Download files

Use the Google Drive download node to fetch each file locally for processing.

Input: File ID from trigger
Output: Binary file data

Node 3: Image analysis with GPT-4o

Send the photo to OpenAI's vision API. This extracts defect type, severity, location, and other structured observations.

POST https://api.openai.com/v1/chat/completions

{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Analyse this factory floor quality inspection photo. Extract: defect type, severity (1-5 scale), location on component, root cause hypothesis, and recommended action. Return as JSON only."
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/jpeg;base64,{{ $node.GoogleDrive.binary.data }}"
          }
        }
      ]
    }
  ],
  "temperature": 0.3,
  "max_tokens": 500
}
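For readers who want to test the request outside n8n, the same body can be assembled in Python. This is a minimal sketch that only builds the JSON body; it sends nothing. `build_vision_request` is a hypothetical helper name, and the MIME type default assumes JPEG photos.

```python
import base64

# Prompt text copied from the request above.
PROMPT = (
    "Analyse this factory floor quality inspection photo. Extract: defect type, "
    "severity (1-5 scale), location on component, root cause hypothesis, and "
    "recommended action. Return as JSON only."
)


def build_vision_request(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build the chat completions body with the photo inlined as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": PROMPT},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
                    },
                ],
            }
        ],
        "temperature": 0.3,
        "max_tokens": 500,
    }
```

For PNG uploads, pass `mime_type="image/png"` so the data URL matches the file contents.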

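A low temperature of 0.3 keeps the analysis more consistent between runs, though it does not guarantee identical output. The model's reply arrives as text in choices[0].message.content and, given the prompt above, should contain JSON like this: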

{
  "defect_type": "Surface scratch",
  "severity": 3,
  "location": "Top left corner of housing",
  "root_cause": "Abrasive contact during packaging",
  "recommended_action": "Adjust padding in packaging materials"
}
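Because the model returns this JSON as free text, it is worth parsing and checking it before later nodes rely on the fields. The sketch below assumes the five field names shown in the example response and a 1 to 5 integer severity; `parse_analysis` is a hypothetical helper, not part of n8n or the OpenAI SDK.

```python
import json

# Field names taken from the example response above.
REQUIRED_FIELDS = (
    "defect_type",
    "severity",
    "location",
    "root_cause",
    "recommended_action",
)


def parse_analysis(reply_text: str) -> dict:
    """Parse the model's reply and check it matches the expected shape.

    Raises ValueError if the text is not valid JSON, a field is missing,
    or severity is outside the 1-5 scale.
    """
    text = reply_text.strip()
    fence = "`" * 3
    # Models sometimes wrap JSON in a Markdown code fence; strip it if present.
    if text.startswith(fence):
        text = text.strip("`")
        if text.lower().startswith("json"):
            text = text[4:]
    data = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    severity = data["severity"]
    if isinstance(severity, bool) or not isinstance(severity, int) or not 1 <= severity <= 5:
        raise ValueError(f"severity must be an integer from 1 to 5, got {severity!r}")
    return data
```

A failed parse is a natural place to branch into the error handling path rather than passing partial data on to Claude.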

Node 4: Transcribe audio with Whisper API

Convert the voice memo to text by sending it to OpenAI's audio transcription endpoint with the whisper-1 model.

POST https://api.openai.com/v1/audio/transcriptions

Headers:
  Authorization: Bearer YOUR_OPENAI_API_KEY

Body (form-data):
  file: [audio file from Node 2]
  model: whisper-1
  language: en
  temperature: 0

The response includes the transcribed text:

{
  "text": "This is a surface defect on the housing. Looks like it happened during the moulding process. We should check the tooling for wear."
}

Node 5: Combine analysis with Claude Opus 4.6

Use Claude to merge the GPT-4o image analysis and Whisper transcription into a single structured inspection record. Claude is well suited to combining multiple data sources into one coherent format.

POST https://api.anthropic.com/v1/messages

{
  "model": "claude-opus-4.6",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "You have an image analysis result and a voice memo transcription from a factory quality inspection. Combine them into a single structured inspection record with these fields: summary, severity, root_cause, supervisor_notes, recommended_corrective_action. Use the voice memo to add context and the image analysis for technical details.\n\nImage Analysis:\n{{ $node.GPT4o.json.defect_type }}\n{{ $node.GPT4o.json.severity }}\n{{ $node.GPT4o.json.location }}\n\nVoice Memo Transcription:\n{{ $node.Whisper.json.text }}\n\nReturn as JSON only."
    }
  ]
}
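The request above omits its headers. The Anthropic Messages API expects `x-api-key`, `anthropic-version`, and `content-type`. The following sketch assembles both headers and body without sending anything. `build_claude_request` is a hypothetical helper. The model string is copied from the article and may need adjusting to the exact identifier listed in your Anthropic account. Unlike the n8n template, this version passes the whole image analysis as JSON so that the root cause and recommended action also reach Claude.

```python
import json

INSTRUCTIONS = (
    "You have an image analysis result and a voice memo transcription from a "
    "factory quality inspection. Combine them into a single structured inspection "
    "record with these fields: summary, severity, root_cause, supervisor_notes, "
    "recommended_corrective_action. Use the voice memo to add context and the "
    "image analysis for technical details."
)


def build_claude_request(
    analysis: dict,
    transcript: str,
    api_key: str,
    model: str = "claude-opus-4.6",  # as written in the article; verify the exact ID
):
    """Return (headers, body) for POST https://api.anthropic.com/v1/messages."""
    prompt = (
        f"{INSTRUCTIONS}\n\n"
        f"Image Analysis:\n{json.dumps(analysis, indent=2)}\n\n"
        f"Voice Memo Transcription:\n{transcript}\n\n"
        "Return as JSON only."
    )
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    body = {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, body
```

In n8n the same three headers would go into the HTTP Request node's header settings, with the key stored as a credential rather than pasted inline.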

Node 6: Generate report in Deepnote

Deepnote notebooks can accept HTTP requests via webhooks. Create a notebook with a report template that accepts JSON from n8n, populates a formatted compliance document, and saves it.

Webhook URL: https://your-deepnote-workspace.deepnote.com/webhooks/inspection-report

Payload:
{
  "inspection_id": "{{ $node.GoogleDrive.context.fileName }}",
  "timestamp": "{{ $now }}",
  "defect_summary": "{{ $node.Claude.json.summary }}",
  "severity": "{{ $node.Claude.json.severity }}",
  "location": "{{ $node.GPT4o.json.location }}",
  "root_cause": "{{ $node.Claude.json.root_cause }}",
  "supervisor_notes": "{{ $node.Claude.json.supervisor_notes }}",
  "corrective_action": "{{ $node.Claude.json.recommended_corrective_action }}"
}

In your Deepnote notebook, accept this payload and render it using Python:

import os

from jinja2 import Template

# Automatically available in Deepnote webhooks
inspection_data = webhook_payload

report_template = """
QUALITY INSPECTION REPORT
Date: {{ timestamp }}
Inspection ID: {{ inspection_id }}

DEFECT SUMMARY
{{ defect_summary }}

SEVERITY LEVEL: {{ severity }}/5

LOCATION
{{ location }}

ROOT CAUSE ANALYSIS
{{ root_cause }}

SUPERVISOR NOTES
{{ supervisor_notes }}

CORRECTIVE ACTION REQUIRED
{{ corrective_action }}

Report generated automatically via Deepnote and n8n.
"""

# Fill the template with the fields from the webhook payload
template = Template(report_template)
report_html = template.render(inspection_data)

# Save to file system or send to storage
os.makedirs("reports", exist_ok=True)
inspection_id = inspection_data["inspection_id"]
with open(f"reports/inspection_{inspection_id}.html", "w") as f:
    f.write(report_html)

Node 7: Notify supervisor

Send a Slack message with a link to the completed report.

POST https://hooks.slack.com/services/YOUR/WEBHOOK/URL

{
  "text": "Quality inspection report ready",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "*New Inspection Report Generated*\n\nDefect: {{ $node.Claude.json.summary }}\nSeverity: {{ $node.Claude.json.severity }}/5\n\n<https://your-deepnote-link/reports/{{ $node.GoogleDrive.context.fileName }}.html|View Report>"
      }
    }
  ]
}
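The summary text comes from a model, so it may contain characters that Slack treats as markup. As a rough sketch, the payload above could be built with the three characters Slack asks senders to escape (`&`, `<`, `>`) handled first. `build_slack_message` and `escape_mrkdwn` are hypothetical helper names, and the report URL is whatever link your archive exposes.

```python
def escape_mrkdwn(text: str) -> str:
    """Escape the three control characters Slack reserves in message text."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def build_slack_message(summary: str, severity: int, report_url: str) -> dict:
    """Build the incoming-webhook payload shown above."""
    text = (
        "*New Inspection Report Generated*\n\n"
        f"Defect: {escape_mrkdwn(summary)}\n"
        f"Severity: {severity}/5\n\n"
        f"<{report_url}|View Report>"
    )
    return {
        # Fallback text, used for notifications
        "text": "Quality inspection report ready",
        "blocks": [
            {"type": "section", "text": {"type": "mrkdwn", "text": text}},
        ],
    }
```

Only the free-text summary is escaped; the link keeps its angle brackets because that is how Slack marks up a labelled URL.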

Error handling

Add error catch nodes after GPT-4o and Claude steps. If image analysis fails (e.g. blurry photo), send an alert to the supervisor asking them to retake the photo rather than generating a report from incomplete data.

If GPT-4o response contains "unable to identify" or "image unclear":
  → Send Slack notification: "Photo quality issue. Please retake the inspection photo."
  → Mark inspection as failed and pause workflow
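The rule above amounts to a case-insensitive substring check. A minimal sketch, suitable for adapting into an n8n Code node or an external service, might look like this. The two marker phrases and the Slack text come from the rule above; the function names and the returned dictionary shape are assumptions made for illustration.

```python
# Phrases from the rule above that signal an unusable photo.
UNCLEAR_MARKERS = ("unable to identify", "image unclear")

RETAKE_MESSAGE = "Photo quality issue. Please retake the inspection photo."


def needs_retake(reply_text: str) -> bool:
    """True if the GPT-4o reply suggests the photo could not be analysed."""
    lowered = reply_text.lower()
    return any(marker in lowered for marker in UNCLEAR_MARKERS)


def route_analysis(reply_text: str) -> dict:
    """Decide whether to continue the workflow or ask for a new photo."""
    if needs_retake(reply_text):
        return {"status": "failed", "slack_text": RETAKE_MESSAGE}
    return {"status": "ok", "slack_text": None}
```

Phrase matching is brittle, since the model may word a refusal differently, so it pairs well with a structural check that the reply parses as the expected JSON.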

The Manual Alternative

If you prefer not to set up n8n, you can process inspections semi-manually using Claude Code.

After uploading a photo and recording a voice memo, open Claude with the Claude Code feature, paste your photo, and ask it to: analyse the defect, accept a text transcript of your voice notes, then generate a compliance report combining both. This takes 2-3 minutes per inspection instead of 30 seconds automatically, but requires no server setup. Use this approach if you're running fewer than 5 inspections per day or want to maintain tighter control over each report before it's filed.

Pro Tips

Rate limit strategy for high-volume inspections

If your factory processes 50+ inspections daily, GPT-4o's rate limits can become a bottleneck. Configure n8n to batch requests (for example with the Loop Over Items node), sending 5 photos to GPT-4o at a time rather than one at a time. Processing each small batch in parallel keeps request bursts bounded and cuts total runtime by about 60 percent.

Cost control: use GPT-4o mini for images

GPT-4o costs 0.015 USD per image. GPT-4o mini costs 0.00075 USD per image and performs nearly identically for defect detection. Switch to GPT-4o mini for all image analysis unless you're dealing with complex defects requiring expert-level reasoning. The savings add up quickly at scale.

Voice memo pre-processing

If supervisors record voice memos in noisy factory environments, add an audio enhancement step. Use Deepnote to apply noise reduction before sending to Whisper. This improves transcription accuracy by 15-20 percent and costs nothing extra.

Validate Claude's output format

Before feeding Claude's structured JSON into Deepnote, add a validation step in n8n (for example an IF or Code node) to ensure all required fields are present. If a field is missing or malformed, trigger a second Claude pass to fix it rather than letting broken reports reach the archive.

Version compliance templates

Store your compliance report template as a version-controlled document in Git. Update Deepnote's template logic from Git on a schedule, ensuring all supervisors use the latest regulatory template without manual coordination.
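To make the batching tip concrete, here is a rough Python equivalent of "five photos at a time". It is a sketch under assumptions: `worker` stands in for whatever function sends one photo to GPT-4o, and `run_in_batches` is a hypothetical helper rather than anything n8n provides.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_in_batches(
    items: Iterable[T],
    worker: Callable[[T], R],
    batch_size: int = 5,
) -> List[R]:
    """Process items in groups of batch_size, each group in parallel.

    Results are returned in the same order as the input items.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pending = list(items)
    results: List[R] = []
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        # One thread per item in the batch; the next batch starts only
        # after every request in this one has finished.
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            results.extend(pool.map(worker, batch))
    return results
```

Because each batch waits for the previous one to finish, at most `batch_size` requests are in flight at once, which keeps bursts predictable when working near a provider's rate limit.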

Cost Breakdown

Tool | Plan Needed | Monthly Cost | Notes
n8n | Self-hosted or Pro Cloud | £0–£25 | Self-hosted is free; cloud Pro is £25/month for 200K executions
GPT-4o | OpenAI API pay-as-you-go | £0.75–£3 | 0.015 USD per image; 50 inspections per day costs ~£22/month
GPT-4o mini | OpenAI API pay-as-you-go | £0.04–£0.15 | 0.00075 USD per image; recommended for high volume
Whisper API | OpenAI API pay-as-you-go | £0.50–£2 | 0.02 USD per minute of audio; typical memo is 30 seconds
Claude Opus 4.6 | Anthropic API pay-as-you-go | £0.50–£1.50 | 0.015 USD per 1K input tokens; used once per inspection for synthesis
Deepnote | Free or Pro | £0–£12 | Free tier includes webhook access and basic notebook execution
Slack | Free or Pro | £0–£8 | Free plan sufficient for notifications; Pro for advanced features
Total monthly | | £2–£52 | Scales linearly with inspection volume; 50 inspections/day = ~£30/month
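The per-image figures in the table can be turned into a monthly estimate with simple arithmetic. The sketch below uses the article's quoted rates and assumes a 30-day month with one photo per inspection. Results are in USD, with no currency conversion applied, so they will not match the pound figures exactly.

```python
def monthly_image_cost_usd(
    inspections_per_day: int,
    usd_per_image: float,
    days: int = 30,
) -> float:
    """Estimate monthly image-analysis spend, assuming one photo per inspection."""
    return inspections_per_day * days * usd_per_image


# Rates as quoted in the table above.
GPT_4O_PER_IMAGE = 0.015
GPT_4O_MINI_PER_IMAGE = 0.00075

gpt_4o_monthly = monthly_image_cost_usd(50, GPT_4O_PER_IMAGE)
mini_monthly = monthly_image_cost_usd(50, GPT_4O_MINI_PER_IMAGE)
print(f"GPT-4o:      {gpt_4o_monthly:.2f} USD/month")
print(f"GPT-4o mini: {mini_monthly:.3f} USD/month")
```

At 50 inspections per day this works out to 50 × 30 × 0.015 = 22.50 USD for GPT-4o, against 1.125 USD for GPT-4o mini at the quoted rate.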