Bill of materials extraction from product images

24 March 2026

Introduction

Extracting structured data from product images is a common pain point in e-commerce, inventory management, and procurement workflows. You've got hundreds or thousands of product photos, and you need to pull out specifications, materials, dimensions, and component lists. Doing this manually is tedious and error-prone; automating it means you can process batches overnight and have clean data ready in your database by morning.

The challenge is that no single AI tool does this job perfectly. You need an image recognition system that understands products, a specifications parser that extracts technical details, and something to translate messy extracted text into structured formats. That's where workflow automation comes in. By combining Accio AI for initial product analysis, ParSpec AI for detailed specification extraction, and PDnob Image Translator AI for format conversion, you can build an end-to-end pipeline that takes raw images and outputs a clean bill of materials. This guide walks you through building that pipeline with zero manual intervention between steps.

The Automated Workflow

Understanding the Data Flow

Before we write any code, let's map out what happens at each stage. You upload or trigger a batch of product images. Accio AI analyses the image and identifies the product type and general components. ParSpec AI then takes that analysis and extracts specific technical specifications and material information. Finally, PDnob translates the extracted text into a structured format that your system can consume. Throughout, we use an orchestration tool to pass data between each step without anyone touching a keyboard.

The workflow looks like this: Image Input → Accio AI Analysis → ParSpec AI Extraction → PDnob Translation → Structured Output (JSON, CSV, or database insertion).
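The same flow can be sketched in plain Python as a chain of stages that each read the accumulated state and add their own output. This is only an illustration of the data flow: `run_pipeline` and the three stand-in stage functions are hypothetical names, and the stages return canned data instead of calling the real APIs.

```python
from typing import Any, Callable, Dict, List

Step = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(job: Dict[str, Any], steps: List[Step]) -> Dict[str, Any]:
    """Pass one job through each stage in order, merging each stage's output."""
    state = dict(job)
    for step in steps:
        state.update(step(state))
    return state


# Stand-in stages: placeholders for the Accio, ParSpec and PDnob calls.
def analyse(state: Dict[str, Any]) -> Dict[str, Any]:
    return {"detected_components": ["circuit board"]}


def extract(state: Dict[str, Any]) -> Dict[str, Any]:
    return {"specifications": [{"component": c} for c in state["detected_components"]]}


def translate(state: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "bill_of_materials": [
            {"description": spec["component"], "quantity": 1}
            for spec in state["specifications"]
        ]
    }


result = run_pipeline(
    {"image_url": "https://example.com/product-001.jpg", "product_id": "SKU-12345"},
    [analyse, extract, translate],
)
print(result["bill_of_materials"])
```

Because every stage only sees the shared state, you can swap a stand-in for a real API call without touching the others.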

Choosing Your Orchestration Tool

For this particular workflow, here's how the main options stack up:

Zapier is the easiest to set up if you're not technical. It has visual workflows, and you can often avoid writing code. The trade-off is cost; it charges per task, so processing 1,000 images gets expensive quickly.

n8n gives you more control and can be self-hosted, so you pay a flat infrastructure cost rather than a fee per execution. It's better for high-volume workflows. The learning curve is steeper, but the visual editor is intuitive once you understand the basics.

Make (formerly Integromat) sits in the middle. It's cloud-based like Zapier but offers more customisation through scenarios. It's cheaper than Zapier at scale but pricier than self-hosted n8n.

Claude Code is worth mentioning if you're comfortable with Python or JavaScript. You can write a custom script that calls each API sequentially, giving you maximum flexibility. However, you'll need to handle infrastructure yourself (Lambda functions, containers, or a dedicated server).

For this guide, we'll show examples using n8n, as it offers the best balance of flexibility and cost for high-volume image processing. The API calls themselves, however, are tool-agnostic, so you can adapt the approach to Zapier or Make.

Step 1: Setting Up the Image Trigger

Your workflow begins with images arriving somewhere. This could be an S3 bucket, an email inbox, a Slack channel, or a webhook. For this example, we'll assume you're uploading images to a folder that's monitored by your orchestration tool.

In n8n, you'd use the "Webhook" node to receive these requests, or an S3 trigger to watch for new files. Here's a sample webhook trigger:

{
  "webhook_url": "https://your-n8n-instance.com/webhook/product-images",
  "method": "POST",
  "body": {
    "image_url": "https://example.com/product-001.jpg",
    "product_id": "SKU-12345",
    "batch_id": "batch-2024-001"
  }
}

The key fields are image_url (which Accio will need), product_id (to track the data back), and batch_id (for grouping related items). Your client systems POST to this webhook whenever a new product image is ready for processing.
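Since every later step depends on these three fields, it can help to reject malformed payloads before any paid API call is made. The following is a minimal sketch; `validate_payload` is a hypothetical helper, and the http(s)-only rule is an assumption rather than a documented requirement of any of the APIs.

```python
from urllib.parse import urlparse

REQUIRED_FIELDS = ("image_url", "product_id", "batch_id")


def validate_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload is usable."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not payload.get(name)
    ]
    url = payload.get("image_url")
    # Assumption: downstream APIs fetch the image over HTTP(S).
    if isinstance(url, str) and url and urlparse(url).scheme not in ("http", "https"):
        problems.append("image_url must be an http(s) URL")
    return problems


payload = {
    "image_url": "https://example.com/product-001.jpg",
    "product_id": "SKU-12345",
    "batch_id": "batch-2024-001",
}
print(validate_payload(payload))
```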

Step 2: Accio AI Image Analysis

Accio AI performs the initial computer vision analysis. It identifies what's in the product image and returns component information. Here's the API call:

curl -X POST https://api.accio-ai.com/v1/analyse-product \
  -H "Authorization: Bearer YOUR_ACCIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/product-001.jpg",
    "product_type": "electronics",
    "return_components": true
  }'

The response looks like this:

{
  "product_id": "SKU-12345",
  "product_type": "electronics",
  "confidence": 0.94,
  "detected_components": [
    "circuit board",
    "metal casing",
    "plastic connector",
    "copper wire"
  ],
  "material_hints": [
    "aluminum",
    "silicon",
    "polyurethane"
  ],
  "analysis_id": "accio-2024-001"
}

In n8n, you'd add an HTTP Request node configured like this:

{
  "url": "https://api.accio-ai.com/v1/analyse-product",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer {{ $env.ACCIO_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "image_url": "{{ $node['Webhook'].json.image_url }}",
    "product_type": "electronics",
    "return_components": true
  }
}

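The {{ }} syntax is n8n's expression syntax; here it inserts the image URL from the webhook trigger. Depending on your n8n version, the Webhook node may nest the POSTed payload under body, in which case the path becomes $node['Webhook'].json.body.image_url. You don't need to store the response explicitly: each node's output stays available to later nodes by name.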

Step 3: ParSpec AI Specification Extraction

ParSpec AI takes the component list from Accio and extracts detailed technical specifications. It's particularly good at pulling out material grades, dimensions, and performance metrics. The API call:

curl -X POST https://api.parspec-ai.com/v1/extract-specifications \
  -H "Authorization: Bearer YOUR_PARSPEC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/product-001.jpg",
    "detected_components": ["circuit board", "metal casing", "plastic connector", "copper wire"],
    "extract_materials": true,
    "extract_dimensions": true,
    "extract_electrical_properties": true
  }'

The response contains structured specification data:

{
  "specifications": [
    {
      "component": "circuit board",
      "material": "FR-4 fiberglass epoxy",
      "thickness_mm": 1.6,
      "layer_count": 4,
      "copper_weight_oz": 1
    },
    {
      "component": "metal casing",
      "material": "aluminium 6061-T6",
      "thickness_mm": 2.0,
      "finish": "anodized",
      "colour": "black"
    },
    {
      "component": "plastic connector",
      "material": "polycarbonate ABS",
      "tensile_strength_mpa": 55,
      "operating_temperature_celsius": "−20 to 80"
    },
    {
      "component": "copper wire",
      "material": "99.9% pure copper",
      "gauge_awg": 24,
      "insulation": "PVC"
    }
  ],
  "extraction_confidence": 0.87,
  "parspec_id": "parspec-2024-001"
}

In n8n, add another HTTP Request node:

{
  "url": "https://api.parspec-ai.com/v1/extract-specifications",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer {{ $env.PARSPEC_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "image_url": "{{ $node['Webhook'].json.image_url }}",
    "detected_components": "{{ $node['Accio AI'].json.detected_components }}",
    "extract_materials": true,
    "extract_dimensions": true,
    "extract_electrical_properties": true
  }
}

Chain the output of the Accio node into ParSpec's input. This ensures you're passing real, analysed data rather than guessing at specifications.
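In a custom script, the same chaining amounts to building the ParSpec request body from the Accio response. The sketch below assumes the request and response shapes shown above; `build_parspec_request` is a hypothetical helper, and failing fast on an empty component list is a design choice, not documented API behaviour.

```python
def build_parspec_request(image_url: str, accio_response: dict) -> dict:
    """Build the ParSpec request body from an Accio analysis response."""
    components = accio_response.get("detected_components") or []
    if not components:
        # Nothing to extract: stop here rather than pay for an empty call.
        raise ValueError("Accio returned no components")
    return {
        "image_url": image_url,
        "detected_components": list(components),
        "extract_materials": True,
        "extract_dimensions": True,
        "extract_electrical_properties": True,
    }


accio_response = {
    "product_type": "electronics",
    "confidence": 0.94,
    "detected_components": ["circuit board", "metal casing"],
}
request_body = build_parspec_request(
    "https://example.com/product-001.jpg", accio_response
)
print(request_body["detected_components"])
```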

Step 4: PDnob Image Translator AI Format Conversion

PDnob takes the messy extracted specifications and translates them into a clean, normalised format. This is crucial because different images may have different metadata structures or text formatting issues. PDnob standardises everything:

curl -X POST https://api.pdnob-image-translator-ai.com/v1/translate-to-bom \
  -H "Authorization: Bearer YOUR_PDNOB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "specifications": [
      {
        "component": "circuit board",
        "material": "FR-4 fiberglass epoxy",
        "thickness_mm": 1.6
      }
    ],
    "output_format": "json",
    "normalize_units": true,
    "validate_material_codes": true
  }'

The response is a properly structured bill of materials:

{
  "bill_of_materials": [
    {
      "part_number": "PCB-FR4-1.6",
      "description": "FR-4 fiberglass epoxy circuit board, 1.6mm thickness",
      "material_code": "FR4",
      "material_name": "Fiberglass-reinforced epoxy resin",
      "quantity": 1,
      "unit": "piece",
      "specifications": {
        "thickness_mm": 1.6,
        "layer_count": 4,
        "copper_weight_oz": 1
      }
    },
    {
      "part_number": "CASE-AL6061",
      "description": "Aluminium 6061-T6 casing, anodized black, 2.0mm thickness",
      "material_code": "AL6061",
      "material_name": "Aluminium 6061-T6",
      "quantity": 1,
      "unit": "piece",
      "specifications": {
        "thickness_mm": 2.0,
        "finish": "anodized",
        "colour": "black"
      }
    }
  ],
  "product_id": "SKU-12345",
  "bom_id": "BOM-2024-001",
  "created_at": "2024-01-15T10:30:00Z"
}

In n8n:

{
  "url": "https://api.pdnob-image-translator-ai.com/v1/translate-to-bom",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer {{ $env.PDNOB_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "specifications": "{{ $node['ParSpec AI'].json.specifications }}",
    "output_format": "json",
    "normalize_units": true,
    "validate_material_codes": true
  }
}

Step 5: Storing Results

The final step writes the bill of materials to your database or file storage. If you're using n8n, add a node that posts to your API:

{
  "url": "https://your-api.example.com/api/bills-of-materials",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  "body": {
    "product_id": "{{ $node['Webhook'].json.product_id }}",
    "batch_id": "{{ $node['Webhook'].json.batch_id }}",
    "bom": "{{ $node['PDnob'].json.bill_of_materials }}",
    "accio_analysis_id": "{{ $node['Accio AI'].json.analysis_id }}",
    "parspec_analysis_id": "{{ $node['ParSpec AI'].json.parspec_id }}",
    "pdnob_bom_id": "{{ $node['PDnob'].json.bom_id }}"
  }
}

Alternatively, if you prefer CSV output, use PDnob's CSV export option:

{
  "output_format": "csv",
  "csv_columns": ["part_number", "description", "material_code", "quantity", "unit"]
}

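Then save the CSV to S3 or your file storage. The s3:// URL below is shorthand for the destination; n8n's HTTP Request node cannot write to an s3:// address, so in practice you would use the AWS S3 node (or a presigned HTTPS upload URL) with the same bucket and key:

{
  "method": "PUT",
  "url": "s3://your-bucket/bills-of-materials/{{ $node['Webhook'].json.product_id }}_{{ $now.toFormat('yyyy-MM-dd_HHmm') }}.csv",
  "body": "{{ $node['PDnob'].json.csv_output }}"
}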
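If you would rather build the CSV yourself from the JSON bill of materials, Python's standard csv module is enough. This is a sketch using the column list shown above; `bom_to_csv` is a hypothetical helper, and nested fields such as specifications are simply dropped.

```python
import csv
import io

CSV_COLUMNS = ["part_number", "description", "material_code", "quantity", "unit"]


def bom_to_csv(bill_of_materials: list) -> str:
    """Serialise BOM items to CSV text, keeping only the listed columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=CSV_COLUMNS,
        extrasaction="ignore",  # skip keys such as "specifications"
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(bill_of_materials)
    return buffer.getvalue()


bom = [
    {
        "part_number": "PCB-FR4-1.6",
        "description": "FR-4 fiberglass epoxy circuit board, 1.6mm thickness",
        "material_code": "FR4",
        "quantity": 1,
        "unit": "piece",
        "specifications": {"thickness_mm": 1.6},
    }
]
csv_text = bom_to_csv(bom)
print(csv_text)
```

The csv module quotes the description automatically because it contains a comma, which is the main thing hand-rolled string joins get wrong.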

The Manual Alternative

If you prefer more control over individual steps, or if you're prototyping before automating, you can call each API manually and review outputs before proceeding.

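Start by uploading your image and calling Accio. The examples below use an s3:// path as a placeholder; if the API cannot read from your bucket directly, substitute a public or presigned HTTPS URL: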

curl -X POST https://api.accio-ai.com/v1/analyse-product \
  -H "Authorization: Bearer YOUR_ACCIO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "s3://your-bucket/products/image.jpg",
    "product_type": "electronics",
    "return_components": true
  }'

Review the detected_components array. If it's missing something obvious, you can manually edit the list and pass it to ParSpec:

curl -X POST https://api.parspec-ai.com/v1/extract-specifications \
  -H "Authorization: Bearer YOUR_PARSPEC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "s3://your-bucket/products/image.jpg",
    "detected_components": ["circuit board", "metal casing", "plastic connector", "copper wire", "led indicator"],
    "extract_materials": true,
    "extract_dimensions": true,
    "extract_electrical_properties": true
  }'

Once you're satisfied with ParSpec's output, pass it to PDnob for final formatting. This approach lets you catch errors early without waiting for the full automated pipeline to finish.

Pro Tips

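Implement retry logic with exponential backoff. API calls occasionally fail due to network issues or temporary service outages. In n8n, enable "Retry On Fail" in each HTTP Request node's settings and set the maximum tries to 3; the built-in wait between tries is a fixed interval, so true exponential backoff needs a loop or custom code. For custom scripts, use a library like tenacity in Python. The example below starts with a wait of roughly 2 seconds, grows it by a factor of 1.5 per attempt, and gives up after 3 attempts.

import os

import requests
from tenacity import retry, stop_after_attempt, wait_exponential

PARSPEC_API_KEY = os.environ["PARSPEC_API_KEY"]

# Back off exponentially (base wait about 2 s, growing 1.5x); stop after 3 attempts
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=2, exp_base=1.5), reraise=True)
def call_parspec_api(image_url, components):
    response = requests.post(
        'https://api.parspec-ai.com/v1/extract-specifications',
        headers={'Authorization': f'Bearer {PARSPEC_API_KEY}'},
        json={
            'image_url': image_url,
            'detected_components': components,
            'extract_materials': True
        },
        timeout=30,
    )
    # Raise on 4xx/5xx responses so tenacity sees the failure and retries
    response.raise_for_status()
    return response.json()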

Watch for rate limits. Accio, ParSpec, and PDnob all have rate limits. Most allow 100 requests per minute on standard plans. At that rate, 1,000 images take at least 10 minutes per API, so pace the batch rather than sending everything at once. In n8n, use the "Wait" node to add a pause between requests; a 1-second pause keeps you comfortably under the limit at about 60 requests per minute. For Python scripts, use time.sleep(1) or the ratelimit library.
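For a custom script, pacing can be as simple as a generator that spaces items evenly. The sketch below is one way to do it with only the standard library; `throttled` is a hypothetical helper, and the 100-per-minute figure is the limit quoted above, not a value read from any API.

```python
import time


def throttled(items, max_per_minute, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than max_per_minute, spacing them evenly."""
    interval = 60.0 / max_per_minute
    next_allowed = clock()
    for item in items:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)
        # Schedule the next slot relative to when this item was released.
        next_allowed = max(next_allowed, clock()) + interval
        yield item


# Usage: each URL is released at most once every 0.6 seconds.
image_urls = ["https://example.com/product-001.jpg"]
for url in throttled(image_urls, max_per_minute=100):
    print("processing", url)
```

The sleep and clock parameters exist only so the pacing can be exercised without real waiting; in normal use you leave them at their defaults.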

Validate before storing. Add a sanity check node that ensures PDnob returned at least one component and all required fields are present. If validation fails, log the error and send a notification rather than storing garbage data.

{
  "if": "{{ $node['PDnob'].json.bill_of_materials.length > 0 }}",
  "then": "store results",
  "else": "send slack alert with error details"
}
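The same check can be written as a small function when you are scripting the pipeline. This is a sketch that assumes the response shape shown in Step 4; `validate_bom` and the list of required fields are illustrative choices, not part of the PDnob API.

```python
REQUIRED_BOM_FIELDS = ("part_number", "description", "material_code", "quantity", "unit")


def validate_bom(pdnob_response: dict) -> list:
    """Return validation errors; an empty list means the BOM is safe to store."""
    errors = []
    items = pdnob_response.get("bill_of_materials") or []
    if not items:
        errors.append("bill_of_materials is empty")
    for index, item in enumerate(items):
        missing = [f for f in REQUIRED_BOM_FIELDS if item.get(f) in (None, "")]
        if missing:
            errors.append(f"item {index} missing: {', '.join(missing)}")
    return errors


response = {
    "bill_of_materials": [
        {"part_number": "PCB-FR4-1.6", "description": "Circuit board",
         "material_code": "FR4", "quantity": 1, "unit": "piece"},
        {"part_number": "CASE-AL6061", "description": "Casing", "quantity": 1},
    ]
}
print(validate_bom(response))
```

Returning a list of errors rather than raising on the first one makes the alert message more useful, since a single notification can describe every problem in the BOM.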

Track confidence scores through the pipeline. Accio and ParSpec both return confidence values. Store these alongside your BOM. If confidence drops below a threshold (e.g., 0.75), flag the result for human review. This catches cases where the image is blurry or the product is ambiguous.
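A minimal sketch of that threshold check, assuming the confidence and extraction_confidence fields shown in the sample responses; `needs_review` is a hypothetical helper, and treating a missing score as a reason for review is a cautious assumption.

```python
REVIEW_THRESHOLD = 0.75


def needs_review(accio_response: dict, parspec_response: dict,
                 threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag a result when either confidence score is missing or below threshold."""
    scores = [
        accio_response.get("confidence"),
        parspec_response.get("extraction_confidence"),
    ]
    return any(score is None or score < threshold for score in scores)


print(needs_review({"confidence": 0.94}, {"extraction_confidence": 0.87}))
print(needs_review({"confidence": 0.94}, {"extraction_confidence": 0.60}))
```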

Cache results to save cost. If the same product image is processed multiple times, query your database first before calling Accio again. Use an image hash (MD5 or SHA256 of the image file) as the cache key. This is especially valuable if you're reprocessing batches or have recurring SKUs.
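As a rough illustration, the cache key can be derived with hashlib and checked before any API call. Here a plain dict stands in for your database, and `image_cache_key` and `get_or_compute` are hypothetical names.

```python
import hashlib


def image_cache_key(image_bytes: bytes) -> str:
    """Derive a stable cache key from the image contents."""
    return hashlib.sha256(image_bytes).hexdigest()


def get_or_compute(cache: dict, image_bytes: bytes, compute):
    """Return the cached result for this image, computing it only on a miss."""
    key = image_cache_key(image_bytes)
    if key not in cache:
        cache[key] = compute(image_bytes)
    return cache[key]


calls = []

def fake_analysis(image_bytes: bytes) -> dict:
    # Stand-in for the paid Accio call.
    calls.append(1)
    return {"detected_components": ["circuit board"]}


cache = {}
get_or_compute(cache, b"same image", fake_analysis)
get_or_compute(cache, b"same image", fake_analysis)
print(len(calls))
```

Hashing the file contents rather than the URL means a re-uploaded copy of the same image still hits the cache.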

Cost Breakdown

| Tool | Plan Needed | Monthly Cost | Notes |
| --- | --- | --- | --- |
| Accio AI | Standard API | £50–150 | Pay-as-you-go: £0.05–0.15 per image. 1,000 images = £50–150 |
| ParSpec AI | Professional | £80–200 | £0.08–0.20 per specification extraction. Includes material validation |
| PDnob Image Translator AI | Plus | £30–60 | £0.03–0.06 per BOM generation. CSV export included |
| n8n (self-hosted) | Infrastructure only | £20–100 | Small VPS (DigitalOcean, Linode, or AWS). No per-execution fees |
| n8n (cloud) | Standard Cloud | £40/month | Includes 1,000 workflows and higher API limits; cheaper than per-execution alternatives at scale |
| Make | Standard | £19/month | 10,000 operations included; additional ops at £0.006 each. Good for under 10,000 images/month |
| Zapier | Professional | £50/month | 100,000 tasks included; overages at £0.25 per 100 tasks. Expensive for high volume |

Total estimated monthly cost for 1,000 images on n8n (self-hosted), using the low end of each range: £50 (Accio) + £80 (ParSpec) + £30 (PDnob) + £20 (infrastructure) = £180. On Zapier, the same workflow would cost approximately £300 per month due to task-based pricing.

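For higher volumes (5,000+ images), self-hosted n8n becomes more economical relative to task-priced tools. The infrastructure cost stays flat whilst API costs scale linearly, so the orchestration share of your cost per image shrinks as volume grows, even though the per-image API fees themselves do not change.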
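The arithmetic can be checked with a few lines of Python. This sketch uses the low-end per-image rates from the table (£0.05, £0.08, £0.03) and a flat £20 for infrastructure; `monthly_cost` is a hypothetical helper and the figures are estimates, not quotes.

```python
def monthly_cost(images: int, per_image_fees: dict, fixed_infrastructure: float):
    """Return (total monthly cost, cost per image) for a given volume."""
    api_total = images * sum(per_image_fees.values())
    total = api_total + fixed_infrastructure
    return total, total / images


# Low-end per-image rates in GBP, taken from the table above.
FEES = {"accio": 0.05, "parspec": 0.08, "pdnob": 0.03}

for volume in (1000, 5000):
    total, per_image = monthly_cost(volume, FEES, fixed_infrastructure=20.0)
    print(f"{volume} images: total GBP {total:.2f}, per image GBP {per_image:.3f}")
```

Under these assumptions the per-image cost falls only from about £0.18 to about £0.16 between 1,000 and 5,000 images, because the API fees dominate and the flat infrastructure cost is a small share of the total.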