Build an AI Research Assistant Stack

Research is slow when done manually. You find a topic, search multiple sources, read articles, take notes, organise findings, then synthesise everything into a coherent summary. Each step requires a human hand. Even worse, you often repeat the same searches across different tools, duplicating effort and introducing inconsistencies.

What if you could set up an automated research assistant that monitors your interests, collects relevant information, analyses it, and delivers structured findings without touching a keyboard? The good news is you can build this today using off-the-shelf AI tools and an orchestration layer.

This post walks you through creating a complete AI research stack. We'll use a combination of content discovery tools, language models, and storage services, wired together so information flows automatically from source to analysis to organised output. By the end, you'll have a system that runs continuously, improving your research without requiring manual intervention.

The Automated Workflow

The research assistant stack works in four stages: discovery, retrieval, analysis, and storage. Let's build each stage and then connect them.

Stage 1: Content Discovery

Your workflow starts by identifying new relevant content. You have two main options here: RSS feeds or API-based search.

For RSS, we'll use a feed aggregator like Feedly or a simple RSS polling trigger in your orchestration tool. For broader discovery, you might use Google Alerts (which can deliver its results as an RSS feed) or a news search service such as Bing News Search, which can be queried via API.

Here's a simplified sketch of a discovery trigger in n8n that polls an RSS feed every 4 hours. The field names are illustrative; n8n's exported workflow JSON may name them differently:


{
  "nodeType": "n8n-nodes-base.rss",
  "parameters": {
    "url": "https://feeds.example.com/research-topic",
    "pollInterval": 4
  }
}

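Alternatively, if you're using the Bing News Search API (confirm availability first, since Microsoft has announced the retirement of the Bing Search APIs; the same pattern applies to other news search APIs):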

curl -X GET "https://api.bing.microsoft.com/v7.0/news/search?q=machine+learning&count=10" \
  -H "Ocp-Apim-Subscription-Key: YOUR_API_KEY"

The response gives you headlines, URLs, and publication dates. Store these results as a list for the next stage.
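
As a rough sketch of that hand-off, the function below flattens a news search response into the list Stage 2 consumes. It assumes the response carries a `value` array whose entries have `name`, `url`, and `datePublished` fields, as in the Bing News Search v7 response format; `toDiscoveryItems` and the output field names are illustrative and not part of any API.

```javascript
// Flatten a news search response into { title, url, publishedAt } items.
// Assumes a `value` array with `name`, `url`, and `datePublished` fields.
function toDiscoveryItems(response) {
  const articles = Array.isArray(response && response.value) ? response.value : [];
  return articles
    .filter((a) => typeof a.url === 'string' && a.url.length > 0) // skip entries without a link
    .map((a) => ({
      title: a.name || '',
      url: a.url,
      publishedAt: a.datePublished || null,
    }));
}

// Example with a hand-written response (not real API output)
const items = toDiscoveryItems({
  value: [
    { name: 'Example headline', url: 'https://example.com/a', datePublished: '2024-01-15T09:00:00Z' },
    { name: 'No link, so it is dropped' },
  ],
});
console.log(items);
```

Dropping entries without a URL at this point keeps the retrieval loop in Stage 2 from failing on items it cannot fetch.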

Stage 2: Content Retrieval and Extraction

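Headlines alone tell you little. To decide whether an article deserves a full read, you need to fetch the actual article content and extract the text. Tools like Mercury Parser, Readability, or ScrapingBee handle this well.

Here's a Mercury Parser request. Postlight has retired its hosted Mercury API and open-sourced the parser, so treat the endpoint below as a placeholder for your own deployment or another extraction service: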

# -G sends the url parameter in the query string instead of a request body
curl -G "https://mercury.postlight.com/api/v3/parse" \
  -H "x-api-key: YOUR_API_KEY" \
  --data-urlencode "url=https://example.com/article"

The response includes the cleaned article text, title, author, and publication date:

{
  "title": "Recent Advances in Neural Networks",
  "author": "Jane Smith",
  "content": "Neural networks have evolved significantly...",
  "date_published": "2024-01-15",
  "url": "https://example.com/article"
}

In your orchestration tool (we'll use Zapier for this example as it's user-friendly), create a loop that takes each URL from Stage 1, calls Mercury Parser, and collects the results. Zapier's "Looping" feature handles this:


Trigger: New RSS item detected
│
└─→ Loop through each URL
    │
    ├─→ Call Mercury Parser API
    ├─→ Extract: title, content, author, date
    └─→ Store result in array
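
Before passing parser output to the analysis stage, it can help to reduce each response to plain text and flag thin results. The sketch below assumes the response shape shown above (`title`, `author`, `content`, `date_published`, `url`); `toArticleRecord`, the crude tag-stripping regex, and the 200-character threshold are all illustrative choices rather than anything the parser prescribes.

```javascript
// Arbitrary threshold below which the extracted text is treated as probably incomplete
const MIN_CHARS = 200;

// Reduce a parser response to the fields later stages need
function toArticleRecord(parsed) {
  // Crude tag removal; a real HTML parser is safer for messy markup
  const text = String(parsed.content || '')
    .replace(/<[^>]*>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
  return {
    title: parsed.title || '',
    author: parsed.author || '',
    date: parsed.date_published || null,
    url: parsed.url || '',
    content: text,
    incomplete: text.length < MIN_CHARS, // candidates for a manual review queue
  };
}

const record = toArticleRecord({
  title: 'Recent Advances in Neural Networks',
  author: 'Jane Smith',
  content: '<p>Neural networks have <b>evolved</b> significantly...</p>',
  date_published: '2024-01-15',
  url: 'https://example.com/article',
});
console.log(record.content, record.incomplete);
```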

Stage 3: AI Analysis

This is where the actual research work happens. You'll send the extracted content to Claude or another language model for structured analysis.

Create a prompt that instructs the AI to extract key findings, methodology, conclusions, and relevance to your research area. Use the Anthropic API directly:

# ARTICLE_CONTENT holds the text extracted in Stage 2.
# jq (which must be installed) builds the JSON body so quotes and newlines in the article are escaped.
jq -n --arg article "$ARTICLE_CONTENT" '{
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1500,
  messages: [
    {
      role: "user",
      content: ("Analyse this research article and provide: 1) Main findings, 2) Methods used, 3) Key limitations, 4) Relevance to AI safety (rate 1-10). Article: " + $article)
    }
  ]
}' | curl -X POST https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d @-
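
Inside an orchestration tool, the same request body is usually built in code rather than in a shell. A minimal sketch follows, reusing the model name and prompt from the request above; `buildAnalysisRequest` and the `maxChars` truncation are assumptions added for illustration, the latter to keep very long articles from inflating the request.

```javascript
const ANALYSIS_PROMPT =
  'Analyse this research article and provide: 1) Main findings, 2) Methods used, ' +
  '3) Key limitations, 4) Relevance to AI safety (rate 1-10). Article: ';

// Build the JSON body for the Messages API request.
// maxChars is an illustrative cap on article length, not an API requirement.
function buildAnalysisRequest(articleContent, maxChars = 20000) {
  return {
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1500,
    messages: [
      { role: 'user', content: ANALYSIS_PROMPT + articleContent.slice(0, maxChars) },
    ],
  };
}

// JSON.stringify escapes quotes and newlines, so article text cannot break the body
const body = JSON.stringify(buildAnalysisRequest('He said "scale matters".\nNew line.'));
console.log(body);
```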

The API returns JSON, and the analysis itself is plain text in the first content block (content[0].text). Parse that text and extract each field (findings, methods, limitations, relevance score) into separate variables. Structured fields are much easier to filter and synthesise later.

In n8n, use the "HTTP Request" node to call the API, then a "Function" node (named "Code" in recent n8n versions) to parse the response:

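// n8n Function node
// extractSection and extractScore are not built in; define them in this node before use
const analysisText = $input.all()[0].json.content[0].text;
const lines = analysisText.split('\n');

// n8n expects an array of items, each wrapping its data in a `json` key
return [{
  json: {
    findings: extractSection(lines, 'Main findings'),
    methods: extractSection(lines, 'Methods'),
    limitations: extractSection(lines, 'Key limitations'),
    relevanceScore: extractScore(lines, 'Relevance')
  }
}];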
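
The node above relies on `extractSection` and `extractScore`, which are not n8n built-ins. A minimal sketch of both follows, assuming the model answers with one labelled section per requested item (for example "1) Main findings: ..."). Free-form answers would need a more forgiving parser, or a prompt that asks for JSON instead. `LABELS` and `headingPattern` are helper names invented here.

```javascript
// Section labels in the order the prompt requests them
const LABELS = ['Main findings', 'Methods', 'Key limitations', 'Relevance'];

// Matches a heading line such as "1) Main findings:" or "**Methods used**"
function headingPattern(label) {
  return new RegExp('^\\s*[#*]*\\s*(?:\\d+[.)]\\s*)?[#*]*\\s*' + label, 'i');
}

// Return one section's text: whatever follows the colon on its heading line,
// plus the following lines up to the next known heading
function extractSection(lines, label) {
  const start = lines.findIndex((line) => headingPattern(label).test(line));
  if (start === -1) return '';
  const colon = lines[start].indexOf(':');
  const body = colon === -1 ? [] : [lines[start].slice(colon + 1)];
  for (let i = start + 1; i < lines.length; i++) {
    const isNextHeading = LABELS.some((l) => l !== label && headingPattern(l).test(lines[i]));
    if (isNextHeading) break;
    body.push(lines[i]);
  }
  return body.join('\n').trim();
}

// Return the first integer from 1 to 10 found in a section, or null if there is none
function extractScore(lines, label) {
  const match = extractSection(lines, label).match(/\b(10|[1-9])\b/);
  return match ? Number(match[1]) : null;
}

// Hand-written sample answer (not real model output)
const sampleLines = [
  '1) Main findings: Larger models generalise better.',
  'They also need more data.',
  '2) Methods used: Controlled benchmark comparison.',
  '3) Key limitations: Small sample of tasks.',
  '4) Relevance to AI safety: 8/10',
];
console.log(extractSection(sampleLines, 'Main findings'));
console.log(extractScore(sampleLines, 'Relevance'));
```

In an n8n node, these definitions would go above the code that calls them.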

Stage 4: Storage and Organisation

The analysed research needs to live somewhere searchable and referenceable. A database like Airtable or Notion works well; a simple Google Sheet also does the job if you want tighter control over formatting.

Use Zapier's built-in Airtable integration or an HTTP request to write records:

curl -X POST https://api.airtable.com/v0/YOUR_BASE_ID/Research \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "records": [
      {
        "fields": {
          "Title": "Recent Advances in Neural Networks",
          "Author": "Jane Smith",
          "Source URL": "https://example.com/article",
          "Key Findings": "Networks now achieve 95% accuracy on benchmark tasks",
          "Methodology": "Transformer-based architecture with attention mechanisms",
          "Limitations": "Requires large training datasets; computational cost remains high",
          "Relevance Score": "9",
          "Date Added": "2024-01-15"
        }
      }
    ]
  }'

Each analysed article creates one Airtable record. Over time, you build a searchable database of curated research.
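
To build that request body from Stage 2 and Stage 3 output, something like the following would work. It is a sketch under assumptions: the field names mirror the example record above and must match your own table, `toAirtableFields` and `toAirtableBatches` are made-up helpers, and the batch size of 10 reflects Airtable's per-request limit for record creation as documented at the time of writing.

```javascript
// Airtable accepts a limited number of records per create request (10 at the time of writing)
const MAX_RECORDS_PER_REQUEST = 10;

// Map one analysed article onto the column names used in the example record
function toAirtableFields(article, analysis, dateAdded) {
  return {
    'Title': article.title,
    'Author': article.author,
    'Source URL': article.url,
    'Key Findings': analysis.findings,
    'Methodology': analysis.methods,
    'Limitations': analysis.limitations,
    // Sent as text to match the example; send a number if the column is numeric
    'Relevance Score': String(analysis.relevanceScore),
    'Date Added': dateAdded,
  };
}

// Split a list of field objects into request bodies of at most 10 records each
function toAirtableBatches(rows) {
  const batches = [];
  for (let i = 0; i < rows.length; i += MAX_RECORDS_PER_REQUEST) {
    const slice = rows.slice(i, i + MAX_RECORDS_PER_REQUEST);
    batches.push({ records: slice.map((fields) => ({ fields })) });
  }
  return batches;
}

const exampleFields = toAirtableFields(
  { title: 'Recent Advances in Neural Networks', author: 'Jane Smith', url: 'https://example.com/article' },
  { findings: 'Example finding', methods: 'Example method', limitations: 'Example limitation', relevanceScore: 9 },
  '2024-01-15'
);
console.log(JSON.stringify(toAirtableBatches([exampleFields]), null, 2));
```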

Putting It All Together

Here's the complete workflow in Zapier (simplest option for beginners):


1. Trigger: RSS feed (checks every 4 hours)
   Output: article URLs, titles, dates

2. Action: Looping by Zapier
   - For each article URL

3. Action: HTTP by Zapier
   - Call Mercury Parser API
   - Extract article content

4. Action: HTTP by Zapier
   - Call Claude API
   - Request structured analysis

5. Action: Airtable
   - Create record with all extracted data
   - Fields: title, content, findings, methods, limitations, relevance

If you prefer more control, n8n offers identical functionality with a visual interface and more customisation options. Here's a rough n8n graph:


RSS Feed Trigger
│
├─→ Loop through items
│   │
│   ├─→ Mercury Parser HTTP Request
│   │   │
│   │   ├─→ Parse response (Function node)
│   │   │
│   │   ├─→ Claude API HTTP Request
│   │   │
│   │   ├─→ Parse analysis (Function node)
│   │   │
│   │   └─→ Airtable Create Record
│   │
│   └─→ (next iteration)
│
└─→ End

For even more sophistication, you can add filtering logic. After the Claude analysis, evaluate the relevance score. If it's below 5, discard the record. If it's 7 or higher, also send a notification via email or Slack:

// n8n conditional logic
const score = Number($input.all()[0].json.relevanceScore);
if (score >= 7) {
  return [{ json: { send_notification: true } }];   // keep the record and notify
} else if (score >= 5) {
  return [{ json: { send_notification: false } }];  // keep the record quietly
} else {
  return [{ json: { skip_record: true } }];         // below 5: discard
}

The Manual Alternative

If you want more control over each step, you can keep certain stages manual. For example, you might automate discovery and retrieval (Stages 1-2) but manually write your analysis in Notion or Obsidian. This hybrid approach saves time on the tedious parts whilst preserving your critical thinking.

Another approach: run the full automated stack weekly and spend 30 minutes on Friday reviewing the generated records, refining relevance scores, and identifying patterns. The automation handles breadth; human review adds depth.

You might also choose different tools for analysis. Instead of Claude, you could manually read articles and use a simple form (Google Form, Typeform) to submit your analysis into the same Airtable base. The workflow infrastructure stays the same; you just replace the API call with a human input step.

Pro Tips

Error Handling and Retries

Not every article will parse cleanly. Mercury Parser might return incomplete text. Claude might time out. Build in retry logic, especially for API calls.

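In n8n, enable "Retry On Fail" in the HTTP node's settings and set the maximum tries to 3. As far as the standard node settings go, the wait between tries is a fixed interval, so exponential backoff needs a custom loop. For articles that fail parsing, send them to a manual review queue in Airtable rather than breaking the entire workflow: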

// n8n error handler
const item = $input.all()[0].json;
if (item.error) {
  return [{
    json: {
      status: 'failed_parse',
      url: item.url,
      error: item.error,
      manual_review: true
    }
  }];
}
// No error: pass the item through unchanged
return [{ json: item }];
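
If your orchestration tool does not offer exponential backoff out of the box, the waiting times can be computed in a few lines. The sketch below is generic JavaScript rather than an n8n or Zapier feature; `backoffDelays`, `withRetries`, the 1-second base, and the 30-second ceiling are illustrative.

```javascript
// Delay before each retry: base, 2x base, 4x base, ... capped at maxMs
function backoffDelays(retries, baseMs = 1000, maxMs = 30000) {
  return Array.from({ length: retries }, (_, attempt) =>
    Math.min(baseMs * 2 ** attempt, maxMs)
  );
}

// Run an async task, waiting between failed attempts; rethrows the last error
async function withRetries(task, retries = 3) {
  const delays = backoffDelays(retries);
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= delays.length) throw err; // retries exhausted
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

console.log(backoffDelays(3)); // [ 1000, 2000, 4000 ]
```

With `retries = 3`, `withRetries` makes at most four attempts: the initial call plus three retries.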

Rate Limiting and Costs

Orchestration tools run workflows quickly. If you loop through 50 articles, you'll make 50 Claude API calls. At current pricing, that's roughly £0.75 for input processing alone. Monitor your execution logs in Zapier or n8n to understand your actual costs.
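
The arithmetic behind that estimate is simple enough to keep in a helper. The per-article figure below is the rough £0.015 used in this post's cost table, not a quoted price; actual cost depends on article length and the model's current token pricing. `estimateCost` is a hypothetical helper.

```javascript
// Rough per-article analysis cost in GBP, taken from the cost table in this post
const COST_PER_ARTICLE_GBP = 0.015;

function estimateCost(articleCount, costPerArticle = COST_PER_ARTICLE_GBP) {
  // Round to whole pence to avoid floating-point noise
  return Math.round(articleCount * costPerArticle * 100) / 100;
}

console.log(estimateCost(50));   // 0.75
console.log(estimateCost(1000)); // 15
```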

Implement a daily cap: stop processing after, say, 20 articles per day. Use Zapier's "Only continue if" condition or n8n's built-in throttling.
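
In code, the cap is a matter of trimming each batch to whatever budget is left for the day. The sketch assumes you store a running count of articles already processed today somewhere persistent (a workflow variable, a spreadsheet cell); `applyDailyCap` and `processedToday` are hypothetical names.

```javascript
// Return only the items that still fit under today's cap
function applyDailyCap(items, processedToday, dailyCap = 20) {
  const remaining = Math.max(dailyCap - processedToday, 0);
  return items.slice(0, remaining);
}

const incoming = Array.from({ length: 50 }, (_, i) => ({ url: 'https://example.com/' + i }));
console.log(applyDailyCap(incoming, 15).length); // 5
```

Items trimmed off are not lost if the feed is polled again the next day, but a high-volume feed may need a backlog queue.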

Relevance Scoring

After a few weeks, you'll have dozens of records. Add a secondary analysis step that re-evaluates relevance against your actual use cases. Ask Claude a narrower question, such as "On a scale of 1-10, how relevant is this to recent AI safety debates?", and compare the answer with the initial generic relevance score.

Duplicate Detection

As your research base grows, you'll sometimes ingest the same article via different RSS feeds. Add a deduplication step using Airtable's formula fields or a Function node in n8n that hashes the article URL and checks for duplicates before creating a record:

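const crypto = require('crypto'); // self-hosted n8n must allow this module (NODE_FUNCTION_ALLOW_BUILTIN=crypto)
const articleUrl = $input.all()[0].json.url;
const urlHash = crypto.createHash('md5').update(articleUrl).digest('hex');
// Check Airtable for existing hash before creating new record
return [{ json: { url: articleUrl, urlHash } }];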
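
Hashing the raw URL misses duplicates that differ only in tracking parameters or a trailing slash. One way to handle that, sketched below, is to normalise the URL before comparing or hashing it. `normaliseUrl`, `dropDuplicates`, and the list of tracking parameters are assumptions for illustration; the `seen` set stands in for the lookup against existing Airtable records.

```javascript
// Tracking parameters to ignore when comparing URLs (illustrative list)
const TRACKING_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'utm_term', 'utm_content'];

// Strip the fragment, tracking parameters, and trailing slashes
function normaliseUrl(rawUrl) {
  const url = new URL(rawUrl);
  url.hash = '';
  TRACKING_PARAMS.forEach((p) => url.searchParams.delete(p));
  url.pathname = url.pathname.replace(/\/+$/, '') || '/';
  return url.toString();
}

// Keep the first item for each normalised URL; `seen` can be pre-filled with known URLs
function dropDuplicates(items, seen = new Set()) {
  return items.filter((item) => {
    const key = normaliseUrl(item.url);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const unique = dropDuplicates([
  { url: 'https://example.com/article' },
  { url: 'https://example.com/article/?utm_source=rss#top' },
  { url: 'https://example.com/other' },
]);
console.log(unique.map((item) => item.url));
```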

Database Maintenance

Airtable and Notion can become unwieldy with thousands of records. Archive records older than 6 months into a separate table on a regular schedule. This keeps your working database lean and searchable.

Create a second workflow that runs monthly: filter records where "Date Added" is older than 6 months, copy them to an archive table, then delete them from the main table.
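
The filtering step of that workflow could look like the sketch below, which splits records into those to keep and those to archive. It assumes each record carries a "Date Added" value that JavaScript's `Date` can parse (such as `2024-01-15`); `partitionByAge` is a hypothetical helper, and records with unparseable dates are kept rather than archived.

```javascript
// Split records into those to keep and those older than `months` months.
// Month arithmetic can overflow by a few days when the current day is the 29th to 31st.
function partitionByAge(records, now = new Date(), months = 6) {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCMonth(cutoff.getUTCMonth() - months);
  const keep = [];
  const archive = [];
  for (const record of records) {
    const added = new Date(record['Date Added']);
    // An invalid date compares as false, so the record stays in `keep`
    (added < cutoff ? archive : keep).push(record);
  }
  return { keep, archive };
}

const result = partitionByAge(
  [
    { Title: 'Old article', 'Date Added': '2023-12-01' },
    { Title: 'Recent article', 'Date Added': '2024-03-01' },
    { Title: 'Undated article', 'Date Added': 'unknown' },
  ],
  new Date('2024-07-15T00:00:00Z')
);
console.log(result.archive.map((r) => r.Title), result.keep.map((r) => r.Title));
```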

Cost Breakdown

Tool	Plan Needed	Monthly Cost	Notes
Zapier	Starter (tasks-based)	£19–£49	750–5,000 tasks/month; each successful action step counts as a task, so one article uses several
n8n	Self-hosted (free) or Cloud	£0–£60	Cloud plan recommended for reliability; self-hosted requires server
Mercury Parser	Paid tier	£5–£50	10,000–500,000 parses/month; optional if using built-in scraping
Claude (Anthropic)	Pay-as-you-go	£10–£40	Roughly £0.015 per article for analysis; depends on article length
Bing News Search API	Paid tier	£0–£50	Up to 1,000 queries/month free; then £3 per 1,000 queries
RSS feed service	Free or paid	£0–£5	Feedly free tier covers most needs
Airtable	Pro plan	£10–£20	100,000+ records supported; API access included
Total estimated range		£44–£274	Highly dependent on article volume and analysis frequency

A minimal setup (n8n self-hosted, free RSS, Claude, Airtable) costs roughly £20–£30 monthly. Adding Zapier (if you prefer the interface) doubles the cost. At high volume (1,000+ articles monthly), Claude costs dominate the budget.

Conclusion

Building an AI research assistant stack removes the friction from continuous learning. You define your interests once, set the orchestration in motion, and check back weekly to review curated findings. The automation handles discovery, retrieval, initial analysis, and organisation; you focus on synthesis and decision-making.

Start small: pick one research topic, one RSS feed, and build Stages 1–4 over a weekend. Once it's running, expand to multiple feeds or add secondary analysis steps. The modular nature of these tools means you can swap components (different parser, different AI model, different database) without rebuilding the entire workflow.
