
From raw footage to monetised YouTube: automated video editing and thumbnail generation

28 March 2026

Introduction

You've just finished filming a 45-minute gaming session or product review. The footage is gold, but now you're staring at hours of editing work: cutting it into clips, adding captions, generating a thumbnail, maybe recording a voiceover. By the time you're done, you've spent six hours in post-production and burned through the creative energy you could have spent filming tomorrow's content.

This is where most creators get stuck. They treat post-production as a necessary evil rather than something that can be systematised. The result is a bottleneck: great content sits on your hard drive whilst you're glued to editing software. Meanwhile, the algorithm favours consistency and volume, not perfection.

What if that entire pipeline, from raw footage to a thumbnail-ready upload, happened automatically? This workflow chains together five tools to transform bulk video into monetisable content with almost no hands-on editing. You record, you hit "go", and two hours later your clips are cut, voiced over, and ready to distribute.

The Automated Workflow

This workflow assumes you've uploaded raw footage to a cloud folder (Google Drive, Dropbox, or similar). The orchestration engine watches for new files, processes them through multiple AI services, and delivers finished clips ready for YouTube. We'll build this with n8n because it handles branching logic and file operations better than Zapier, and it can run self-hosted or in the cloud depending on your needs.

Step 1: Monitor for new video files

Set up an n8n trigger that watches your cloud storage folder. When a new .mp4 or .mov file appears, the workflow starts.

Trigger: Cloud Storage Watch (Google Drive / Dropbox)
Condition: File extension = .mp4 or .mov
Action: Extract file path and filename
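
The trigger condition above can be sketched in Python, for example inside an n8n Code node or a small helper script. This is a minimal sketch, not n8n's own implementation: the function names `is_video_file` and `extract_file_info` are hypothetical, and it assumes the storage trigger hands you a plain file path string.

```python
from pathlib import PurePosixPath

# Extensions named in the trigger condition above.
VIDEO_EXTENSIONS = {".mp4", ".mov"}


def is_video_file(file_path: str) -> bool:
    """Return True when the path ends in a watched video extension (case-insensitive)."""
    return PurePosixPath(file_path).suffix.lower() in VIDEO_EXTENSIONS


def extract_file_info(file_path: str) -> dict:
    """Pull out the pieces later steps need: full path, filename, and name without extension."""
    path = PurePosixPath(file_path)
    return {"path": str(path), "filename": path.name, "stem": path.stem}
```

Comparing the lower-cased suffix means files saved as `.MP4` or `.MOV` by a camera are still picked up.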

Step 2: Clip the long-form video

Clipwing's API accepts a video URL and a JSON instruction set telling it where to cut. You'll send your raw footage to Clipwing, which returns an array of shorter clips optimised for YouTube Shorts or TikTok.

POST https://api.clipwing.com/v1/clip
{
  "source_video_url": "https://your-cdn.com/raw-footage.mp4",
  "clips": [
    {
      "start_time": "00:00:15",
      "end_time": "00:00:45",
      "title": "Highlight Moment"
    },
    {
      "start_time": "00:02:30",
      "end_time": "00:03:00",
      "title": "Key Insight"
    }
  ]
}

Clipwing returns clip URLs. Store these in a variable for the next step. If you want Clipwing to auto-detect highlights instead of specifying timecodes manually, you can pass your video through a Claude Opus 4.6 vision analysis first to identify scene breaks and moments of high energy, then feed those timestamps to Clipwing.
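
If the timestamps come from an analysis step as plain seconds, they need converting into the HH:MM:SS strings used in the request body above. The following is a rough sketch under that assumption; `seconds_to_timecode` and `build_clip_request` are hypothetical helper names, and only the field names are taken from the example request.

```python
def seconds_to_timecode(total_seconds: int) -> str:
    """Format a whole number of seconds as HH:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def build_clip_request(source_video_url: str, highlights: list) -> dict:
    """Build the clip request body from (start_seconds, end_seconds, title) tuples."""
    clips = []
    for start, end, title in highlights:
        if end <= start:
            # Reject impossible ranges before spending an API call on them.
            raise ValueError(f"clip '{title}' ends before it starts")
        clips.append({
            "start_time": seconds_to_timecode(start),
            "end_time": seconds_to_timecode(end),
            "title": title,
        })
    return {"source_video_url": source_video_url, "clips": clips}


body = build_clip_request(
    "https://your-cdn.com/raw-footage.mp4",
    [(15, 45, "Highlight Moment"), (150, 180, "Key Insight")],
)
```

Validating the ranges locally is a design choice: a bad timestamp from the detection step then fails fast instead of surfacing as an opaque API error.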

Step 3: Generate voiceovers (optional but recommended)

For each clip, you might want a spoken intro or explanation. Route clip metadata (title, duration) to ElevenLabs Turbo v2.5 with your chosen voice.

POST https://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream
{
  "text": "In this moment, we saw a 300% improvement in performance.",
  "model_id": "eleven_turbo_v2_5",
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.75
  }
}

Save the returned audio file. You'll layer it into the final video next. Note: ElevenLabs charges per character, so pre-write scripts and keep them concise.
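
As a hedged illustration, the request above could be assembled with Python's standard library as follows. The sketch only builds the request and counts characters; it does not send anything. The `xi-api-key` header reflects how ElevenLabs authenticates requests as far as I know, so check it against the current API reference, and treat the function names and the placeholder voice ID and key as hypothetical.

```python
import json
import urllib.request

ELEVENLABS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream"


def build_voiceover_request(voice_id: str, text: str, api_key: str) -> urllib.request.Request:
    """Assemble (but do not send) the text-to-speech request shown above."""
    body = {
        "text": text,
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    }
    return urllib.request.Request(
        ELEVENLABS_URL.format(voice_id=voice_id),
        data=json.dumps(body).encode("utf-8"),
        # Assumed auth header; verify against the ElevenLabs documentation.
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def total_characters(scripts: list) -> int:
    """Count billable characters across all scripts before committing to a batch."""
    return sum(len(script) for script in scripts)


request = build_voiceover_request("your-voice-id", "In this moment, we saw a 300% improvement.", "your-api-key")
```

To actually fetch the audio you would pass `request` to `urllib.request.urlopen` and write the response bytes to a file. Running `total_characters` over your scripts first gives a quick sanity check on cost.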

Step 4: Generate thumbnails

Feed clip titles and a screenshot from each clip (captured at the midpoint) to Nsketch AI's image generation endpoint. Request a YouTube thumbnail (1280x720) with bold text overlay and contrasting colours.

POST https://api.nsketch.ai/v1/generate
{
  "prompt": "YouTube thumbnail: dramatic highlight from gaming video, bright red accent text saying 'INSANE PLAY', bold sans-serif font, screenshot base at centre",
  "style": "youtube_thumbnail",
  "dimensions": [1280, 720],
  "model": "flux_1_1_pro"
}

Nsketch returns a thumbnail URL. Download and tag it with the corresponding clip name so you know which thumbnail goes with which video later.
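
One way to keep thumbnails matched to clips is to derive the filename from the clip number and title at download time. The sketch below shows that idea alongside a per-clip request builder. The naming scheme, the `.jpg` extension, and the function names are assumptions for illustration; only the request fields come from the example above.

```python
import re


def slugify(title: str) -> str:
    """Lower-case the title and collapse anything non-alphanumeric into single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "clip"


def thumbnail_filename(clip_index: int, clip_title: str) -> str:
    """Hypothetical naming scheme that ties a thumbnail to its clip."""
    return f"clip-{clip_index:03d}-{slugify(clip_title)}-thumb.jpg"


def build_thumbnail_request(clip_title: str, overlay_text: str) -> dict:
    """Fill the prompt from the example request with per-clip values."""
    return {
        "prompt": (
            f"YouTube thumbnail: dramatic highlight from '{clip_title}', "
            f"bright red accent text saying '{overlay_text}', "
            "bold sans-serif font, screenshot base at centre"
        ),
        "style": "youtube_thumbnail",
        "dimensions": [1280, 720],
        "model": "flux_1_1_pro",
    }
```

Zero-padding the clip number keeps files sorted in clip order in most file browsers.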

Step 5: Compose final videos

Now you have clips, voiceovers, and thumbnails. Use Vibevideo as your final composition layer. It acts as a unified interface to Runway Gen-3 Alpha and other video models, letting you add effects, transitions, and captions without switching between platforms.

POST https://api.vibevideo.ai/v1/compose
{
  "base_video": "https://cdn.clipwing.com/clip-001.mp4",
  "voiceover": "https://cdn.elevenlabs.io/voiceover-001.mp3",
  "captions": {
    "enabled": true,
    "auto_generate": true,
    "style": "modern_white"
  },
  "transitions": "fade",
  "output_format": "mp4_1080p"
}

Vibevideo returns a final video. This is your ready-to-upload asset.
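
Because the voiceover step is optional, the compose request needs to work with or without an audio URL. Here is a minimal sketch of a payload builder that handles both cases. Leaving out the `voiceover` key when there is no audio is an assumption about how the API behaves, and `build_compose_request` is a hypothetical name; the other fields mirror the example request above.

```python
from typing import Optional


def build_compose_request(clip_url: str, voiceover_url: Optional[str] = None) -> dict:
    """Build the compose body for one clip, with or without a voiceover."""
    body = {
        "base_video": clip_url,
        "captions": {"enabled": True, "auto_generate": True, "style": "modern_white"},
        "transitions": "fade",
        "output_format": "mp4_1080p",
    }
    if voiceover_url is not None:
        # Assumption: the key is simply omitted when no voiceover was generated.
        body["voiceover"] = voiceover_url
    return body


with_audio = build_compose_request(
    "https://cdn.clipwing.com/clip-001.mp4",
    "https://cdn.elevenlabs.io/voiceover-001.mp3",
)
without_audio = build_compose_request("https://cdn.clipwing.com/clip-001.mp4")
```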

Step 6: Organise and notify

In n8n, create a final node that moves all finished assets (video, thumbnail, captions file) into a "Ready for Upload" folder, tags them with metadata (clip number, duration, topic), and sends you a webhook notification or Slack message.

Action: Move files to gs://your-bucket/ready-for-upload/
Notification: POST to Slack webhook
{
  "text": "✅ Video batch complete: 8 clips, 4 thumbnails ready.",
  "attachments": [
    {
      "title": "Clip 1 - 45 seconds",
      "image_url": "https://cdn.example.com/thumb-001.jpg"
    }
  ]
}
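
The notification body can be generated from the batch results rather than written by hand. The sketch below assumes each finished clip is a dictionary with hypothetical `duration_seconds` and `thumbnail_url` keys, and it produces the same `text` and `attachments` shape as the example message.

```python
def build_slack_message(clips: list) -> dict:
    """Summarise a finished batch in the same shape as the example webhook body."""
    with_thumbnails = [clip for clip in clips if clip.get("thumbnail_url")]
    attachments = [
        {
            "title": f"Clip {number} - {clip['duration_seconds']} seconds",
            "image_url": clip["thumbnail_url"],
        }
        for number, clip in enumerate(clips, start=1)
        if clip.get("thumbnail_url")
    ]
    return {
        "text": f"✅ Video batch complete: {len(clips)} clips, {len(with_thumbnails)} thumbnails ready.",
        "attachments": attachments,
    }


message = build_slack_message([
    {"duration_seconds": 45, "thumbnail_url": "https://cdn.example.com/thumb-001.jpg"},
    {"duration_seconds": 30, "thumbnail_url": None},
])
```

Counting clips and thumbnails separately makes a missing thumbnail visible in the message itself instead of hiding it behind a single total.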

Putting it together in n8n

Your workflow in n8n looks like this, branching after the Clipwing step:

Cloud Storage Watch
  ↓
Clipwing API (clip generation)
  ↓
  ├─→ ElevenLabs Turbo v2.5 (voiceover for each clip)
  ├─→ Nsketch AI (thumbnail for each clip)
  └─→ Wait for both to complete
        ↓
Vibevideo (compose final video)
  ↓
Move to ready folder + notify

Use n8n's "Loop" node to iterate through each clip in parallel, reducing overall execution time from 2 hours to 30-45 minutes. Set up error handling at each step: if Clipwing fails, the workflow pauses and notifies you with the error log rather than silently failing.
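
Outside n8n, the same fan-out-and-collect-errors pattern can be sketched in plain Python with a thread pool. This illustrates the idea rather than how n8n runs it internally; `process_clips` and the `process_one` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def process_clips(clips: list, process_one, max_workers: int = 4):
    """Run process_one over every clip concurrently and separate successes from failures."""

    def safe(clip):
        # Catch per-clip failures so one bad clip never aborts the whole batch.
        try:
            return ("ok", clip, process_one(clip))
        except Exception as exc:
            return ("error", clip, str(exc))

    results, errors = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields outcomes in input order, even though work runs concurrently.
        for status, clip, value in pool.map(safe, clips):
            if status == "ok":
                results.append((clip, value))
            else:
                errors.append((clip, value))
    return results, errors
```

The returned `errors` list is what you would attach to the failure notification, so a problem is reported with its clip and message rather than lost.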

The Manual Alternative

Not every creator wants fully automated post-production. Some prefer to review and tweak intermediate steps. You can modify this workflow to pause at certain points and ask for your approval before proceeding.

After Clipwing generates clips, send yourself a notification with a preview link and a choice: "Approve clips" or "Re-run with different settings". Only continue if you approve. Similarly, after Nsketch generates thumbnails, review them before they're attached to videos.

This hybrid approach trades speed for control. It adds 15-30 minutes of your time but ensures no obvious mistakes slip through. Most creators find the sweet spot is automating the technical steps (voiceover rendering, video composition) whilst keeping creative decisions (which clips matter, thumbnail design) manual.
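
The approval gate described above boils down to sorting clips by the decision you gave for each one. A rough sketch, assuming decisions arrive as a mapping from clip name to the strings `"approve"` or `"rerun"` (both values, and the function name, are hypothetical):

```python
def split_by_decision(clips: list, decisions: dict):
    """Sort clips into approved, re-run, and still-waiting groups."""
    approved, rerun, pending = [], [], []
    for clip in clips:
        decision = decisions.get(clip)
        if decision == "approve":
            approved.append(clip)
        elif decision == "rerun":
            rerun.append(clip)
        else:
            # No answer yet (or an unknown one): hold the clip rather than guess.
            pending.append(clip)
    return approved, rerun, pending


approved, rerun, pending = split_by_decision(
    ["clip-001", "clip-002", "clip-003"],
    {"clip-001": "approve", "clip-002": "rerun"},
)
```

Only the `approved` group moves on to composition; treating a missing answer as "hold" keeps the gate safe by default.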

Pro Tips

Rate limiting and queuing. Clipwing and ElevenLabs both enforce rate limits; if you're processing multiple videos, n8n's queue node prevents you from slamming their APIs. Add a 2-second delay between API calls, or use n8n's built-in rate limiter.

Cost-per-output tracking. Before you run a full batch, calculate the per-clip cost. ElevenLabs bills by character, so if you're writing 100-character intros for 50 clips, that's 5,000 characters in total, or about £2. Multiply that by your daily uploads and you'll know whether this makes financial sense.

Fallback for failed generation. If Nsketch fails to generate a thumbnail (rare, but it happens), set a fallback: use Imagen 3 as a secondary option, or default to a text-based thumbnail template you've created manually. Don't let one failed API call halt the entire batch.

Testing on a small subset. Run your first workflow on a 10-minute video, not a 45-minute one. You'll catch configuration errors fast, understand the actual execution time, and verify quality before scaling to full batches.

Versioning and archiving. Save intermediate outputs (raw clips from Clipwing, first-pass thumbnails) in an archive folder, not just the final versions. If you want to re-edit or regenerate a thumbnail with different settings, you won't need to re-run Clipwing.
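
Two of the tips above, spacing out API calls and falling back when a generator fails, can be sketched in plain Python. This is an illustration of the pattern, not n8n's rate limiter; `Throttle` and `first_successful` are hypothetical names, and the generator callables stand in for whatever services you wire up.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive calls."""

    def __init__(self, min_interval_seconds: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self) -> None:
        """Block until at least min_interval_seconds have passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self._min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def first_successful(generators: list, request):
    """Try each (name, callable) in order and return the first result that does not raise."""
    failures = []
    for name, generate in generators:
        try:
            return name, generate(request)
        except Exception as exc:
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all generators failed: " + "; ".join(failures))
```

Call `throttle.wait()` immediately before each API request, with `Throttle(2.0)` for the 2-second delay. Listing a manual template as the last generator means a batch still produces a thumbnail when every service is down.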

Cost Breakdown

| Tool | Plan Needed | Monthly Cost | Notes |
|---|---|---|---|
| Clipwing | Pro | £30 | 100 clips per month included; overage at £0.10 per clip. |
| ElevenLabs Turbo v2.5 | Creator | £20 | 500,000 characters per month. Voiceovers for 50 clips (about 5,000 characters in total) cost ~£2. |
| Nsketch AI | Starter | £25 | 100 image generations per month. At 1 thumbnail per clip, sufficient for 100 clips. |
| Vibevideo | Professional | £35 | Unlimited video compositions up to 4K; daily batch limit of 20 videos. |
| n8n | Cloud Pro | £20 | Self-hosted is free but requires infrastructure; Cloud Pro removes that burden. |
| Total | | £130 | Supports roughly 100 clips per month, or 15-20 long-form videos converted into short-form content. |
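
To check the totals for your own volume, the figures in the table can be dropped into a short calculation. This sketch uses only the plan prices and the Clipwing overage rate listed above. It deliberately ignores the other plans' limits (for example the 100 Nsketch generations), so treat results above 100 clips as a lower bound.

```python
# Monthly plan prices in GBP, copied from the table above.
MONTHLY_PLANS_GBP = {
    "Clipwing": 30,
    "ElevenLabs": 20,
    "Nsketch AI": 25,
    "Vibevideo": 35,
    "n8n": 20,
}
CLIPWING_INCLUDED_CLIPS = 100
CLIPWING_OVERAGE_GBP = 0.10


def monthly_cost(clips_per_month: int) -> float:
    """Fixed plan fees plus Clipwing overage for clips beyond the included 100."""
    overage_clips = max(0, clips_per_month - CLIPWING_INCLUDED_CLIPS)
    return sum(MONTHLY_PLANS_GBP.values()) + overage_clips * CLIPWING_OVERAGE_GBP


def cost_per_clip(clips_per_month: int) -> float:
    """Average cost of one finished clip at a given monthly volume."""
    if clips_per_month <= 0:
        raise ValueError("clips_per_month must be positive")
    return monthly_cost(clips_per_month) / clips_per_month
```

At the table's 100 clips per month this works out to £130 in total, or £1.30 per clip; at lower volumes the fixed fees dominate and the per-clip cost rises quickly.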