Music production pipeline from lyric sheet to mastered track with vocal synthesis

You've written six songs.

The lyrics are strong, the chord progressions feel right, and you know exactly what the final mix should sound like. But the path to a polished, radio-ready track involves juggling multiple freelancers: a session singer, a beat producer, a mixing engineer, and a mastering engineer. Each handoff costs time and money. A professional vocalist charges £150–300 per song. Mastering runs another £75–200. Across six songs, that comes to roughly £1,350–3,000 for vocals and mastering alone, plus weeks of back-and-forth revisions.

Independent artists and small labels face a genuine constraint: quality production pipelines are expensive because they require human expertise at every stage. You either accept lower quality, burn through savings, or slow down your release schedule waiting for each specialist to finish their part.

This Alchemy workflow automates the journey from lyric sheet to mastered track using four tools (n8n, Bronze, ElevenLabs, and Landr) wired together with minimal manual intervention. A trigger file on your drive starts the process. Within hours, you'll have a finished track with a synthesised vocal performance, royalty-free instrumentation, and professional mastering applied.

The Automated Workflow

Architecture Overview

We'll use n8n as the orchestration layer because it handles file triggers smoothly, supports multiple API integrations, and lets you retry failed steps without rerunning the entire workflow.

The pipeline moves data through Bronze (for instrumentation), ElevenLabs (for vocal synthesis), Landr (for mastering), and a final storage step that delivers your finished track. Data flow is linear: trigger event → retrieve lyric sheet → generate instrumental in Bronze → synthesise vocals in ElevenLabs → mix the two tracks → master in Landr → save to cloud storage.

Step 1: Trigger and Data Retrieval

Your n8n workflow watches a folder in Google Drive or Dropbox. When you upload a file named song_title.txt containing the lyrics, BPM, and other metadata as JSON, the workflow triggers.

json
{
  "lyrics": "Verse 1: [lyrics here]...",
  "bpm": 95,
  "genre": "indie-pop",
  "vocal_description": "warm, breathy, female vocal, confident delivery"
}

Configure the trigger node for your storage provider to monitor your designated folder. A polling interval of about 5 minutes is appropriate for this use case, since nothing downstream is time-critical.
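Every later step reads from this file, so it may be worth validating it as soon as the trigger fires. The following is a minimal sketch for a Code (Function) node, assuming the file content arrives as a string. The helper name parseLyricSheet and the 40–240 BPM range are illustrative assumptions, not part of n8n.

```javascript
// Hypothetical helper: parse and sanity-check the uploaded lyric sheet.
// Field names match the example file above; the BPM range is an assumption.
function parseLyricSheet(fileText) {
  const sheet = JSON.parse(fileText); // throws SyntaxError on malformed JSON

  for (const field of ["lyrics", "genre", "vocal_description"]) {
    if (typeof sheet[field] !== "string" || sheet[field].trim() === "") {
      throw new Error(`Lyric sheet is missing a non-empty "${field}" field`);
    }
  }
  if (!Number.isFinite(sheet.bpm) || sheet.bpm < 40 || sheet.bpm > 240) {
    throw new Error(`"bpm" must be a number between 40 and 240, got ${sheet.bpm}`);
  }
  return sheet;
}

const sheet = parseLyricSheet(
  '{"lyrics": "Verse 1: ...", "bpm": 95, "genre": "indie-pop", "vocal_description": "warm"}'
);
console.log(sheet.bpm); // 95
```

Failing early here means a malformed sheet stops the workflow before any paid API call is made.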

Step 2: Generate Instrumental with Bronze

Bronze uses AI to generate copyright-free background music and instrumental arrangements. You'll send Bronze the BPM, genre, and duration (inferred from lyric length or specified in your metadata).

POST https://api.bronze.io/v1/generate
Content-Type: application/json
Authorization: Bearer YOUR_BRONZE_API_KEY

{
  "bpm": 95,
  "genre": "indie-pop",
  "duration_seconds": 240,
  "style": "indie-pop with folk undertones",
  "instrumentation": ["guitar", "drums", "bass", "light synth"],
  "mood": "uplifting, reflective"
}

Bronze returns a URL to a WAV file. Store this URL in a variable called instrumental_url. The response typically includes metadata about the generated track.
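The duration in the request can come from your metadata or be inferred from the lyrics. As a rough sketch of the second case, the code below assumes each non-empty lyric line lasts two bars of 4/4 at the given BPM. That heuristic, the optional duration_seconds metadata field, and the names estimateDurationSeconds and buildBronzeRequest are all assumptions made for illustration, so tune them against your own songs.

```javascript
// Hypothetical heuristic: each non-empty lyric line lasts `barsPerLine` bars of 4/4.
// Both the heuristic and the default of 2 bars per line are assumptions.
function estimateDurationSeconds(lyrics, bpm, barsPerLine = 2) {
  const lineCount = lyrics.split("\n").filter((line) => line.trim() !== "").length;
  const beats = lineCount * barsPerLine * 4;
  return Math.round((beats * 60) / bpm);
}

// Build the core of the request body, preferring an explicit duration
// from the metadata over the estimate.
function buildBronzeRequest(sheet) {
  return {
    bpm: sheet.bpm,
    genre: sheet.genre,
    duration_seconds:
      sheet.duration_seconds ?? estimateDurationSeconds(sheet.lyrics, sheet.bpm),
  };
}

const body = buildBronzeRequest({
  lyrics: "line one\nline two\n\nline three\nline four",
  bpm: 96,
  genre: "indie-pop",
});
console.log(body.duration_seconds); // 20
```

Four lines at two bars each is 32 beats, which at 96 BPM lasts 20 seconds. Real songs have intros, instrumental breaks, and repeated choruses, so an explicit duration in the metadata will usually be more reliable.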

Step 3: Synthesise Vocals with ElevenLabs

ElevenLabs generates speech and singing voices from text. You'll need to extract the vocal lines from your lyric sheet. For this workflow, assume the full lyrics are the vocal content.

POST https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM
Content-Type: application/json
xi-api-key: YOUR_ELEVENLABS_API_KEY

{
  "text": "[full lyric content from step 1]",
  "model_id": "eleven_turbo_v2_5",
  "voice_settings": {
    "stability": 0.65,
    "similarity_boost": 0.75
  }
}

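The voice_id in the request above (21m00Tcm4TlvDq8ikWAM) points to a female voice; browse ElevenLabs' voice library to find one matching your vocal_description. ElevenLabs returns the audio as a binary stream, in MP3 unless you request another output format. In the HTTP Request node, set the response format to File so the audio is kept as binary data, or write it to temporary storage for the mixing step.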
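If your lyric sheet carries section labels such as "Verse 1:", those labels would be voiced along with the words. A minimal sketch of stripping them before the request is built follows. The helper name extractVocalText and the list of recognised labels are assumptions, and a label is only removed when it is bracketed or followed by a colon, so that a lyric line beginning with the word "Verse" is left alone.

```javascript
// Hypothetical helper: remove section labels such as "Verse 1:" or "[Chorus]"
// so that only the words to be voiced reach the text-to-speech request.
// The list of label names is an assumption; extend it to match your own sheets.
const LABEL = "(?:verse|chorus|pre-chorus|bridge|intro|outro|hook)\\s*\\d*";
const SECTION_LABEL = new RegExp(
  `^\\s*(?:\\[\\s*${LABEL}\\s*\\]\\s*:?|${LABEL}\\s*:)\\s*`,
  "i"
);

function extractVocalText(lyrics) {
  return lyrics
    .split("\n")
    .map((line) => line.replace(SECTION_LABEL, "").trim())
    .filter((line) => line !== "") // drop blank lines and label-only lines
    .join("\n");
}

console.log(extractVocalText("Verse 1: Walking home\n[Chorus]\nStay with me"));
// Walking home
// Stay with me
```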

Step 4: Mix and Combine Tracks

This is where the workflow branches slightly. You have two options:

Option A: Upload the instrumental and vocal separately and use Landr's mixing feature. Landr's API accepts audio files and applies intelligent mixing.

Option B: Combine the instrumental and vocal locally using ffmpeg within n8n before sending to Landr. This gives you more control over levels.

We'll go with Option B for predictability. Add an Execute Command node (available on self-hosted n8n) to your workflow to run ffmpeg:

bash
ffmpeg -i instrumental.wav -i vocal.wav \
  -filter_complex "[0]volume=0.85[inst];[1]volume=1.0[voc];[inst][voc]amix=inputs=2:duration=longest[out]" \
  -map "[out]" -c:a aac -b:a 192k mixed.aac

Adjust the volume filters based on how prominent you want the vocal relative to the instrumental. The example keeps the instrumental at 85% and vocal at 100%.

If you don't have ffmpeg available in your n8n environment, use Landr's mixing API directly and let Landr balance the tracks:

POST https://api.landr.com/v1/mixing/mix
Content-Type: application/json
Authorization: Bearer YOUR_LANDR_API_KEY

{
  "instrumental_url": "https://storage.example.com/instrumental.wav",
  "vocal_url": "https://storage.example.com/vocal.aac",
  "mixing_style": "indie-pop",
  "vocal_loudness_target": -14
}
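For Option B, the two gain values in the ffmpeg command are the part you are most likely to change from song to song. One way to avoid hand-editing the filter string is to generate it, as in this sketch. The helper name buildMixFilter is hypothetical; the filter syntax it emits is the same as in the ffmpeg command above.

```javascript
// Hypothetical helper: build the -filter_complex string used in Option B
// from two linear gain values (1.0 = unchanged).
function buildMixFilter(instrumentalGain, vocalGain) {
  for (const gain of [instrumentalGain, vocalGain]) {
    if (!Number.isFinite(gain) || gain < 0) {
      throw new RangeError(`Gain must be a non-negative number, got ${gain}`);
    }
  }
  return (
    `[0]volume=${instrumentalGain}[inst];` +
    `[1]volume=${vocalGain}[voc];` +
    `[inst][voc]amix=inputs=2:duration=longest[out]`
  );
}

console.log(buildMixFilter(0.85, 1));
// [0]volume=0.85[inst];[1]volume=1[voc];[inst][voc]amix=inputs=2:duration=longest[out]
```

The gains could then be stored per song in the lyric sheet metadata and passed into the command from there.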

Step 5: Apply Mastering with Landr

Landr's mastering engine accepts audio files and returns polished, loudness-normalised tracks ready for distribution. Send the mixed audio:

POST https://api.landr.com/v1/mastering/master
Content-Type: application/json
Authorization: Bearer YOUR_LANDR_API_KEY

{
  "audio_url": "https://storage.example.com/mixed.aac",
  "format": "wav",
  "loudness_target": -14,
  "genre": "indie-pop"
}

Landr processes the job asynchronously, so the initial response acknowledges the job rather than returning the finished file. Configure your n8n workflow to wait for completion, either by polling the job status at a fixed interval or by listening for a webhook callback. Landr typically finishes within 2–5 minutes.
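The polling variant can be sketched as a small loop. The status check itself is left as a function you supply, because the shape of the job-status response is not specified here. The status strings "completed" and "failed", the helper name pollUntilDone, and the default interval and attempt count are all assumptions.

```javascript
// Poll a job until it finishes. `checkStatus` is a placeholder you supply:
// an async function returning an object with a `status` field. The values
// "completed" and "failed" are assumptions, not documented Landr values.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntilDone(checkStatus, { intervalMs = 30000, maxAttempts = 20 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await checkStatus();
    if (job.status === "completed") return job;
    if (job.status === "failed") throw new Error("Mastering job failed");
    // Wait before the next check, but not after the final one
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job still pending after ${maxAttempts} checks`);
}
```

With the defaults, the loop checks every 30 seconds and gives up after about nine and a half minutes, which leaves headroom over the typical 2–5 minute processing time. A webhook listener avoids the repeated requests but needs a publicly reachable n8n URL.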

Step 6: Final Delivery

Once mastering is complete, retrieve the final WAV file and save it to your cloud storage (Google Drive, Dropbox, or S3). Use an HTTP node to download the file, then a write file node to store it locally or push it to cloud storage. Name the output file using your song title and timestamp:

song_title_YYYY-MM-DD_HH-MM-SS_MASTERED.wav
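A name in that pattern can be generated in a Code (Function) node. The sketch below is one possible implementation; the helper name masteredFilename, the lower-casing of the title, and the use of UTC rather than local time are assumptions.

```javascript
// Build "song_title_YYYY-MM-DD_HH-MM-SS_MASTERED.wav" from a title and a Date.
// UTC is used so the name does not depend on the server's timezone
// (an assumption; switch to the local-time getters if you prefer).
function masteredFilename(songTitle, date = new Date()) {
  const pad = (n) => String(n).padStart(2, "0");
  const slug = songTitle
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_") // collapse spaces and punctuation
    .replace(/^_+|_+$/g, ""); // trim leading/trailing underscores
  const day = `${date.getUTCFullYear()}-${pad(date.getUTCMonth() + 1)}-${pad(date.getUTCDate())}`;
  const time = `${pad(date.getUTCHours())}-${pad(date.getUTCMinutes())}-${pad(date.getUTCSeconds())}`;
  return `${slug}_${day}_${time}_MASTERED.wav`;
}

console.log(masteredFilename("Paper Boats", new Date(Date.UTC(2024, 2, 9, 7, 5, 30))));
// paper_boats_2024-03-09_07-05-30_MASTERED.wav
```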

Send yourself an email notification with a download link or playback preview. This confirms the workflow completed successfully and allows you to download the track immediately.

The Manual Alternative

If you want more control over artistic decisions, skip the full automation and use each tool independently.

Upload your lyric sheet to Bronze's web interface and choose instrumentation manually. Listen to the generated instrumental, adjust parameters if needed, and export. Then manually upload instrumental and lyrics to ElevenLabs, pick your voice, tweak timing and pitch, and export the vocal. Combine both files in your DAW of choice, mix to taste, export, and upload to Landr for mastering. This approach takes 1–2 hours per song but gives you veto power at each stage.

Pro Tips

Rate Limits and Backoff Strategy

ElevenLabs enforces rate limits based on your subscription tier. Free accounts are capped at 10,000 characters per month; paid tiers offer higher limits. In n8n, configure exponential backoff when calling ElevenLabs: if a request fails with a 429 status code, wait 30 seconds and retry, doubling the wait on each further failure, and abandon the workflow after three retries.

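javascript
// Backoff sketch for an n8n Function node.
// makeAPICall is a placeholder for the function that sends the ElevenLabs request.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function callWithBackoff(makeAPICall, maxRetries = 3, baseDelayMs = 30000) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await makeAPICall();
    } catch (error) {
      // Retry only on rate limiting, and rethrow once the retries are used up
      if (error.status !== 429 || attempt >= maxRetries) throw error;
      await sleep(baseDelayMs * 2 ** attempt); // waits 30 s, then 60 s, then 120 s
    }
  }
}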

Cost Optimisation

Run your workflow during off-peak hours. Many AI services, particularly Landr, may process jobs faster and more cheaply during low-traffic periods (late evening or early morning in your timezone). Schedule n8n to trigger workflows at 2 AM rather than during business hours.

Voice Selection and Consistency

ElevenLabs offers multiple voice options. For an album, pick one voice and stick with it. Store the voice_id as a variable in your n8n workflow so you don't need to reconfigure each song. This creates sonic cohesion across your release.

Error Handling for File Size

Some tools enforce file size limits. Bronze-generated instrumentals are typically 5–20 MB depending on duration and quality. ElevenLabs vocal synthesis can produce 10–50 MB files depending on lyric length. Before sending to Landr, verify file sizes. If a file exceeds Landr's 100 MB limit, compress it using ffmpeg.
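The size check itself is a one-line comparison, sketched below for a list of files whose sizes you already know. The helper name filesOverLimit and the reading of "100 MB" as 100 × 1024 × 1024 bytes are assumptions; how you obtain each file's size depends on where the audio is stored.

```javascript
// 100 MB limit as stated above, interpreted as mebibytes (an assumption).
const LANDR_LIMIT_BYTES = 100 * 1024 * 1024;

// Return the names of the files that would need compressing before upload.
// Each entry is { name, sizeBytes }.
function filesOverLimit(files, limitBytes = LANDR_LIMIT_BYTES) {
  return files.filter((file) => file.sizeBytes > limitBytes).map((file) => file.name);
}

const oversized = filesOverLimit([
  { name: "instrumental.wav", sizeBytes: 20 * 1024 * 1024 },
  { name: "mixed.wav", sizeBytes: 150 * 1024 * 1024 },
]);
console.log(oversized); // [ 'mixed.wav' ]
```

An empty result means the workflow can proceed straight to mastering; otherwise route the listed files through a compression step first.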

Quality Testing with Short Samples

Before running a full song through the pipeline, test with a 30-second sample. This validates that your API keys work, file formats are compatible, and output quality meets your standards. Once you're confident, scale to full-length tracks.

Cost Breakdown

Tool	Plan Needed	Monthly Cost	Notes
n8n	Cloud Pro	£20	Handles automation logic; supports file triggers and multiple API calls. Self-hosted option available for £0 if you run your own server.
Bronze	Creator	£30–50	Generates copyright-free instrumentals. Per-track pricing also available.
ElevenLabs	Starter	£11	100,000 characters per month. Scale to Pro (£99) if producing 10+ songs monthly.
Landr	Advanced	£12.99	Includes mastering and mixing. Cheaper if billed annually.
Cloud Storage	Google Drive or Dropbox Free	£0	Free tier sufficient for personal music library. Upgrade if storing 50+ full albums.
Total		£74–94	Recurring monthly cost; rises only if higher output pushes you onto a larger plan.