Create music production pipeline from lyrics to mastered track with AI vocals

You're sitting on a finished song. Lyrics written, melody locked in. But to get from voice memo to radio-ready track, you're facing three problems: you can't afford session musicians, a professional vocalist is out of budget, and mixing plus mastering costs more than your rent.

The barrier to entry for independent music production has never been lower in terms of raw capability, yet the workflow friction remains surprisingly high. Most musicians cobble together five different tools, manually bouncing files between them, waiting for renders, and fixing errors by hand. A song that should take two hours takes two days.

What if the entire chain, from your lyrics through to a fully mastered track, ran without you touching a single file? What if you could submit lyrics at breakfast and wake up to a finished production?

This is possible now. You'll wire together a lyric-to-mastered-audio pipeline using Bronze for composition and arrangement, ElevenLabs for convincing vocal synthesis, and Landr for professional mastering. The orchestration layer will be n8n, which handles the large audio file transfers that Zapier struggles with and is more cost-efficient than Make for this workload.

The Automated Workflow

Start by setting up a dedicated Supabase or Firebase database to store your project metadata. This becomes the single source of truth for your pipeline. Every record will hold the song title, lyrics, genre, vocal style preference, and project status. Your workflow triggers when you create a new row in this database. Here's the sequence:

1. n8n polls your database for new songs in "pending" status.
2. Bronze receives the lyrics and generates a full instrumental arrangement (you'll specify the genre and BPM in the database record).
3. ElevenLabs converts the vocal line (extracted from Bronze or submitted separately) into synthesised audio using your chosen voice preset.
4. The instrumental and vocal stems are mixed to a single file.
5. Landr receives the mixed track and returns a mastered version.
6. The database record updates with download links and final status.

Let's build this in n8n, which you can self-host or run on their cloud platform.

Setting up the database trigger

Create a table called songs with these columns:

id (uuid, primary key)
title (text)
lyrics (text)
genre (text, e.g. "indie pop", "lo-fi hip hop")
bpm (integer)
vocal_style (text, e.g. "breathy", "deep", "energetic")
status (text, enum: "pending", "arranging", "voicing", "mixing", "mastering", "complete", "error")
bronze_arrangement_url (text, nullable)
elevenlabs_vocal_url (text, nullable)
landr_master_url (text, nullable)
created_at (timestamp)
updated_at (timestamp)

In n8n, start with a Schedule Trigger node set to run every 5 minutes, followed by a database node for your store (Supabase or Postgres, for example) that selects rows from songs WHERE status = 'pending'.
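The status column effectively describes a small state machine. As a minimal sketch of that idea (the Song class, the next_status helper and the inclusion of an "error" state are illustrative assumptions, not part of any tool's API), the record and its stage order might be modelled like this:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(str, Enum):
    PENDING = "pending"
    ARRANGING = "arranging"
    VOICING = "voicing"
    MIXING = "mixing"
    MASTERING = "mastering"
    COMPLETE = "complete"
    ERROR = "error"  # assumption: a failure state outside the normal order


# Normal forward order of the pipeline; ERROR is deliberately excluded.
PIPELINE_ORDER = [s for s in Status if s is not Status.ERROR]


@dataclass
class Song:
    id: str
    title: str
    lyrics: str
    genre: str
    bpm: int
    vocal_style: str
    status: Status = Status.PENDING
    landr_master_url: Optional[str] = None


def next_status(current: Status) -> Status:
    """Return the stage after `current`; COMPLETE and ERROR are terminal."""
    if current in (Status.COMPLETE, Status.ERROR):
        return current
    return PIPELINE_ORDER[PIPELINE_ORDER.index(current) + 1]


song = Song("1", "Morning Light", "When the morning light...", "indie pop", 96, "breathy")
print(song.status.value, "->", next_status(song.status).value)
```

Keeping the order in one list means each workflow branch only needs to ask for the next stage rather than hard-coding status strings.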

Step 1: Call Bronze's arrangement API

Bronze doesn't expose a public REST API yet, but some instances support webhook submissions. If you're using Bronze self-hosted or have API access through a partnership, you'll hit their arrangement endpoint:

POST https://api.bronze.example/v1/arrangements
Content-Type: application/json

{
  "lyrics": "Verse 1: When the morning light breaks through...",
  "genre": "indie pop",
  "bpm": 96,
  "instrumentation": ["guitar", "drums", "bass", "synth"],
  "duration_seconds": 240,
  "webhook_url": "https://your-n8n-instance.com/webhook/bronze-complete"
}

Store the arrangement request ID in your database. Update status to "arranging". Bronze will call your webhook when the arrangement is ready. The webhook payload includes a download URL for the instrumental stem. Update your database record with this URL.
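The request and webhook handling above can be sketched in a few lines. This is only an illustration: the payload fields mirror the example request, while the download_url field name in the webhook payload and both helper functions are hypothetical, since Bronze's actual schema isn't documented here.

```python
import json


def build_arrangement_request(song: dict, webhook_url: str) -> str:
    """Serialise the arrangement request body from a song record."""
    payload = {
        "lyrics": song["lyrics"],
        "genre": song["genre"],
        "bpm": song["bpm"],
        "webhook_url": webhook_url,
    }
    return json.dumps(payload)


def handle_bronze_webhook(record: dict, payload: dict) -> dict:
    """Return a copy of the record updated from a completion webhook.

    Assumes the webhook carries the stem location under "download_url".
    """
    updated = dict(record)
    updated["bronze_arrangement_url"] = payload["download_url"]
    updated["status"] = "voicing"  # hand over to the vocal synthesis stage
    return updated


record = {"lyrics": "When the morning light...", "genre": "indie pop", "bpm": 96,
          "status": "arranging"}
body = build_arrangement_request(record, "https://your-n8n-instance.com/webhook/bronze-complete")
done = handle_bronze_webhook(record, {"download_url": "https://example.com/stem.wav"})
print(done["status"])
```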

Step 2: Prepare vocal synthesis request

Once Bronze finishes, trigger ElevenLabs. ElevenLabs requires a voice ID (you pick from their preset voices or clone a custom voice) and the text to synthesise. The vocal line comes from either Bronze's lyrical breakdown or a separate field you populated. The ElevenLabs API endpoint is:

POST https://api.elevenlabs.io/v1/text-to-speech/{voice_id}
Content-Type: application/json

{
  "text": "When the morning light breaks through the window pane",
  "model_id": "eleven_monolingual_v1",
  "voice_settings": {
    "stability": 0.75,
    "similarity_boost": 0.85
  }
}

Get your API key from ElevenLabs and store it in n8n's credential manager. In your n8n workflow, use an HTTP Request node:

Method: POST
URL: https://api.elevenlabs.io/v1/text-to-speech/{{voiceId}}
Headers:
  xi-api-key: {{elevenLabsApiKey}}
  Content-Type: application/json
Body:
{
  "text": "{{songLyrics}}",
  "model_id": "eleven_monolingual_v1",
  "voice_settings": {
    "stability": 0.75,
    "similarity_boost": 0.85
  }
}

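The response body is the audio itself, returned as a binary stream. ElevenLabs returns MP3 by default, so convert it to WAV (FFmpeg handles this) if your mixing step expects one. Use n8n's Write Binary File node to save the file, then upload it to your cloud storage (S3, Google Cloud Storage, etc). Update the database with the URL.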
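If you'd rather test the call outside n8n first, the same request can be assembled with Python's standard library. This is a sketch under the assumptions already shown in the request above (endpoint, xi-api-key header, body fields); the function names and the 120-second timeout are illustrative choices.

```python
import json
import urllib.request


def build_tts_request(voice_id: str, api_key: str, text: str,
                      stability: float = 0.75,
                      similarity_boost: float = 0.85) -> urllib.request.Request:
    """Build (but do not send) the text-to-speech request."""
    body = {
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        url=f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesise_to_file(request: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes unchanged."""
    with urllib.request.urlopen(request, timeout=120) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)


req = build_tts_request("your-voice-id", "your-api-key", "When the morning light breaks through")
print(req.get_method(), req.full_url)
```

Separating request construction from sending makes it easy to inspect the payload without spending any of your character quota.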

Step 3: Mix instrumental and vocal

This step requires some care. You need to download both stems, align them, and apply basic mixing. FFmpeg is your friend here. Use n8n's Execute Command node to run:

ffmpeg -i instrumental.wav -i vocal.wav \
  -filter_complex "[0]volume=0.8[inst];[1]volume=1.0[vox];[inst][vox]amix=inputs=2:duration=longest[out]" \
  -map "[out]" mixed.wav

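The -filter_complex applies a slight volume reduction to the instrumental (0.8 = 80% volume) to let the vocal sit on top. Adjust these values based on your mix preference. Bear in mind that amix scales its inputs down by default to avoid clipping, so the mixed file will come out quieter than either stem. Once mixed, upload the result to your storage bucket.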
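When the gain values come from the database rather than being hard-coded, it can help to assemble the FFmpeg arguments programmatically. The sketch below builds the same filter graph as the command above; the function names are illustrative, and the -y flag (overwrite the output file without prompting) is an addition for unattended runs.

```python
import subprocess
from typing import List


def build_mix_command(instrumental: str, vocal: str, output: str,
                      inst_gain: float = 0.8, vox_gain: float = 1.0) -> List[str]:
    """Return the ffmpeg argument list that mixes two stems into one file."""
    filter_graph = (
        f"[0]volume={inst_gain}[inst];"
        f"[1]volume={vox_gain}[vox];"
        "[inst][vox]amix=inputs=2:duration=longest[out]"
    )
    return [
        "ffmpeg", "-y",          # -y: overwrite output without asking
        "-i", instrumental,
        "-i", vocal,
        "-filter_complex", filter_graph,
        "-map", "[out]",
        output,
    ]


def mix(instrumental: str, vocal: str, output: str) -> None:
    """Run the mix; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_mix_command(instrumental, vocal, output), check=True)


cmd = build_mix_command("instrumental.wav", "vocal.wav", "mixed.wav")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with the brackets and semicolons in the filter graph.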

Step 4: Send to Landr for mastering

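Landr's API accepts audio files and returns a mastered version. You'll need a Landr account with API access; treat the endpoint and field names below as illustrative and confirm them against the API documentation you're given: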

POST https://api.landr.com/v1/masters
Authorization: Bearer {{landrApiToken}}
Content-Type: application/json

{
  "audio_url": "https://your-bucket.s3.amazonaws.com/mixed_123.wav",
  "genre": "indie pop",
  "loudness_target": -14.0,
  "format": "wav"
}

Landr processes this asynchronously. The response includes a master_id. Poll the status endpoint every 30 seconds until the master is ready:

GET https://api.landr.com/v1/masters/{{masterId}}
Authorization: Bearer {{landrApiToken}}

When status changes to "complete", the response includes a download URL for your mastered track.
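The polling logic is the same whichever client you use. Here is a minimal sketch, assuming the status response is a JSON object with a "status" field as described above; the "failed" status value, the attempt cap and the injected fetch_status callable are assumptions made for illustration.

```python
import time
from typing import Callable


def poll_until_complete(fetch_status: Callable[[], dict],
                        interval_seconds: float = 30.0,
                        max_attempts: int = 60,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until it reports "complete" or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch_status()
        status = result.get("status")
        if status == "complete":
            return result
        if status == "failed":  # assumed failure value, not confirmed
            raise RuntimeError("mastering job reported failure")
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next check
    raise TimeoutError(f"job not complete after {max_attempts} checks")


# Demonstration with canned responses instead of a real HTTP call.
responses = iter([{"status": "processing"},
                  {"status": "complete", "download_url": "https://example.com/master.wav"}])
final = poll_until_complete(lambda: next(responses), sleep=lambda _seconds: None)
print(final["download_url"])
```

With the defaults, 60 checks at 30-second intervals gives the job roughly half an hour before the workflow gives up.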

Step 5: Update database and notify

Once Landr returns the mastered file, update your database record with the final URL and set status to "complete". Send an email notification (use n8n's Send Email node) with download links. Here's a simplified n8n workflow structure:

1. Supabase Database (poll for pending songs)
2. Bronze API (HTTP Request) → wait for webhook
3. ElevenLabs (HTTP Request) → save audio
4. Execute Command (FFmpeg mix)
5. Landr (HTTP Request) → poll status
6. Supabase Database (update status to complete)
7. Send Email (notify user)

Connect these with conditional routing: if any step fails, set status to "error" and send an alert.
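The routing rule is easier to reason about in code form. The sketch below is not n8n configuration; it only models the behaviour described above, with each step represented by a hypothetical callable and the alert passed in as a function.

```python
from typing import Callable, Dict, List, Tuple

# Each step is (status to record while it runs, action to perform).
Step = Tuple[str, Callable[[Dict], None]]


def run_pipeline(song: Dict, steps: List[Step],
                 alert: Callable[[str], None]) -> Dict:
    """Run steps in order; on the first failure mark the song and alert."""
    for status, action in steps:
        song["status"] = status
        try:
            action(song)
        except Exception as exc:
            song["status"] = "error"
            alert(f"{song['title']}: step '{status}' failed: {exc}")
            return song  # stop here; later steps never run
    song["status"] = "complete"
    return song


def broken_step(song: Dict) -> None:
    raise RuntimeError("upstream timeout")


alerts: List[str] = []
ok = run_pipeline({"title": "Morning Light"},
                  [("arranging", lambda s: None), ("voicing", lambda s: None)],
                  alerts.append)
bad = run_pipeline({"title": "Night Drive"},
                   [("arranging", lambda s: None), ("voicing", broken_step)],
                   alerts.append)
print(ok["status"], bad["status"], alerts)
```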

The Manual Alternative

If you prefer control at each step, skip orchestration entirely. Use Bronze's web interface to arrange your song, then manually download the instrumental stem. Go to ElevenLabs directly, paste your lyrics, select your voice, and download the vocal. Import both into Audacity or Reaper for mixing and export to WAV. Upload the mix to Landr's web dashboard and download the master. This takes 45 minutes to an hour. It's viable if you're producing one or two songs per week. But if you're prolific, the pipeline saves hours.

Pro Tips

Manage rate limits carefully.

ElevenLabs allows 10,000 characters per month on the free tier. A typical song is 500-800 characters, so that's 12-20 songs before hitting a wall. For sustained output, you'll need a paid plan. Landr queues masters during peak hours, sometimes taking 20-30 minutes. Don't set your polling interval to less than 30 seconds or you'll waste API calls.
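The songs-per-month estimate is simple integer division, which you can rerun with your own lyric lengths or quota. The helper name is illustrative.

```python
def songs_per_quota(quota_chars: int, chars_per_song: int) -> int:
    """Whole songs that fit in a monthly character quota."""
    return quota_chars // chars_per_song


longest = songs_per_quota(10_000, 800)   # 800-character songs
shortest = songs_per_quota(10_000, 500)  # 500-character songs
print(f"{longest} to {shortest} songs per month")  # 12 to 20 songs per month
```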

Handle file storage efficiently.

Don't keep intermediate files in your database; store them in S3 or Google Cloud Storage and retain only URLs. Audio files are large (50-200 MB for uncompressed stems), and database queries slow down if you're storing blobs. Set an automated cleanup job to delete files older than 30 days.
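The cleanup job boils down to comparing each object's last-modified time against a cutoff. The sketch below covers only that selection step, assuming you already have a listing of (key, last-modified) pairs from your storage provider; the function name and sample keys are hypothetical, and the actual delete call depends on your provider's SDK.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple


def stale_keys(objects: Iterable[Tuple[str, datetime]],
               now: datetime,
               max_age_days: int = 30) -> List[str]:
    """Return the keys whose last-modified time is older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [key for key, modified in objects if modified < cutoff]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
listing = [
    ("stems/song_a_vocal.wav", datetime(2024, 5, 1, tzinfo=timezone.utc)),
    ("stems/song_b_vocal.wav", datetime(2024, 6, 20, tzinfo=timezone.utc)),
]
print(stale_keys(listing, now))
```

If your provider supports lifecycle rules on the bucket, those can achieve the same result without any scheduled job.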

Test with short clips first.

Before processing a full 3-minute song, test your pipeline with a 30-second demo. This catches configuration errors without burning through your ElevenLabs character limit or waiting for full-length Landr masters.

Add a quality gate.

Landr assigns a "target loudness" based on genre. Indie pop typically aims for -14 LUFS. If your mix is significantly quieter than that before mastering, Landr will struggle. Measure the mix's integrated loudness (use ffmpeg-normalize or a loudness meter) before it hits Landr. Have the workflow reject anything quieter than -18 LUFS and notify you to remix.
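One way to implement this gate, as an alternative to ffmpeg-normalize, is FFmpeg's loudnorm filter in measurement mode, which prints a JSON summary to stderr. The sketch below assumes that summary contains an input_i field (integrated loudness in LUFS, reported as a string); the function names are illustrative and the -18 threshold is the one suggested above.

```python
import json
import subprocess


def parse_integrated_loudness(ffmpeg_stderr: str) -> float:
    """Extract integrated loudness (LUFS) from loudnorm's JSON summary."""
    start = ffmpeg_stderr.rindex("{")    # the summary is the last JSON object
    end = ffmpeg_stderr.rindex("}") + 1
    stats = json.loads(ffmpeg_stderr[start:end])
    return float(stats["input_i"])


def passes_gate(lufs: float, minimum: float = -18.0) -> bool:
    """True when the mix is loud enough to send on for mastering."""
    return lufs >= minimum


def measure(path: str) -> float:
    """Run ffmpeg in analysis-only mode and return the measured loudness."""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-i", path,
         "-af", "loudnorm=print_format=json", "-f", "null", "-"],
        capture_output=True, text=True, check=True,
    )
    return parse_integrated_loudness(result.stderr)


# Parsing demonstrated on a captured-style snippet rather than a real file.
sample = '[Parsed_loudnorm_0 @ 0x1]\n{\n\t"input_i" : "-16.20",\n\t"input_tp" : "-1.00"\n}\n'
print(passes_gate(parse_integrated_loudness(sample)))
```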

Plan for Bronze delays.

Complex arrangements can take several minutes. If Bronze hasn't returned within 10 minutes, assume it's stuck and trigger a retry. Use an n8n error workflow as a dead-letter queue to catch failed Bronze requests and alert you.
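The timeout rule can be expressed as a small decision function that a scheduled check calls for each song still in "arranging". The 10-minute window comes from the tip above; the cap of three attempts and the action names are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def next_action(requested_at: datetime, now: datetime, attempts: int,
                timeout: timedelta = timedelta(minutes=10),
                max_attempts: int = 3) -> str:
    """Decide what to do with an arrangement request that has not returned."""
    if now - requested_at < timeout:
        return "wait"          # still inside the normal window
    if attempts < max_attempts:
        return "retry"         # assume the request is stuck and resubmit
    return "dead_letter"       # give up and hand over for manual attention


start = datetime(2024, 6, 1, 9, 0, tzinfo=timezone.utc)
print(next_action(start, start + timedelta(minutes=4), attempts=1))
print(next_action(start, start + timedelta(minutes=12), attempts=1))
print(next_action(start, start + timedelta(minutes=12), attempts=3))
```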

Cost Breakdown

Tool	Plan Needed	Monthly Cost	Notes
Bronze	Self-hosted or Partner API	£0–£50	Varies by deployment; self-hosting is free but requires compute.
ElevenLabs	Starter Plan	£11	10,000 characters per month; upgrade to Creator (£99) for 100,000 characters if you're prolific.
Landr	Studio Plan	£9.99	Unlimited masters; includes mixing tools and distribution.
n8n	Cloud Pro	£40–£80	Or free if self-hosted; cloud plan handles moderate workflows.
Cloud Storage (S3)	Standard	£5–£20	Depends on data stored and bandwidth. Budget 50–100 MB per song.
Total		£66–£171	Fully automated, production-ready workflow.