Music production pipeline: lyrics to mastered track with AI vocals

Most musicians waste months on production busywork. You write lyrics, record vocals, apply effects, master the mix, and somewhere between iteration three and iteration twelve you start to question whether this hobby is worth the time. The irony is that many of these steps (vocal synthesis, mastering, even foundational composition) can now be handled by AI tools working in tandem. The bottleneck isn't capability; it's knowing how to connect the dots.

This workflow takes you from a lyric sheet and a rough melody to a professionally mastered track with AI-generated vocals, all without manual handoffs between tools. You'll use Bronze to generate the musical foundation, ElevenLabs to synthesise natural-sounding vocals, and Landr to master the final mix. An orchestration layer ties them together so the output of one tool automatically feeds into the next.

What might normally take three weeks can now run overnight. The result won't replace a human artist, but it will replace the grind of repetitive production work. For demo creation, soundtrack work, or rapid prototyping of song ideas, this approach saves serious time.

The Automated Workflow

You'll need an orchestration platform to glue these tools together. Make (formerly Integromat) is the best choice here because it handles long-running workflows well and has solid support for webhook-based delays. Zapier and n8n both work, but Make's conditional logic and file handling make it cleaner for this use case.

Here's the flow: You submit a prompt describing your song (genre, mood, lyrics, desired structure) via a webhook. Bronze generates an instrumental or a base track, and the scenario polls the job until the audio is ready. Your lyrics are cleaned up and sent to ElevenLabs to generate vocals. Once the vocals are ready, the instrumental and vocal tracks are combined, then sent to Landr for mastering. The mastered file is returned to you via email or cloud storage.

Setting up the webhook trigger in Make

In Make, create a new scenario and add a Webhooks module (Custom webhook) as your trigger. Configure it to accept POST requests with a JSON payload like this:

json
{
  "lyrics": "Verse one here...",
  "genre": "synthwave",
  "mood": "melancholic",
  "vocal_style": "alto",
  "email": "user@example.com"
}

Save the webhook URL and keep it safe; this is where you'll submit new song requests.
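If you'd rather submit requests from a script than from a REST client, something like the following could work. It is a minimal sketch using only the Python standard library; the function name build_song_request and the list of required fields are assumptions based on the payload above, and the URL in the comment is a placeholder for your own webhook URL.

```python
import json
import urllib.request

# Fields taken from the example payload above; treating all of them as
# required is an assumption of this sketch.
REQUIRED_FIELDS = ("lyrics", "genre", "mood", "vocal_style", "email")


def build_song_request(webhook_url: str, payload: dict) -> urllib.request.Request:
    """Validate the payload and wrap it in a POST request for the Make webhook."""
    missing = [f for f in REQUIRED_FIELDS if not str(payload.get(f, "")).strip()]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (placeholder URL, performs a real network call):
#   request = build_song_request("https://hook.example.com/your-webhook-id", payload)
#   with urllib.request.urlopen(request, timeout=30) as response:
#       print(response.status)
```

Validating before sending means a half-filled request fails on your machine rather than halfway through the scenario.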

Step 1: Generate the instrumental with Bronze

Add a Bronze API call module. You'll need your Bronze API key in the request header.

POST https://api.bronze.ai/v1/generate
Headers:
  Authorization: Bearer YOUR_BRONZE_API_KEY
  Content-Type: application/json
Body:
{
  "prompt": "{{payload.genre}} instrumental track with {{payload.mood}} energy",
  "duration": 120,
  "style": "instrumental"
}

Bronze returns a job ID. Store this in a variable called bronze_job_id. The actual audio file won't be ready immediately, so you need a polling mechanism.

Step 2: Poll Bronze for completion

Add a Repeater module, paired with a delay between iterations, that checks the job status every 10 seconds for up to 3 minutes (18 attempts). Use this endpoint:

GET https://api.bronze.ai/v1/jobs/{{bronze_job_id}}
Headers: Authorization: Bearer YOUR_BRONZE_API_KEY

When the status changes to completed, extract the audio file URL and store it in a variable called bronze_audio_url. If polling times out, send yourself an alert email.
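The polling logic itself is easy to reason about outside Make. Here is a rough sketch of the same loop in Python, assuming the job response is a JSON object with a "status" field and an "audio_url" field; those field names and the "failed" status are guesses for illustration, not confirmed details of the Bronze API. The status fetcher is passed in as a function so the loop stays independent of any HTTP client.

```python
import time
from typing import Callable, Optional


def poll_until_complete(
    fetch_status: Callable[[], dict],
    interval_s: float = 10.0,
    timeout_s: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the audio URL once the job completes, or None if polling times out."""
    attempts = int(timeout_s // interval_s)  # 180 s / 10 s = 18 checks
    for _ in range(attempts):
        job = fetch_status()
        status = job.get("status")
        if status == "completed":
            return job.get("audio_url")  # hypothetical field name
        if status == "failed":  # hypothetical status value
            raise RuntimeError("job reported failure")
        sleep(interval_s)
    return None  # caller should send the alert email at this point
```

Returning None on timeout rather than raising keeps the "send yourself an alert" branch separate from genuine job failures.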

Step 3: Prepare lyrics for ElevenLabs

ElevenLabs works best with clean, properly formatted text. Add a text formatter module that cleans up line breaks and adds natural pauses. You might want to split the lyrics into chunks if they're very long, since ElevenLabs has a character limit per request. Create a variable called formatted_lyrics with the cleaned-up text.
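If you do the cleanup in code rather than in Make's text tools, it might look like this sketch. The two function names are made up for illustration, the trailing comma is just one way to nudge the voice into pausing at line ends, and the 2,500-character default is a placeholder rather than ElevenLabs' actual limit, so check the limit for your plan.

```python
import re


def format_lyrics(raw: str) -> str:
    """Collapse stray whitespace, drop blank lines, and end each line with punctuation."""
    formatted = []
    for line in raw.splitlines():
        line = re.sub(r"\s+", " ", line).strip()
        if not line:
            continue
        if line[-1] not in ".,!?;:":
            line += ","  # a comma encourages a short pause at the line break
        formatted.append(line)
    return "\n".join(formatted)


def chunk_lyrics(text: str, max_chars: int = 2500) -> list[str]:
    """Group whole lines into chunks of at most max_chars characters.

    A single line longer than max_chars is kept intact as its own chunk.
    """
    chunks, current = [], ""
    for line in text.splitlines():
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on line boundaries avoids cutting a phrase in half, which would otherwise produce an audible seam when the vocal chunks are joined.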

Step 4: Synthesise vocals with ElevenLabs

Add an ElevenLabs API module. You'll use the text-to-speech endpoint with a specific voice ID. First, choose your voice from ElevenLabs' voice library (or create a custom one trained on a reference recording). Store the voice ID in a Make variable.

POST https://api.elevenlabs.io/v1/text-to-speech/{{voice_id}}
Headers:
  xi-api-key: YOUR_ELEVENLABS_API_KEY
  Content-Type: application/json
Body:
{
  "text": "{{formatted_lyrics}}",
  "model_id": "eleven_monolingual_v1",
  "voice_settings": {
    "stability": 0.75,
    "similarity_boost": 0.85
  }
}

This returns audio data. Store the file in a temporary location or as a base64 string in a Make variable called vocal_audio.
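For reference, the same call can be assembled in a few lines of Python. This is a sketch that mirrors the request above (same endpoint, header, and settings) and adds the base64 step; the helper names build_tts_request and audio_to_base64 are illustrative, and nothing is sent until you pass the request to urllib.request.urlopen.

```python
import base64
import json
import urllib.request


def build_tts_request(api_key: str, voice_id: str, text: str,
                      model_id: str = "eleven_monolingual_v1") -> urllib.request.Request:
    """Build the text-to-speech POST request described above (does not send it)."""
    body = {
        "text": text,
        "model_id": model_id,
        "voice_settings": {"stability": 0.75, "similarity_boost": 0.85},
    }
    return urllib.request.Request(
        f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def audio_to_base64(audio_bytes: bytes) -> str:
    """Encode raw audio bytes as an ASCII base64 string for the vocal_audio variable."""
    return base64.b64encode(audio_bytes).decode("ascii")


# Usage (performs a real, billable API call):
#   with urllib.request.urlopen(build_tts_request(key, voice, lyrics)) as resp:
#       vocal_audio = audio_to_base64(resp.read())
```

Keep in mind that base64 inflates the payload by roughly a third, so a temporary file is usually the better option for longer songs.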

Step 5: Combine instrumental and vocals

This is where you need a helper. Make doesn't have native audio mixing, so you have two options: use FFmpeg via a webhook to a serverless function (like AWS Lambda or Google Cloud Functions), or use a dedicated audio API like Soundtrap or Descript's API. For simplicity, assume you have a small Lambda function that accepts the Bronze audio URL and vocal audio as inputs and returns a mixed file. Call it like this:

POST https://your-lambda-function-url/mix-audio
Headers:
  Content-Type: application/json
Body:
{
  "instrumental_url": "{{bronze_audio_url}}",
  "vocals_base64": "{{vocal_audio}}",
  "output_format": "wav"
}

Store the result in mixed_audio_url.
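The core of such a mixing function can be very small. Below is a sketch of the FFmpeg part only, assuming both tracks have already been written to local files and that an ffmpeg binary is available in the function's runtime (Lambda does not ship one by default). The function names and the vocal_gain parameter are illustrative; the download, base64 decoding, and upload steps are left out.

```python
import subprocess


def build_mix_command(instrumental_path: str, vocals_path: str,
                      output_path: str, vocal_gain: float = 1.0) -> list[str]:
    """Build an ffmpeg command that overlays the vocals on the instrumental."""
    filter_graph = (
        f"[1:a]volume={vocal_gain}[vox];"                    # adjust vocal level
        "[0:a][vox]amix=inputs=2:duration=longest[mix]"      # sum both tracks
    )
    return [
        "ffmpeg", "-y",
        "-i", instrumental_path,
        "-i", vocals_path,
        "-filter_complex", filter_graph,
        "-map", "[mix]",
        output_path,
    ]


def mix_audio(instrumental_path: str, vocals_path: str, output_path: str) -> None:
    """Run the mix; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_mix_command(instrumental_path, vocals_path, output_path),
                   check=True)
```

This simply layers the two files from their start points. It makes no attempt to align the vocals to the beat, so expect to tune timing separately if that matters for your track.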

Step 6: Master with Landr

Landr's API accepts audio files and returns mastered versions. Submit your mixed audio:

POST https://www.landr.com/api/v1/upload
Headers:
  Authorization: Bearer YOUR_LANDR_API_KEY
  Content-Type: multipart/form-data
Body (form fields):
  file: the audio file downloaded from {{mixed_audio_url}}
  preset: balanced
  stem_separation: false

Landr returns a mastering job ID; store it in a variable called landr_job_id. Again, polling is needed since mastering takes time (typically 5–15 minutes depending on the queue).

GET https://www.landr.com/api/v1/mastering/{{landr_job_id}}
Headers: Authorization: Bearer YOUR_LANDR_API_KEY

When status is finished, extract the mastered_audio_url.

Step 7: Deliver the final track

Add a final module that sends the mastered audio file via email or uploads it to cloud storage (Google Drive, Dropbox, etc.). Make has native modules for both. Include a summary: the original lyrics, generation timestamps, and model versions used.

Email template:
Subject: Your track is ready!
Body: Your mastered track is attached. Generated {{now}} using Bronze, ElevenLabs (eleven_monolingual_v1), and Landr.
Attachment: {{mastered_audio_url}}

The Manual Alternative

If you prefer more creative control or want to tweak things mid-process, build the workflow in stages. Generate the Bronze instrumental, download it, and layer vocals manually in a DAW (Digital Audio Workstation) like Ableton or Logic Pro. Use ElevenLabs to generate vocals separately and adjust timing and effects by hand. Finally, upload the mixed file to Landr. This approach takes longer but gives you more flexibility to adjust tone, timing, and the balance between instrumental and vocals. It's worth doing this way if you're working on something where your artistic instinct matters more than speed.

Pro Tips

Handling API rate limits.

Bronze, ElevenLabs, and Landr all have rate limits. Bronze typically allows 10 jobs per minute; ElevenLabs allows 100 requests per minute on the free tier and more on paid plans. Add deliberate delays between API calls in Make using the "Sleep" module to avoid hitting limits. Space out requests by 500 ms to 1 second.
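If any part of the pipeline runs in your own code (the mixing function, for instance), the same spacing idea can be expressed as a tiny throttle. This is a sketch, not a full rate limiter: the Throttle class is hypothetical, and the clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Ensure at least min_interval_s seconds pass between consecutive calls."""

    def __init__(self, min_interval_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Block until the minimum interval since the previous call has elapsed."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self._min_interval_s - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# Usage: call throttle.wait() immediately before each API request.
```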

Error handling and retries.

Add conditional logic to catch failures. If Bronze's job fails, send an alert and pause the workflow. If ElevenLabs times out, retry the request once before failing. In Make, attach an error handler route to each API module to catch errors and log them to a spreadsheet for debugging.
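The "retry once, then fail" rule is simple enough to show in a few lines. Here is a sketch in Python, where call_with_retry and its on_error logging hook are illustrative names; it retries only on timeouts, matching the ElevenLabs case above, and lets every other error surface immediately.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    action: Callable[[], T],
    retries: int = 1,
    on_error: Optional[Callable[[int, Exception], None]] = None,
) -> T:
    """Run action, retrying up to `retries` times when it raises TimeoutError."""
    attempt = 0
    while True:
        try:
            return action()
        except TimeoutError as exc:
            if on_error is not None:
                on_error(attempt, exc)  # e.g. append a row to your debug log
            if attempt >= retries:
                raise  # out of retries: let the caller send the alert
            attempt += 1
```

Logging through a callback keeps the retry logic separate from wherever you choose to record failures.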

Cost optimisation.

Run this workflow during off-peak hours (Make submits jobs to Landr's queue, which can be slower but cheaper at night). Batch multiple songs if possible; you'll make fewer API calls than running them one by one. Store mastered files in cloud storage rather than emailing them if they're large, since email attachments incur bandwidth costs.

Voice selection and consistency.

ElevenLabs' voice selection affects the final character of your track. Test two or three voice IDs before committing to a full workflow. Document which voice ID works best for each genre or mood.

Policing file sizes.

Landr accepts files up to 200 MB. If your mixed audio is larger (unlikely, but possible with uncompressed WAV), compress it to MP3 at 320 kbps before sending. This also reduces processing time.
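To see why hitting the limit is unlikely, you can estimate uncompressed WAV size from the audio parameters: bytes = duration × sample rate × bytes per sample × channels. The sketch below applies that formula, ignoring the small file header, and assumes the 200 MB figure quoted above means 200 × 10^6 bytes.

```python
LIMIT_BYTES = 200 * 1000 * 1000  # the 200 MB figure quoted above (assumed decimal MB)


def wav_size_bytes(duration_s: float, sample_rate: int = 44_100,
                   bit_depth: int = 16, channels: int = 2) -> int:
    """Estimate uncompressed PCM WAV size in bytes (file header ignored)."""
    return int(duration_s * sample_rate * (bit_depth // 8) * channels)


def needs_compression(size_bytes: int, limit_bytes: int = LIMIT_BYTES) -> bool:
    """True when the file exceeds the upload limit."""
    return size_bytes > limit_bytes


# A 120-second track at CD quality (44.1 kHz, 16-bit, stereo):
#   wav_size_bytes(120) == 21_168_000  (about 21 MB, well under the limit)
# A 10-minute track at 96 kHz, 24-bit, stereo:
#   wav_size_bytes(600, 96_000, 24) == 345_600_000  (over the limit)
```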

Cost Breakdown

Tool	Plan Needed	Monthly Cost	Notes
Bronze	Pro	£15–£25	50–100 generations per month; depends on track length
ElevenLabs	Creator	£7	50,000 characters per month; enough for roughly 10 songs with vocals
Landr	Standard	£9	Unlimited masters; includes stem separation if needed
Make	Free/Pro	£0–£10	Free tier covers ~100 operations; Pro adds faster execution
AWS Lambda (mixing)	Pay-as-you-go	£0–£5	~£0.02 per song if using FFmpeg; can be free under 1 million requests per month
Total	n/a	£31–£56	Covers roughly 10 full productions per month