LLMtary

Local LLM Red-Teaming Tool

Free plan available
No credit card

What is LLMtary?

LLMtary is a red-teaming tool designed to test and identify vulnerabilities in large language models running locally on your machine. Red-teaming involves simulating adversarial attacks and edge cases to uncover weaknesses, biases, and unsafe behaviours in AI systems before deployment. The tool is aimed at researchers, developers, and security teams who want to evaluate their locally hosted LLMs without sending data to external services. By keeping everything on your own infrastructure, you maintain privacy and control over sensitive model testing. LLMtary provides a structured approach to probing model responses, documenting failures, and iterating on safety improvements.

Key features

Local model testing

Run red-teaming exercises against LLMs hosted on your own hardware

Adversarial prompt generation

Create and execute test prompts designed to expose model weaknesses

Response analysis

Evaluate model outputs for harmful content, inconsistencies, or undesired behaviour

Result logging

Document test cases and outcomes for audit trails and improvement tracking

Privacy-focused

All testing remains on your infrastructure; no data leaves your environment

Pros & cons

Advantages

Complete data privacy; no external API calls or cloud data transmission during testing
Cost-effective for ongoing security testing without per-request fees
Helps identify safety gaps before deploying models to production or end users
Suitable for iterative development workflows where frequent testing is needed

Limitations

Requires sufficient local compute resources to run LLMs, which can be expensive or time-consuming
Limited to testing models you can host locally; not suitable for evaluating commercial closed-source models
Effectiveness depends on the quality and breadth of test prompts you create or configure