What is LLMtary?

LLMtary is a red-teaming tool designed to test and identify vulnerabilities in large language models running locally on your machine. Red-teaming involves simulating adversarial attacks and edge cases to uncover weaknesses, biases, and unsafe behaviours in AI systems before deployment. The tool is aimed at researchers, developers, and security teams who want to evaluate their locally hosted LLMs without sending data to external services. By keeping everything on your own infrastructure, you maintain privacy and control over sensitive model testing. LLMtary provides a structured approach to probing model responses, documenting failures, and iterating on safety improvements.

Key Features

Local model testing

Run red-teaming exercises against LLMs hosted on your own hardware

Adversarial prompt generation

Create and execute test prompts designed to expose model weaknesses

Response analysis

Evaluate model outputs for harmful content, inconsistencies, or undesired behaviour

Result logging

Document test cases and outcomes for audit trails and improvement tracking

Privacy-focused

All testing remains on your infrastructure; no data leaves your environment

Pros & Cons

Advantages

  • Complete data privacy; no external API calls or cloud data transmission during testing
  • Cost-effective for ongoing security testing without per-request fees
  • Helps identify safety gaps before deploying models to production or end users
  • Suitable for iterative development workflows where frequent testing is needed

Limitations

  • Requires sufficient local compute resources to run LLMs, which can be expensive or time-consuming
  • Limited to testing models you can host locally; not suitable for evaluating commercial closed-source models
  • Effectiveness depends on the quality and breadth of test prompts you create or configure

Use Cases

Safety validation before deploying a fine-tuned LLM to a production environment

Academic research into model robustness and adversarial resilience

Internal security audits of proprietary language models

Iterative model improvement by identifying and addressing failure modes

Compliance documentation for regulated industries requiring model evaluation records