Phoenix

Open-source tool for ML observability that runs in your notebook environment, by Arize. Monitor and fine-tune LLM, CV, and tabular models.



What is Phoenix?

Arize Phoenix is an open-source ML observability and LLM tracing platform for data scientists and ML engineers who need real-time visibility into their AI applications. It lets users instrument, experiment with, and optimise large language models, computer vision models, and tabular machine learning models directly within notebook environments such as Jupyter. Phoenix takes a vendor-agnostic, framework-neutral approach to monitoring AI systems, so teams of any size can adopt it without being tied to a particular provider or ML stack. The tool provides end-to-end visibility from model inputs through inference outputs, allowing users to identify performance issues, debug model behaviour, and evaluate LLM responses before deployment.

Key Features

LLM Tracing and Instrumentation

Capture detailed traces of LLM calls and interactions to understand model behaviour and identify bottlenecks

Notebook-based Evaluation

Run evaluations directly in Jupyter notebooks for smooth integration into existing ML workflows

Multi-model Support

Monitor and optimise LLMs, computer vision models, and tabular machine learning models from a single platform

Real-time Monitoring

Track model performance and behaviour in real time with interactive dashboards and visualisations

Framework Agnostic

Works with any ML framework or LLM provider without requiring vendor-specific implementations

Experiment Tracking

Compare model versions and experiments to identify the best-performing configurations
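
To make the tracing feature above more concrete, here is a minimal, standard-library-only sketch of the underlying idea: wrap each model call, record its inputs, output, errors, and latency, then look for the slowest call. This is not Phoenix's API. The names `Span`, `TRACE`, `traced`, and `fake_llm` are hypothetical and exist only for illustration.

```python
import time
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable, List, Optional


@dataclass
class Span:
    """One recorded call: name, inputs, output, error, and latency."""
    name: str
    inputs: dict
    output: Any = None
    error: Optional[str] = None
    latency_ms: float = 0.0


# Hypothetical in-memory trace store; a real tool would export spans elsewhere.
TRACE: List[Span] = []


def traced(fn: Callable) -> Callable:
    """Decorator that appends a Span to TRACE for every call to fn."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        span = Span(name=fn.__name__, inputs={"args": args, "kwargs": kwargs})
        start = time.perf_counter()
        try:
            span.output = fn(*args, **kwargs)
            return span.output
        except Exception as exc:
            span.error = repr(exc)  # keep the failure visible in the trace
            raise
        finally:
            span.latency_ms = (time.perf_counter() - start) * 1000
            TRACE.append(span)
    return wrapper


@traced
def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: no network access here)."""
    return prompt.upper()


fake_llm("hello")

# The slowest span is a first candidate when hunting for bottlenecks.
slowest = max(TRACE, key=lambda s: s.latency_ms)
```

A decorator keeps the instrumentation separate from the application code, which is the same separation an observability tool aims for; a production setup would typically send spans to a collector rather than hold them in a list.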

Pros & Cons

Advantages

Completely open-source with no vendor lock-in, giving teams full control and transparency
Runs directly in notebook environments, reducing setup friction and fitting naturally into existing workflows
Framework-agnostic design supports diverse ML stacks and LLM providers
Purpose-built for LLM observability with specialised features beyond general ML monitoring
Active community and backing by Arize, a recognised leader in ML observability

Limitations

As an open-source tool, support and documentation may be less thorough than commercial alternatives
Requires self-hosting or management of infrastructure for production deployments at scale
May require more technical configuration compared to fully managed SaaS observability platforms

Use Cases

Debugging and tracing LLM application issues during development and testing phases

Evaluating and comparing different LLM models or prompts before production deployment

Monitoring model performance and detecting data drift in real time for production systems

Optimising computer vision and tabular model performance through detailed performance analysis

Educational use for learning about ML observability and model behaviour analysis
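
As an illustration of the data drift use case listed above, the sketch below computes the population stability index (PSI), one common drift statistic, between a reference sample and a current sample of a single numeric feature. This is a generic example and not necessarily the metric Phoenix computes. The function name `psi`, the bin count, and the smoothing constant are assumptions chosen for the example, and both samples are assumed to be non-empty.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population stability index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        eps = 1e-6  # floor so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    # Every term is non-negative, so the sum is zero only for identical bin proportions.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in reference]

no_drift = psi(reference, reference)
drift = psi(reference, shifted)
```

Here `no_drift` is zero because the two samples are identical, while `drift` is positive because half of the shifted values fall beyond the reference range. Any threshold used to flag drift is a convention that should be tuned per feature.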