Cebra

CEBRA is a library designed to estimate Consistent EmBeddings of high-dimensional Recordings utilizing Auxiliary variables. It leverages self-supervised learning algorithms implemented with PyTorch to learn low-dimensional representations from unlabelled data.

Python library (cross-platform: macOS, Windows, Linux)

What is Cebra?

Cebra is a Python library that compresses high-dimensional time series data into lower-dimensional representations, making it easier to spot patterns and structures. It uses self-supervised learning built on PyTorch, which means it can learn from unlabelled data alongside auxiliary information like behavioural measurements or experimental conditions. The tool is designed primarily for neuroscience and biology researchers who work with large datasets combining neural recordings and behaviour. Cebra integrates with existing Python analysis workflows and helps reveal relationships between neural activity and observable behaviour that might otherwise remain hidden in the raw data.
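To make the idea concrete, here is a minimal PyTorch sketch of the underlying technique (contrastive self-supervised learning guided by an auxiliary variable), not Cebra's actual code or API: a small encoder compresses synthetic "neural" data to three dimensions, and time bins with similar behaviour values are treated as positive pairs in an InfoNCE-style loss. All names and data here are made up for illustration.

```python
# Illustrative sketch (not CEBRA's implementation): contrastive learning
# where an auxiliary behavioural variable defines positive pairs.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-ins: 1000 time bins, 120 "neurons", a 1-D behaviour variable.
neural = torch.randn(1000, 120)
behaviour = torch.rand(1000, 1)

# Small encoder compressing 120-D activity to a 3-D embedding.
encoder = nn.Sequential(nn.Linear(120, 64), nn.ReLU(), nn.Linear(64, 3))

def infonce_step(batch_size=64, temperature=1.0):
    """One InfoNCE step: positives are time bins with similar behaviour."""
    anchor_idx = torch.randint(0, 1000, (batch_size,))
    # For each anchor, pick the behaviourally closest other sample as positive.
    dists = (behaviour[anchor_idx] - behaviour.T).abs()      # (batch, 1000)
    dists[torch.arange(batch_size), anchor_idx] = float("inf")  # exclude self
    pos_idx = dists.argmin(dim=1)
    neg_idx = torch.randint(0, 1000, (batch_size, 16))       # random negatives

    z_a = encoder(neural[anchor_idx])                        # (batch, 3)
    z_p = encoder(neural[pos_idx])                           # (batch, 3)
    z_n = encoder(neural[neg_idx])                           # (batch, 16, 3)

    sim_pos = (z_a * z_p).sum(-1, keepdim=True) / temperature
    sim_neg = (z_a.unsqueeze(1) * z_n).sum(-1) / temperature
    logits = torch.cat([sim_pos, sim_neg], dim=1)            # positive = class 0
    targets = torch.zeros(batch_size, dtype=torch.long)
    return nn.functional.cross_entropy(logits, targets)

opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for _ in range(50):
    opt.zero_grad()
    loss = infonce_step()
    loss.backward()
    opt.step()

embedding = encoder(neural).detach()  # (1000, 3) low-dimensional representation
print(embedding.shape)
```

The key design point is that the auxiliary variable only defines which samples count as "similar"; the encoder never sees behaviour directly, which is what lets the method work without manually labelled training data.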

Key Features

Self-supervised embedding

learns representations from unlabelled high-dimensional recordings using auxiliary variables as guidance

Time series compression

reduces complex neural or sensor data to interpretable lower-dimensional spaces

Behaviour and neural data integration

simultaneously analyse neural recordings with behavioural measurements

PyTorch implementation

built on a standard deep learning framework, allowing customisation and extension

Library integration

works alongside popular Python data analysis tools like NumPy, Pandas, and scikit-learn

Open source

Apache 2.0 licensed with active community development
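As a sketch of the library-integration point above: an embedding is just an array, so it drops straight into NumPy, Pandas, and scikit-learn workflows. The embedding here is faked with random data (with behaviour constructed as a linear function of it) purely to show the downstream decoding pattern; none of this is Cebra-specific code.

```python
# Hypothetical downstream step: decode behaviour from a (synthetic)
# low-dimensional embedding using standard Python analysis tools.
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
embedding = rng.normal(size=(1000, 3))  # stand-in for a 3-D embedding
# Synthetic behaviour: a linear readout of the embedding plus noise.
behaviour = embedding @ np.array([0.5, -1.0, 0.2]) + rng.normal(scale=0.1, size=1000)

df = pd.DataFrame(embedding, columns=["dim_0", "dim_1", "dim_2"])
df["behaviour"] = behaviour

X_train, X_test, y_train, y_test = train_test_split(
    df[["dim_0", "dim_1", "dim_2"]], df["behaviour"], random_state=0)
decoder = Ridge().fit(X_train, y_train)
print(f"decoding R^2: {decoder.score(X_test, y_test):.2f}")
```

A high decoding score on held-out data is one common way to check that a compressed representation has preserved behaviourally relevant information.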

Pros & Cons

Advantages

  • Specifically designed for biology and neuroscience workflows; handles the types of data these fields produce
  • Self-supervised approach reduces the need for manually labelled training data
  • Open source means no licensing costs and full access to the code for modification or inspection
  • Actively maintained with an open contribution process

Limitations

  • Requires Python programming knowledge and familiarity with PyTorch; not a point-and-click tool
  • Limited to time series data; not suitable for image, text, or other data modalities
  • Users must provide appropriate auxiliary variables for the method to work effectively; performance depends on data quality and experimental design

Use Cases

Analysing large-scale neural recordings to find patterns correlated with specific behaviours

Compressing multi-electrode array data whilst preserving behaviourally relevant information

Comparing neural representations across different animals, conditions, or experimental sessions

Reducing dimensionality of video tracking data to identify movement patterns linked to neural activity

Preprocessing high-dimensional sensor data before applying downstream statistical or machine learning analyses
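For the session-comparison use case above, embeddings fitted separately per session land in arbitrarily rotated coordinate systems, so one common (generic, not Cebra-specific) step is to align one embedding space to another with an orthogonal Procrustes rotation before comparing them. The sketch below fakes a second session as a rotated, noisy copy of the first.

```python
# Sketch: align two per-session embeddings with an orthogonal Procrustes
# rotation so they can be compared in a shared coordinate system.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(1)
session_a = rng.normal(size=(500, 3))                # embedding from session A
rotation = np.linalg.qr(rng.normal(size=(3, 3)))[0]  # arbitrary orthogonal matrix
session_b = session_a @ rotation + rng.normal(scale=0.05, size=(500, 3))

# Find the rotation R minimising ||session_b @ R - session_a||.
R, _ = orthogonal_procrustes(session_b, session_a)
aligned_b = session_b @ R
error = np.linalg.norm(aligned_b - session_a) / np.linalg.norm(session_a)
print(f"relative alignment error: {error:.3f}")
```

A small residual error after alignment suggests the two sessions share the same underlying geometry up to rotation; a large one suggests the representations genuinely differ.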