SpeechBrain

Open-Source Conversational AI for Everyone

🎉 SpeechBrain 1.0 is out. Check out what's new!

Key Features

Open, simple, flexible, well documented, and competitive in performance.

Speech

SpeechBrain supports state-of-the-art technologies for speech recognition, enhancement, separation, text-to-speech, speaker recognition, speech-to-speech translation, spoken language understanding, and beyond.

Audio

SpeechBrain encompasses a wide range of audio technologies, including vocoding, audio augmentation, feature extraction, sound event detection, beamforming, and other multi-microphone signal processing capabilities.

Text

SpeechBrain offers user-friendly tools for training Language Models, supporting technologies ranging from basic n-gram LMs to modern Large Language Models. Our platform seamlessly integrates them into speech processing pipelines and facilitates the creation of customizable chatbots.

Technology

SpeechBrain leverages the most advanced deep learning technologies, including methods for self-supervised learning, continual learning, diffusion models, Bayesian deep learning, and interpretable neural networks.

Research & Development

SpeechBrain is engineered to accelerate the research and development of Conversational AI technologies. It comes with pre-built recipes for popular datasets. Extensive documentation and tutorials are available to support newcomers.

HuggingFace!

SpeechBrain offers pre-trained models with user-friendly interfaces, making tasks like transcription, speaker verification, speech enhancement, and source separation easier than ever.
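
As a rough sketch of what such an interface can look like for transcription: the helper below assumes SpeechBrain 1.0 is installed and uses its `EncoderDecoderASR` inference class with the `speechbrain/asr-crdnn-rnnlm-librispeech` model from the HuggingFace Hub. The model choice, the `savedir` path, and the helper name `transcribe` are illustrative assumptions, not taken from the text above.

```python
def transcribe(audio_path,
               source="speechbrain/asr-crdnn-rnnlm-librispeech",
               savedir="pretrained_models/asr-crdnn-rnnlm-librispeech"):
    """Transcribe one audio file with a pre-trained SpeechBrain ASR model."""
    # Imported lazily so that defining this helper does not require SpeechBrain.
    from speechbrain.inference.ASR import EncoderDecoderASR

    # Fetches the model files into `savedir` on first use, then reuses them.
    asr_model = EncoderDecoderASR.from_hparams(source=source, savedir=savedir)

    # Returns the recognized text as a string.
    return asr_model.transcribe_file(audio_path)
```

Calling `transcribe("example.wav")` (a hypothetical file name) needs network access the first time, since the model is downloaded before it can run.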

Why SpeechBrain?

  • Easy to install
  • Easy to use
  • Easy to customize

Adapts to your needs.

You can install SpeechBrain via PyPI for quick access to its functionalities, or through a local install for accessing recipes and delving deeper into the toolkit.

  # From PyPI
  pip install speechbrain

  # Local installation
  git clone https://github.com/speechbrain/speechbrain.git
  cd speechbrain
  pip install -r requirements.txt
  pip install --editable .
                    

A single command.

Each SpeechBrain recipe defines all of its hyperparameters in a single YAML file. A Python script then orchestrates the training process, and individual hyperparameters can be overridden from the command line.

  cd recipes/{dataset}/{task}/train

  # Train the model using the default recipe
  python train.py hparams/train.yaml

  # Train the model with a hyperparameter tweak
  python train.py hparams/train.yaml --learning_rate=0.1
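
To illustrate the idea behind a command-line tweak such as `--learning_rate=0.1`, here is a minimal, library-free sketch of merging `--key=value` flags into a set of default hyperparameters. It is not SpeechBrain's implementation, which loads the YAML file with HyperPyYAML; the function name `apply_overrides` and the default values are assumptions made for the example.

```python
import ast


def apply_overrides(defaults, argv):
    """Return a copy of `defaults` updated with --key=value flags from argv."""
    hparams = dict(defaults)
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            continue  # skip anything that is not a --key=value flag
        key, raw = arg[2:].split("=", 1)
        try:
            value = ast.literal_eval(raw)  # numbers, booleans, lists, ...
        except (ValueError, SyntaxError):
            value = raw  # keep it as a plain string
        hparams[key] = value
    return hparams


# Hypothetical defaults standing in for the contents of hparams/train.yaml.
defaults = {"learning_rate": 1.0, "batch_size": 8}
hparams = apply_overrides(defaults, ["hparams/train.yaml", "--learning_rate=0.1"])
print(hparams)
```

The defaults are copied rather than modified in place, so the values from the file remain available for logging or comparison.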
                    

Built for research.

SpeechBrain is designed for research and development, so flexibility, transparency, and replicability are core design goals. Users can define custom deep learning models, losses, training/evaluation loops, and input pipelines/transformations, and integrate them into existing pipelines.

  import speechbrain as sb

  class ASR_Brain(sb.Brain):
    def compute_forward(self, batch, stage):

      # Compute features (mfcc, fbanks, etc.) on the fly
      features = self.hparams.compute_features(batch.wavs)

      # Improve robustness with pre-built augmentations
      features = self.hparams.augment(features)

      # Apply your custom model
      return self.modules.myCustomModel(features)
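
The pattern behind this snippet, where a base class owns the training loop and a subclass supplies the forward pass and the objective, can be sketched without any dependencies. The toy below is not `sb.Brain`: `ToyBrain` and `ScaleBrain` are hypothetical names, the `stage` argument is omitted, and `compute_objectives` returns a hand-derived gradient alongside the loss because there is no autograd here.

```python
class ToyBrain:
    """Hypothetical stand-in for a trainer base class (not sb.Brain)."""

    def __init__(self, weight=0.0, lr=0.1):
        self.weight = weight
        self.lr = lr

    def compute_forward(self, batch):
        raise NotImplementedError

    def compute_objectives(self, predictions, batch):
        raise NotImplementedError

    def fit(self, epochs, train_set):
        # The base class owns the loop; subclasses only fill in the two hooks.
        for _ in range(epochs):
            for batch in train_set:
                predictions = self.compute_forward(batch)
                _loss, grad = self.compute_objectives(predictions, batch)
                self.weight -= self.lr * grad  # plain gradient descent step
        return self.weight


class ScaleBrain(ToyBrain):
    """Learns a single weight w so that w * x approximates y."""

    def compute_forward(self, batch):
        xs, _ = batch
        return [self.weight * x for x in xs]

    def compute_objectives(self, predictions, batch):
        xs, ys = batch
        n = len(xs)
        # Mean squared error and its derivative with respect to the weight.
        loss = sum((p - y) ** 2 for p, y in zip(predictions, ys)) / n
        grad = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / n
        return loss, grad


# Two small batches of (inputs, targets) generated from y = 2 * x.
train_set = [([1.0, 2.0], [2.0, 4.0]), ([3.0], [6.0])]
learned = ScaleBrain().fit(epochs=50, train_set=train_set)
print(learned)
```

Swapping in a different model or loss only means overriding the two hook methods; the loop in `fit` stays untouched.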
                      

Sponsors

Our new call for sponsors (2024) is now open.

nle
hf
baidu
ovh
lia

Previous Sponsors

Collaborators

About Us

SpeechBrain isn't a company or an association. It is an open-source toolkit and a community created by Dr. Mirco Ravanelli and co-created by Dr. Titouan Parcollet. We aim to make speech technologies more accessible to the community.

Copyright © All rights reserved

Opportunities

Thanks to our sponsors, we often recruit talented candidates to continue expanding the functionalities of SpeechBrain. Feel free to contact us at: speechbrainproject@gmail.com
