microsoft's Collections
SpeechT5

updated 1 day ago

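The SpeechT5 framework consists of a shared encoder-decoder (seq2seq) network and six modality-specific (speech/text) pre-nets and post-nets that can address several audio-related tasks. This collection covers text-to-speech, voice conversion, and automatic speech recognition.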
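As a rough illustration of how one shared network can serve several tasks, the sketch below enumerates the six pre/post-nets and picks the three that a given task would use. The names (`ALL_NETS`, `TASKS`, `nets_for`) are hypothetical and do not come from any library, and the task-to-net mapping is an assumption based on each task's input and output modality.

```python
from itertools import product

# Two modalities x three roles = the six modality-specific nets.
MODALITIES = ("speech", "text")
ROLES = ("encoder pre-net", "decoder pre-net", "decoder post-net")
ALL_NETS = [f"{m} {r}" for m, r in product(MODALITIES, ROLES)]

# Assumed mapping: task -> (input modality, output modality).
TASKS = {
    "tts": ("text", "speech"),    # text-to-speech
    "vc": ("speech", "speech"),   # voice conversion
    "asr": ("speech", "text"),    # automatic speech recognition
}


def nets_for(task: str) -> list[str]:
    """Return the nets wrapped around the shared encoder-decoder for a task."""
    source, target = TASKS[task]
    return [
        f"{source} encoder pre-net",   # feeds the shared encoder
        f"{target} decoder pre-net",   # feeds the shared decoder
        f"{target} decoder post-net",  # turns decoder states into output
    ]


if __name__ == "__main__":
    for task in TASKS:
        print(task, "->", nets_for(task))
```

Under this mapping, the three tasks in the collection together exercise all six nets, while the encoder-decoder in the middle stays the same.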


  • SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing

    Paper • 2110.07205 • Published Oct 14, 2021 • 1

  • microsoft/speecht5_tts

    Text-to-Speech • Updated 26 days ago • 40.3k • 251

    Note Text-to-speech version of SpeechT5


  • Running on T4
    181
    👩‍🎤

    SpeechT5 Speech Synthesis Demo


  • microsoft/speecht5_vc

    Audio-to-Audio • Updated Mar 22 • 16.5k • 36

    Note Voice-conversion version of SpeechT5


  • 80
    👩‍🎤

    SpeechT5 Voice Conversion Demo


  • microsoft/speecht5_asr

    Automatic Speech Recognition • Updated Mar 22 • 2.13k • 17

    Note Automatic-speech-recognition version of SpeechT5


  • 32
    👩‍🎤

    SpeechT5 Speech Recognition Demo


  • microsoft/speecht5_hifigan

    Updated Feb 2 • 59.5k • 9

    Note SpeechT5 produces a spectrogram; this model converts it to a waveform
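The two-stage flow described in the notes above (SpeechT5 produces a spectrogram, then the HiFi-GAN model turns it into a waveform) might look like the following minimal sketch. It assumes the checkpoints are loaded with the `transformers` library and PyTorch, which this page does not state; `synthesize` is a hypothetical helper name, and the all-zero speaker embedding is only a placeholder.

```python
def synthesize(text, speaker_embedding=None):
    """Sketch: text -> spectrogram (speecht5_tts) -> waveform (speecht5_hifigan)."""
    # Imports are local so that defining the function needs no heavy dependencies.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Repository IDs are the ones listed in this collection.
    processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
    model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=text, return_tensors="pt")

    if speaker_embedding is None:
        # Assumption: a zero 512-dim vector stands in for a real speaker
        # embedding (x-vector); a real one is needed for a natural voice.
        speaker_embedding = torch.zeros((1, 512))

    with torch.no_grad():
        # Stage 1: the TTS model alone returns a mel spectrogram.
        spectrogram = model.generate_speech(inputs["input_ids"], speaker_embedding)
        # Stage 2: the vocoder converts the spectrogram to audio samples.
        waveform = vocoder(spectrogram)
    return waveform


if __name__ == "__main__":
    audio = synthesize("Hello from SpeechT5.")
    print(audio.shape)
```

Running the script downloads both checkpoints on first use. Keeping the vocoder as a separate model is what lets the same spectrogram-to-waveform step be reused for voice conversion as well.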
