What is AudioCraft?
Developed by Meta AI, AudioCraft is a framework that simplifies the design of generative models for audio. It covers music generation, sound-effect generation, and compression, with models trained on raw audio signals. The framework includes MusicGen and AudioGen, each of which consists of a single autoregressive language model (LM) operating over streams of compressed, discrete audio representations (tokens).
AudioCraft builds on the EnCodec neural audio codec. The codec learns discrete audio tokens from the raw waveform, mapping the audio signal to several parallel streams of discrete tokens. A single autoregressive language model then models these token streams step by step, predicting each new token from the ones before it. Finally, the generated tokens are fed to the EnCodec decoder to reconstruct the output waveform. Conditioning models, such as pretrained text encoders, provide text-to-audio control.
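To make the idea of "parallel streams of discrete tokens" concrete, here is a minimal toy sketch of residual quantization, in which each stage quantizes whatever error the previous stages left behind. It is an illustration only and not EnCodec: a real neural codec learns its encoder, decoder, and codebooks from data, whereas the function names, the fixed step sizes, and the use of signed integers as tokens here are all assumptions made for the example.

```python
from typing import List


def rvq_encode(values: List[float], step_sizes: List[float]) -> List[List[int]]:
    """Toy residual quantizer (hypothetical helper, not part of AudioCraft).

    Stage k quantizes the residual left by stages 0..k-1 with a uniform
    step, so the output is len(step_sizes) parallel token streams, each
    with one token per input value.
    """
    streams: List[List[int]] = [[] for _ in step_sizes]
    for value in values:
        residual = value
        for k, step in enumerate(step_sizes):
            token = round(residual / step)  # nearest multiple of this stage's step
            streams[k].append(token)
            residual -= token * step        # pass the remaining error to the next stage
    return streams


def rvq_decode(streams: List[List[int]], step_sizes: List[float]) -> List[float]:
    """Reconstruct each value by summing the contribution of every stream."""
    length = len(streams[0])
    return [
        sum(streams[k][t] * step_sizes[k] for k in range(len(step_sizes)))
        for t in range(length)
    ]


if __name__ == "__main__":
    samples = [0.3, -0.72, 0.05]
    steps = [0.5, 0.1, 0.02]  # coarse to fine; values chosen arbitrarily
    tokens = rvq_encode(samples, steps)
    print(tokens)
    print(rvq_decode(tokens, steps))
```

Each extra stream refines the reconstruction, which is why a codec of this kind can trade bitrate for quality by keeping more or fewer streams. In AudioCraft the language model generates the token streams and the codec's decoder plays the role of `rvq_decode`.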
Features
- MusicGen: Produces diverse and long music samples from user-provided text inputs.
- AudioGen: Generates environmental sounds and sound effects from text inputs.
- EnCodec: Neural audio codec that learns discrete audio tokens from raw waveforms.
- Autoregressive Language Model (LM): Recursively models audio tokens from EnCodec for efficient audio sequence modeling.
- Token Interleaving Pattern: Arranges the parallel token streams so that a single LM can model them together, capturing long-term dependencies to generate high-quality audio.
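The token interleaving idea in the last item can be sketched with one simple arrangement, often called a delay pattern, in which stream k is shifted right by k steps. The code below is a minimal illustration under that assumption and is not AudioCraft's implementation; the function names and the `PAD` placeholder are hypothetical.

```python
from typing import List, Optional

PAD = None  # hypothetical placeholder for positions that hold no token

Stream = List[Optional[int]]


def apply_delay_pattern(streams: List[List[int]]) -> List[Stream]:
    """Shift stream k right by k steps and pad every row to equal length.

    After the shift, column t holds stream 0's token for frame t, stream 1's
    token for frame t-1, and so on, so one model step can emit one token per
    stream while later streams still see the earlier streams' tokens.
    """
    num_streams = len(streams)
    return [
        [PAD] * k + list(stream) + [PAD] * (num_streams - 1 - k)
        for k, stream in enumerate(streams)
    ]


def undo_delay_pattern(delayed: List[Stream]) -> List[Stream]:
    """Remove the shifts to recover time-aligned streams for the decoder."""
    num_streams = len(delayed)
    length = len(delayed[0]) - (num_streams - 1)
    return [row[k:k + length] for k, row in enumerate(delayed)]


if __name__ == "__main__":
    codes = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # 3 streams, 3 frames
    for row in apply_delay_pattern(codes):
        print(row)
```

With K streams and T frames, this arrangement needs T + K - 1 model steps instead of the K * T steps required to flatten all tokens into one long sequence, which is the efficiency argument for interleaving.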
Use Cases
- Text-to-music generation
- Text-to-sound generation
- Audio compression
- Audio research