Exploring Stability AI’s Advanced Stable Audio Model and Copyright Concerns in AI-Generated Music

Stability AI has released Stable Audio 2.0, an updated version of its Stable Audio model that lets users upload their own audio samples and transform them with text prompts. The first iteration, unveiled in September 2023, was limited to 90-second outputs; the new version extends that to three minutes, long enough for a complete track. All user-uploaded audio must, however, be free of copyrighted material.

According to Stability AI, the new model is better at crafting music-like compositions with distinct intros, progressions, and outros. It also adds controls for prompt strength and for how closely the AI should follow the prompt or alter the uploaded audio.

While the upgraded model can generate three-minute clips, its actual value remains questionable. Major players like Google and Meta have also ventured into audio generation that loosely follows text prompts, producing output that only vaguely resembles the input words. The experience has been likened to listening to music, while drunk, through the door of a nightclub that happens to be playing the suggested prompts.

The main criticism is the lack of intentional musical composition: AI-generated music tends to sound like a discordant mix of sounds rather than a structured song. Stability AI trained Stable Audio on data from AudioSparx, a library comprising more than 800,000 audio files. The company faced backlash, however, for including works by AudioSparx artists who had not consented to their inclusion in Stability’s training data.

Stability AI’s stance on the ‘fair use’ of copyrighted material for training generative AI models led to dissent within the company, with VP of Audio Ed Newton-Rex stepping down due to disagreement over this approach.
