Spotify has incorporated a "stunningly realistic" artificial DJ into its service, playing music based on your musical preferences and listening history.
Following the positive reception of its new AI DJ feature, Spotify wants to use the technology behind it in more significant ways. The feature picks out personalised music and delivers spoken commentary in a realistic AI-generated voice. Behind the scenes, it relies on the latest AI technologies, including large language models and generative voice, layered on top of Spotify's existing investments in personalisation and machine learning.
Spotify thinks these new tools don't have to be limited to just one feature. That's why it is now trying other ways to use the technology.
How does it work?
The DJ knows you and your taste in music from your listening history on the app, so it can scan the latest releases for tracks it thinks you'll like or take you back to that nostalgic playlist you had on repeat last year. To make listening feel personal to every user, Spotify's personalisation engine surveys what you've already listened to.
The commentary is produced with generative AI built on OpenAI technology. Spotify put this tool in the hands of its music editors so that the DJ can tell listeners interesting things about the music, artists, or genres they like.
Spotify says it drew on the knowledge and insights of its in-house music specialists. With OpenAI's generative AI technology, the DJ can tailor their commentary to the app's end users. And, in contrast to ChatGPT, which attempts to generate responses by distilling information from the broader Web, Spotify's relatively limited collection of musical expertise is meant to ensure that the DJ's comments are pertinent and accurate.
What do they say about it?
The music choices the DJ makes are based on what it already knows about a user's tastes and interests. This is similar to how Spotify has long made personalised playlists such as Discover Weekly.
"It sounds so amazing because that was the goal of the Sonatic technology and the team we acquired. It is about the emotion in the voice," argues Ziad Sultan, head of personalisation at Spotify, in an interview with TechCrunch. "When you hear the AI DJ, you will detect the breathing pauses. You will hear the variations in intonation. One can perceive enthusiasm for particular musical genres," he says.
The technology
A natural-sounding AI voice is not new. Google surprised the world years ago with Duplex, its own human-sounding AI. However, its deployment sparked criticism, as the AI initially contacted businesses on behalf of the user without indicating that it was not a human. Given that Spotify's service is branded as an "AI DJ", a similar problem should not arise.
To make Spotify's AI voice sound genuine, Xavier "X" Jernigan, Spotify's head of cultural partnerships and the person whose voice the DJ is modelled on, produced high-quality voice recordings in the studio while collaborating with speech technology experts. There, he read different lines with varying emotions, and the recordings were then fed into the AI model. Spotify did not disclose how long this process took or how it works in detail, noting that the technology is still evolving and referring to it as its "secret sauce".
"With that high-quality data that has a variety of permutations, [Jernigan] no longer has to say anything; it is now all AI-generated," explains Sultan of the created voice. Still, Jernigan occasionally visits Spotify's writers' room to give feedback on how he would read a line, so that his input continues to be included.
What to expect?
With this foundational technology, Spotify can expand into other AI, LLM, and generative AI-based fields. However, the company has yet to disclose what these might be in terms of consumer products. According to reports, a ChatGPT-like chatbot is among the options being tested, but nothing has been decided regarding a launch, as it is only one of many experiments.
In addition, Spotify reported that, on days when users tuned in, they spent 25 per cent of their listening time with the DJ, and that more than half of first-time users returned the following day to use the feature again. However, these figures are preliminary, as the feature has yet to be fully rolled out, even in the United States and Canada.