14% Billy Joel, 8% Elton John: How Neural Embeddings Could Revolutionize AI Royalties


Written by Ben Resnick
Posted on Jul 27, 2025
Recently, an AI-generated band called "Velvet Sundown" racked up over 1 million plays on Spotify before anyone realized they weren't real. When we ran their tracks through our voice analysis system, something fascinating emerged: each "AI band member" had a distinct vocal identity that our models could reliably detect and, in some cases, match to real artists.
The t-SNE plot below shows voice embeddings for all 42 Velvet Sundown recordings. Notice how tightly clustered they are—each AI vocalist maintains a consistent identity across tracks.
For some of these voices, our system was able to match the AI vocals to specific real artists—revealing that one Velvet Sundown track showed strong similarities to David & David, while another mapped closely to R.E.M.'s vocal characteristics, with traces of America's style.
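To make that analysis concrete, here is a minimal sketch of how such a plot could be produced with standard tooling, assuming the per-track voice embeddings have already been computed and saved; the file path and the scikit-learn/matplotlib calls are illustrative, not our production pipeline.

```python
# Minimal sketch: project per-track voice embeddings to 2-D with t-SNE and plot them.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Hypothetical path: a (42, dim) array with one voice embedding per Velvet Sundown track.
voice_embeddings = np.load("velvet_sundown_voice_embeddings.npy")

coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(voice_embeddings)

plt.scatter(coords[:, 0], coords[:, 1])
plt.title("Velvet Sundown voice embeddings (t-SNE)")
plt.show()
```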

Here's an uncomfortable truth: every major AI music service trains on copyrighted material scraped from the web. These services are heading toward billion-dollar valuations and billions of streams, while the human artists whose work powers them get nothing.
This isn't just about catching deepfakes. It's about solving a much bigger problem: how do we fairly compensate the artists whose work trains AI models?
Royalty models for the age of AI
At Sound Patrol, we’re obsessed with providing the most cutting-edge analyses of music possible, with an emphasis on understanding copyright infringement stemming from genAI.
But what we’ve found is that similarity falls on a continuous spectrum, ranging from blatant infringement to subtle influence.
We think that genAI is here to stay and that, eventually, there will be a need for new, innovative royalty models to support equitable use of this compelling, creative technology.
A Concrete Proposal
While it isn’t our bread and butter, these new royalty models have been on our team’s mind for years. A few principles we’ve come to believe should guide a world-class, equitable royalty model for genAI:
It should examine both AI model inputs (prompts) and outputs (similarities and infringements).
It should be fractional - royalty payments for a genAI song should be split among the artists whose work contributed.
It should be scientifically principled - guided by the most cutting-edge research.
Such a royalty model would be only one component of an ethical approach to genAI that respects artists’ rights - but it could be key, and perhaps the component that is most technically difficult to get right.
Analyzing the genAI Inputs
Over a year ago, we developed a method that makes training data influence explicit and measurable, rather than treating text conditioning as a black box. Every training track was paired with a text description that we embedded using e5-mistral-7b-instruct. Those embeddings were then grouped into 430 distinct clusters via k-means clustering.
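As a rough illustration of that pipeline, here is a minimal sketch assuming a hypothetical load_track_descriptions helper and that e5-mistral-7b-instruct is called through the sentence-transformers interface; our actual preprocessing and batching are omitted.

```python
# Minimal sketch of the clustering step: embed each track's text description,
# then group the embeddings into 430 clusters with k-means.
import numpy as np
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer

descriptions = load_track_descriptions()  # hypothetical loader: one string per training track

model = SentenceTransformer("intfloat/e5-mistral-7b-instruct")
embeddings = model.encode(descriptions, normalize_embeddings=True)  # (num_tracks, dim)

kmeans = KMeans(n_clusters=430, random_state=0).fit(embeddings)
cluster_ids = kmeans.labels_          # cluster assignment for each training track
centroids = kmeans.cluster_centers_   # one centroid per cluster, reused for distance scoring
```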

During training, we prepended special tokens to each track specifying which clusters should influence it and how similar the track is to each cluster centroid. At inference time, we know exactly which training data subsets the model draws from.
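The snippet below is an illustrative sketch, not our actual tokenizer, of how such conditioning tokens could be built for one track: find its closest description clusters, quantize the similarity to each centroid, and emit special tokens to prepend to the track's sequence. The token format and bin count are assumptions.

```python
# Illustrative sketch: build conditioning tokens encoding a track's nearest
# clusters and a coarse similarity bin for each. Token format is hypothetical.
import numpy as np

def conditioning_tokens(track_embedding: np.ndarray,
                        centroids: np.ndarray,
                        top_k: int = 3,
                        n_bins: int = 10) -> list[str]:
    # cosine similarity between the track's description embedding and every centroid
    sims = centroids @ track_embedding / (
        np.linalg.norm(centroids, axis=1) * np.linalg.norm(track_embedding))
    nearest = np.argsort(-sims)[:top_k]           # indices of the closest clusters
    tokens = []
    for c in nearest:
        sim_bin = int(np.clip(sims[c], 0.0, 1.0) * (n_bins - 1))  # quantized similarity
        tokens.append(f"<cluster_{c}|sim_{sim_bin}>")
    return tokens
```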
When someone prompts "generate a high-energy piano-driven rock tune from 1980," we trace the output through cluster mappings, artist percentages within each cluster, and distance-weighted influence scores. Result: "14% of this track was influenced by Billy Joel, 8% by Elton John, 6% by Bruce Springsteen..."
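A hedged sketch of that roll-up: embed the prompt, weight the nearest clusters by similarity, and combine those weights with per-cluster artist shares. The artist_share table (each artist's fraction within a cluster) and the function names are illustrative rather than our exact implementation.

```python
# Illustrative sketch: turn a prompt embedding into per-artist influence scores.
import numpy as np

def attribute_prompt(prompt_embedding: np.ndarray,
                     centroids: np.ndarray,
                     artist_share: list[dict[str, float]],
                     top_k: int = 5) -> dict[str, float]:
    # similarity of the prompt to every cluster centroid
    sims = centroids @ prompt_embedding / (
        np.linalg.norm(centroids, axis=1) * np.linalg.norm(prompt_embedding))
    nearest = np.argsort(-sims)[:top_k]
    weights = sims[nearest] / sims[nearest].sum()   # distance-weighted cluster influence

    influence: dict[str, float] = {}
    for c, w in zip(nearest, weights):
        # artist_share[c]: fraction of cluster c's training data attributed to each artist
        for artist, share in artist_share[c].items():
            influence[artist] = influence.get(artist, 0.0) + float(w) * share
    return dict(sorted(influence.items(), key=lambda kv: -kv[1]))

# e.g. {'Billy Joel': 0.14, 'Elton John': 0.08, 'Bruce Springsteen': 0.06, ...}
```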
Analyzing the genAI Outputs
Training data influence is only half of the equation. When AI outputs resemble existing songs too closely—as our Velvet Sundown analysis revealed—we need output similarity analysis. Our voice analysis system is one way we measure these similarities in real time.
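As a simplified illustration of the output side, the sketch below ranks reference artists by cosine similarity between their voice embeddings and a generated vocal's embedding; the threshold and catalog structure are assumptions, not our production system.

```python
# Illustrative sketch: rank catalog artists by cosine similarity to a generated vocal.
import numpy as np

def rank_vocal_matches(generated_emb: np.ndarray,
                       catalog: dict[str, np.ndarray],
                       threshold: float = 0.75) -> list[tuple[str, float]]:
    """Return (artist, similarity) pairs above `threshold`, strongest matches first."""
    g = generated_emb / np.linalg.norm(generated_emb)
    matches = []
    for artist, ref in catalog.items():
        sim = float(g @ (ref / np.linalg.norm(ref)))   # cosine similarity
        if sim >= threshold:
            matches.append((artist, sim))
    return sorted(matches, key=lambda kv: -kv[1])
```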
Conclusion
These technologies address AI music generation from both ends—the training data that shapes a model's understanding, and the outputs that may closely resemble existing works.
And these new forms of attribution in support of royalty payments aren't just technically feasible; I think they are a necessary, missing piece of the puzzle. AI represents a fundamental shift in how art is created; it’s sort of like the advent of the electric guitar, and a bit like the transition from CDs to streaming. The current model for genAI art, where companies profit while artists get nothing, is unsustainable.
We're not claiming perfection. Influence attribution involves countless implementation details which could affect how royalties would be distributed, and edge cases abound. But imperfect attribution beats no attribution, and the technical barriers aren't insurmountable—they're just expensive enough that genAI companies haven't bothered.
The Velvet Sundown incident is one proof point, showing we have the tools to detect AI content and trace its influences. Now we need the will to use them fairly.