Hacker News
Metadata is the biggest little problem plaguing the music industry (theverge.com)
89 points by cpeterso on June 1, 2019 | 47 comments


This is the problem right here:

>The musician who was owed $40,000 missed out because a glitch between two databases removed many of his credits. It wasn’t the musician’s fault, but too much time had gone by before anyone noticed. The companies involved declined to pay him.

If the people who owe the money overtly don’t care about doing the right thing, which seems to have been the typical attitude since the recorded-entertainment business was invented, who’s going to be able to fix it? Seems like this is working exactly as designed.


Getting a major record company to just follow through on the payment of a clearly contracted project is a hassle on its own. It's like they just want to weed out paying any musicians who can't afford to have a lawyer follow up.


Music executives are some of the biggest scumbags around. I’m sorry for my harshness, but I’m trying to be polite. They’re very predatory.


Absolutely. Surprised the music industry hasn't had its own 'Harvey Weinstein' moment because the same stuff happens there.


That's when that artist should tweet the story, along with an admonishment to definitely NOT steal their music:

"Definitely do NOT pirate this material just because I won't be paid either way, it's still illegal! I cannot be more clear on this to you, my fans who normally pay for my music"


One has to wonder if it would hurt their credit ratings to not pay their bills to musicians.


Companies always pay when it hurts them not to.

When running a service business, the best decision I made was using a factoring service from Wells Fargo.

A big company might specify NET180 on a big contract and still not have remitted payment 200 days in. If you are a small company, your recourse is to sue.

If you have a big bank handling accounts receivable, they call on day 181 and tell the company that their credit rating is taking a hit for not making payment on time.

Payment gets made. Big companies like good credit ratings; they don’t like paying more to borrow money.

Seems like there is a market opportunity for artists to have a financial partner handle collections on their behalf.


Even if they cared, it might be very hard. The people in that industry are surprisingly not tech savvy. Moreover, there's often very little they can do about it or, worse, they have very little access to detailed information themselves.


>The people in that industry are surprisingly not tech savvy

As the quote goes, "It is difficult to get a man to understand something, when his salary depends upon his not understanding it" - I'm confident that things would be entirely different if the industry suddenly found it in its best interest to become tech savvy.


That's like saying it's OK for me to pirate music because I can't tell the difference between Spotify and PirateBay.

But as long as lobbying is a thing and those companies have deep pockets, I have no expectation that a solution (which I'm sure already exists) will be implemented.


This is like when you walk into a super complicated legacy code base and immediately have the one magic architectural change that will simplify everything...once you rewrite it from the ground up.

The music metadata situation is pretty bad, but the source of the problem is not really carelessness or greed or avoidance of responsibility (although those are all true). The true source of music metadata complication is the insanely complex copyright regime that music operates under. It's a legacy codebase about a century in the making that is constantly being patched up by Congress, mostly by trying to change who is being protected from whom. (Among the folks favored at different times: labels, publishing companies, performing artists, songwriters, radio stations, streaming music services, live venues, ...).

Perfect compliance with these laws is effectively impossible, so everyone is just doing the best they can. And any attempts Congress makes to change how things work end up being gigantic legal battles because it's a zero-sum game: more money in the "right" hands (e.g. these artists being ripped off) means less money in the other "right" hands (e.g. the unprofitable streaming service we all love).


"Old Town Road" is a useful modern reference for what we could have in a world without Copyright. That sample behind everything is from 34 Ghosts IV, one of the tracks in Nine Inch Nails' Ghosts I-IV album.

Strictly speaking, just pasting it into a song and selling it wasn't legal (presumably, after this went viral, somebody paid Trent Reznor a bunch of money and put his name in the metadata to avoid nasty legal consequences), but everything up until selling it was legal. All the raw PCM data you'd want, so that you could take the samples apart without painfully cutting them out of the entire Ghosts recording, was uploaded with the Ghosts I-IV album under a Creative Commons non-commercial license (CC BY-NC-SA). This is how our culture was _supposed_ to work if we weren't still trying to find ways to put more money in The Man's pocket.


Meanwhile, Lou Reed owned 100% of “Can I Kick It?” royalties. So much good art doesn’t get put out because of greed.


Wait, are you using the “Can I Kick It” example to argue for or against the idea that greed stops art?

Because that example is a really weird grey area. The label didn’t clear one of the most obvious samples of all time. The original artist did not prevent them from releasing the derived work, but Tribe didn’t get paid for one of their classics.

I’m not sure what to think of that story.


Honestly I'm rather against needing to clear samples at all. I think the art should come before any concern that it would put someone into debt via a lawsuit—it should never result in making no money off it when the value is clearly in the product of the sample + performance.


I’m going to go out on a limb and guess that the label’s metadata is always present and accurate. Labels have been screwing artists for over a century.


Having recently dealt with data from music labels, I can sadly assure you that it isn't always accurate. But they get enough money as it is, they're not going to spend extra on internal data hygiene...


Ohh it's accurate in the sense that it records who is the owner, the record label. On the other hand, fixing the artists' and others' metadata is only needed to pay those other people, which only reduces the label's profit. If they needed it to be correct in order to get paid themselves, you can bet the metadata would be correct.


> Ohh it's accurate in the sense that it records who is the owner, the record label.

Oh, you'd be amazed...


Industry rule #4080: Record company people are shady...[0]

[0] https://www.youtube.com/watch?v=BQT2DfzpCLA&t=2m54s


>the label’s metadata is always present and accurate

Based on my tangential experience with this part of the industry... it absolutely is not.


I think [MusicBrainz][1] deserves a shout out on this topic.

[1]: https://musicbrainz.org/


Different kind of metadata, I think; the Verge article seems to be talking about the data that goes to PROs[1] and CMOs[2] which is then used to calculate splits and payments based on data from people like Spotify, Apple Music, etc. ("they had missed out on payments for 70 songs, going back at least six years" is stuff I've seen before w.r.t data just not being sent to the PRO/CMOs from a label.)

[1] https://en.wikipedia.org/wiki/Performance_rights_organisatio...

[2] https://en.wikipedia.org/wiki/Collective_rights_management


Indeed, MusicBrainz contains the names of songwriters, but not the organisations they belong to


This seems like a good application for a publicly auditable, immutable data store. An artist can register a hash of the work in the store along with their metadata. Post-production steps would build on that record, creating a derivative work with new metadata. Labels, ditto. Distribution contracts go on top of that. When the work finally gets a performance, that could be a transaction on top of that. Anyone can follow a performance back to its creators.
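To make that chain of records concrete, here is a rough Python sketch. Everything in it (the `Store` class, the record kinds, the metadata fields) is a hypothetical illustration, not an existing system: each record is identified by a hash of its own content plus its parent's ID, so a performance can be walked back to the original work.

```python
import hashlib
import json


def record_id(kind, metadata, parent):
    """Content-derived ID: SHA-256 of the record's canonical JSON form."""
    payload = json.dumps(
        {"kind": kind, "metadata": metadata, "parent": parent}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Store:
    """Hypothetical in-memory stand-in for a publicly auditable store."""

    def __init__(self):
        self._records = {}

    def add(self, kind, metadata, parent=None):
        # A record may only build on a record that is already in the store.
        if parent is not None and parent not in self._records:
            raise KeyError(f"unknown parent record: {parent}")
        rid = record_id(kind, metadata, parent)
        self._records[rid] = {"kind": kind, "metadata": metadata, "parent": parent}
        return rid

    def trace(self, rid):
        """Follow parent links from a record back to the original work."""
        chain = []
        while rid is not None:
            rec = self._records[rid]
            chain.append(rec)
            rid = rec["parent"]
        return chain


store = Store()
work = store.add("work", {"writers": ["A. Writer"]})
master = store.add("master", {"producer": "P. Producer"}, parent=work)
dist = store.add("distribution", {"label": "Example Records"}, parent=master)
perf = store.add("performance", {"service": "example-stream"}, parent=dist)

for rec in store.trace(perf):
    print(rec["kind"], rec["metadata"])
```

Note that this only shows the bookkeeping. It says nothing about whether the metadata entered at each step is actually correct.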


There are so many technical and non-technical problems with this, I don't even know where to start.

What's a hash of the work? Lyrics? Original score sheet (if there is any)? Any combination of the two? The first studio recording? Some songs may list a dozen collaborators, how's that attribution gonna work?

What's a hash? A hash of the resulting wav? mp3? flac? alac? any combination of the above? hashes of individual instrument tracks?

At which point does a song stop being derivative and become an original song (or music in general, see "Dies Irae" for example [1] or even Star Wars [2])? Who keeps track of attributions, and how, when lyrics are written by one person (or people), music by another, and performed by dozens of different performers over the years? How are collections tracked and attributed (an opera is a collection of musical performances viewed as a single work)?

These are just some of the problems people that try to track and attribute music face today. And they are not going to be magically solved by a "publicly auditable, immutable data store".

[1] https://en.wikipedia.org/wiki/Dies_irae#Musical_quotations

[2] https://www.nytimes.com/2017/09/14/arts/music/star-wars-soun...


Nobody's claiming a magic bullet. Nothing you mention invalidates the idea, I think.

For the first part (the nature of the "hash"), there should obviously be standards to address these questions. An official recording could be identified by a hash of the actual recording in some standard format (e.g. raw wave files).

For distribution of score sheet and lyrics, those can be treated separately and have their own hash identifier according to their own standard.
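As a minimal sketch of what "a hash of the recording in a standard format" could mean, assuming the standard is uncompressed PCM in a WAV container: hash the decoded audio frames plus the basic format parameters, rather than the file bytes, so container details don't change the ID. The function names here are made up for illustration.

```python
import hashlib
import io
import struct
import wave


def pcm_sha256(wav_file):
    """SHA-256 over the PCM frames of a WAV file (path or file-like object)."""
    with wave.open(wav_file, "rb") as w:
        h = hashlib.sha256()
        # Mix in the format so identical bytes at another sample rate differ.
        params = f"{w.getnchannels()}:{w.getsampwidth()}:{w.getframerate()}"
        h.update(params.encode("ascii"))
        while True:
            frames = w.readframes(4096)
            if not frames:
                break
            h.update(frames)
    return h.hexdigest()


def make_wav(samples, framerate=44100):
    """Build a tiny mono 16-bit WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(framerate)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


original = make_wav([0, 1000, -1000, 500])
altered = make_wav([0, 1000, -1000, 501])  # one sample off by one

print(pcm_sha256(io.BytesIO(original)) == pcm_sha256(io.BytesIO(original)))
print(pcm_sha256(io.BytesIO(original)) == pcm_sha256(io.BytesIO(altered)))
```

The second comparison comes out unequal: any change to the samples, however small, gives a different hash, which is exactly why a standard for the source format would have to be agreed on first.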

The rest are philosophical and legal questions that are orthogonal to the technical problem. The point of the suggestion, is to address the technical side of the problem.

Having an open platform where authors/copyright owners can hold public records of their copyrighted works and their distribution, and point to it as legal evidence, would certainly be useful, don't you think?


> The rest are philosophical and legal questions that are orthogonal to the technical problem. The point of the suggestion, is to address the technical side of the problem.

That's the problem: you're addressing the most boring and the least important part of the problem that has already been solved. No one in this world has any trouble tracking and attributing songs once a song's metadata is correct. And providing correct metadata is the entirety of the problem which you just glossed over as "philosophical and legal questions that are orthogonal to the technical problem". They are not.


Immutability is quite nearly the opposite of useful here. The most common way for data to be wrong is at the interface boundary: it was entered wrong in the first place.


Immutability, the way I think the author means it, doesn't mean data can't be changed ever, in any way. It just means that modifications (e.g. corrections) cannot completely overwrite previous values without leaving any trace, but instead have to appear as amendments, visible alongside the previous versions of the data.

If it was entered wrong, you can issue a correction, but a trace of the first, wrong value will stay available.

That means nobody can just say something was never what it was, just that it changed (and then have to justify why).
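A small sketch of that kind of "immutability", assuming an append-only log where each entry carries a hash of the previous one. The `AmendmentLog` class and its field names are invented for this example: corrections are new entries, old values stay visible, and editing an old entry in place breaks the chain.

```python
import hashlib
import json


def _digest(body):
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AmendmentLog:
    """Hypothetical append-only log: corrections are added, never overwritten."""

    def __init__(self):
        self._entries = []

    def append(self, field, value, reason):
        prev = self._entries[-1]["hash"] if self._entries else ""
        body = {"field": field, "value": value, "reason": reason, "prev": prev}
        self._entries.append({**body, "hash": _digest(body)})

    def current(self, field):
        """Latest value recorded for a field, or None if never set."""
        for entry in reversed(self._entries):
            if entry["field"] == field:
                return entry["value"]
        return None

    def history(self, field):
        """Every value the field has had, oldest first."""
        return [e["value"] for e in self._entries if e["field"] == field]

    def verify(self):
        """Check that no earlier entry was silently edited."""
        prev = ""
        for e in self._entries:
            body = {k: e[k] for k in ("field", "value", "reason", "prev")}
            if e["prev"] != prev or _digest(body) != e["hash"]:
                return False
            prev = e["hash"]
        return True


log = AmendmentLog()
log.append("writer", "Jon Smith", "initial entry")
log.append("writer", "John Smith", "typo in the original entry")
print(log.current("writer"), log.history("writer"), log.verify())
```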


An interface should help avoid unintentional mistakes. It's easier to detect an invalid signature or hash, when it's paired with another signature or hash, than to detect an incorrect name or other ID.


I know books have ISBN numbers, is there no similar system for CDs? (Or is that model outdated in an age of buying singles on iTunes)


A book is usually a single work attributable to a single author (or a group of authors). And you don't usually stream a book as "let's stream pages 175 to 185" [1]

In the case of music, assigning ISBNs to CDs doesn't really work because people more often than not listen to, stream and broadcast individual tracks. And there's no global ID system for individual tracks. This becomes even more messy when you consider that "Original performance 1983", "remastered 1994", "best of 2003" and "Japanese Christmas Special 2013" are all different tracks (even if it's the same track with the same song), often have different distributors and rights holders, and people will fight to the death defending a particular version :)

[1] Audiobooks are a slightly different beast, but they are still much easier to match to an author.


> And there's no global ID system for individual tracks.

https://en.wikipedia.org/wiki/International_Standard_Recordi... is exactly this, no?


Ah thanks. I (wrongly) assumed tracks also got some kind of ID.


Yes, they’re called ISRC ids. But while they’re used to uniquely identify audio recordings, they don’t solve the problem of metadata being wrong or getting stripped during transfers
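For reference, an ISRC is 12 characters: a two-character prefix (historically a country code), a three-character alphanumeric registrant code, a two-digit year, and a five-digit designation code. A minimal sketch of a format check follows; `parse_isrc` is a made-up helper, and treating the prefix as two letters is my assumption. It only validates the shape of the code, which is the point made above: a well-formed ISRC tells you nothing about whether the attached metadata is right.

```python
import re

# Prefix (2 letters, assumed), registrant (3 alphanumerics), year (2 digits),
# designation (5 digits).
ISRC_RE = re.compile(r"^([A-Z]{2})([A-Z0-9]{3})(\d{2})(\d{5})$")


def parse_isrc(text):
    """Split an ISRC into its parts, or return None if the shape is wrong."""
    compact = text.replace("-", "").upper()
    match = ISRC_RE.match(compact)
    if match is None:
        return None
    prefix, registrant, year, designation = match.groups()
    return {
        "prefix": prefix,
        "registrant": registrant,
        "year": year,
        "designation": designation,
    }


print(parse_isrc("US-S1Z-99-00001"))
print(parse_isrc("not-an-isrc"))
```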


This is a great idea in theory, but how do you come up with an immutable hash of a song that a user can enter into the database? Even if you have a recording, most music these days is stored in a lossy format. If you have the MP3 version, the Apple AAC version, and some high quality .flac version, they're all going to sound mostly the same, but the underlying data is not even close to identical.

How do you hash the audible result in a reproducible way? I'm not trying to bash on the idea, mind; I just think this is an interesting problem and I'm curious if solutions already exist. Surely Shazam and the like have to be doing something in this vein.
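A toy Python illustration of the difference, under heavy assumptions: the "lossy copy" is just the original samples with a small error added, and the fingerprint is a deliberately crude one (one bit per block boundary, set when the energy rises from one block to the next). Real fingerprinters work on spectral features and are far more robust; this only shows why an exact hash breaks while a coarse perceptual summary can survive.

```python
import hashlib
import struct


def exact_hash(samples):
    """Cryptographic hash of the raw 16-bit samples: any change alters it."""
    return hashlib.sha256(struct.pack(f"<{len(samples)}h", *samples)).hexdigest()


def coarse_fingerprint(samples, block=4):
    """Toy fingerprint: '1' where block energy rises, '0' where it does not."""
    energies = [
        sum(s * s for s in samples[i:i + block])
        for i in range(0, len(samples) - block + 1, block)
    ]
    return "".join("1" if b > a else "0" for a, b in zip(energies, energies[1:]))


# Loud, quiet, louder, quieter: four blocks of four samples each.
original = [1000] * 4 + [100] * 4 + [2000] * 4 + [50] * 4
# Stand-in for a lossy re-encode: every sample is slightly off.
lossy = [s + 3 for s in original]

print(exact_hash(original) == exact_hash(lossy))
print(coarse_fingerprint(original), coarse_fingerprint(lossy))
```

The exact hashes differ, while both fingerprints here come out the same because the loud/quiet pattern is unchanged.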


> how do you come up with an immutable hash of a song that a user can enter into the database

I believe gracedb has this; we’ve had decent audio fingerprinting for a bit now.



>This seems like a good application for a publicly auditable, immutable data store

Blockchain?


I wrote a very niche App for people who like Opera. Its most important purpose is to display Opera track listing metadata correctly. (You can see comparison screenshots at https://ariascribe.com). Along the way, I discovered that, even though the genre "opera" has long been recognized, and even has an official ID3v1 tag (103), I have never ever seen the tag used in CDs or downloads. My App has to figure out whether or not an album in a user's collection is an Opera. Should have been trivial. Was in fact hard. Very hard.
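For anyone curious what the "trivial" version would look like: an ID3v1 tag is the last 128 bytes of the file, starts with the ASCII marker `TAG`, and stores the genre as a single byte at the very end. A minimal sketch, with made-up function names, that is not how the app above actually works (since, as noted, the tag is rarely set):

```python
OPERA_GENRE = 103  # genre number mentioned above


def id3v1_genre(data):
    """Return the ID3v1 genre byte from file contents, or None if untagged."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None
    return tag[127]


def is_tagged_opera(data):
    return id3v1_genre(data) == OPERA_GENRE


# Fake file: a few bytes standing in for audio, then a 128-byte ID3v1 tag
# ("TAG" + 124 zero bytes for title/artist/album/year/comment + genre byte).
fake_audio = b"\xff\xfb" + b"\x00" * 10
fake_tag = b"TAG" + b"\x00" * 124 + bytes([OPERA_GENRE])

print(is_tagged_opera(fake_audio + fake_tag))
print(is_tagged_opera(fake_audio))
```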


Hearing lots of people mention blockchain technologies as a solution here.

Enter Passport by Mycelia[0]

---

[0]http://myceliaformusic.org/


Is this one of the very (very) few cases where blockchain could actually make sense? It sounds like you just need some central store to get (cryptographically verifiable, with revision history) metadata.


This is, incidentally, Imogen Heap's Next Big Idea: to give each musician a sort of CV of record on the blockchain called a "creative passport", such that when they present the passport to a record label or other business partner, everything they worked on or contributed to is visible and attested to. It's a big idea, and I don't think even she is aware of the various pitfalls (blockchain is energetically expensive and cannot fix bad data on its own), but dreaming big is just how Imogen Heap do and it may call attention to the problem in a way that will bring forth actual solutions to parts of it.


Blockchain can't fix bad data entry.


We need a template like the spam one for the obvious "this is why blockchain(s) won't help here" suggestions.


I'm actually very negative on blockchain - it is clearly an interesting technology, but one that is desperately looking for a problem to solve.



