(Additional contribution by Dan Charlson & Amy Vandergon)
Structuring data so it can be easily interpreted and transported is critical for modern digital music infrastructure. The most common structured data solutions are called markup languages, which take an exceptionally simple approach to structure.
Markup languages simply mark sections of a document (hence “markup”) with a descriptive label called a “tag.” These tags are then used by other software to properly display or ingest that data.
Markup languages typically use ordinary words rather than dense code syntax and symbols. This makes them more user-friendly, and they can often be read by eye alone. The two most popular markup languages are HTML and XML. XML is also known by its long-form name: eXtensible Markup Language.
What is XML?
Markup languages don’t actually perform operations. Their job is to describe and organize data for software that understands those descriptions and can act on the data. HTML is made up of tags (<head>, <body>, <div>, <p>, etc.) that were agreed upon when the language was built. Every HTML developer must use only these tags to categorize data, or the software built to interpret and display the data (web browsers) will not render the elements correctly.
XML differs in that it allows a developer to create any tag they like. This is why it is known as extensible. XML developers are not limited to a predefined set of tags to describe elements of data. Data can be organized with any tag imaginable. While XML resembles HTML, its data structuring potential is far greater. In a sense, XML is a framework that allows you to create your own language for describing and transporting data.
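As a minimal sketch of that extensibility, the snippet below builds a small XML fragment with Python's standard `xml.etree.ElementTree` module. The tag names (`song`, `title`, `tempo_bpm`) are invented for illustration; XML itself does not define them, and they are not taken from any real music schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags: XML imposes no fixed vocabulary, so any
# legal element name can be used to describe the data.
song = ET.Element("song")
ET.SubElement(song, "title").text = "Track Name"
ET.SubElement(song, "tempo_bpm").text = "120"

# Serialize the element tree to a plain XML string.
xml_text = ET.tostring(song, encoding="unicode")
print(xml_text)
```

Any program that knows what `tempo_bpm` means can use the value; a program that does not can still parse the document, because the structure follows the same XML rules.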
Computer systems often contain data in incompatible formats. To move data between these systems, large amounts must be converted, and data that does not fit the target format is often lost. For this reason, industries, organizations, and developer communities agree on shared XML specifications or standards. This makes it much easier to create compatible software programs, regardless of how they’re built or where they’re situated.
XML is a common choice for exporting structured data and for sharing data between programs or companies.
How does XML service digital music?
Let’s say Company A wants to send Company B information about a new music album. If they were using email, Company A would simply write the information down in an easy-to-understand human format. Company B could then take that information and enter it properly into their system.
Title – Album Title
Artist – Album Artist
Track 1 – Track Name
Track 2 – Track Name
Track 3 – …
But what if Company A wanted to send the information from their system directly into Company B’s system? Since the systems are almost certainly not compatible, there must be an intermediary step where the data is rewritten in a common language. This is where XML comes in. The two companies agree on a set of XML tags and their hierarchy (referred to as a “schema”), then map those values to their own systems. It may look something like this:
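A minimal sketch of the idea, assuming a hypothetical schema with `album`, `title`, `artist`, `tracks`, and `track` tags (these names are illustrative and are not MediaNet's actual schema). Company A would send the XML text; Company B could read it with Python's standard library and map the values into its own system:

```python
import xml.etree.ElementTree as ET

# Hypothetical album document using tags both companies agreed on.
ALBUM_XML = """
<album>
  <title>Album Title</title>
  <artist>Album Artist</artist>
  <tracks>
    <track number="1">Track Name</track>
    <track number="2">Track Name</track>
  </tracks>
</album>
""".strip()

def parse_album(xml_text):
    """Map the agreed XML tags onto a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "artist": root.findtext("artist"),
        # Each track becomes a (number, name) pair.
        "tracks": [(int(t.get("number")), t.text) for t in root.iter("track")],
    }

album = parse_album(ALBUM_XML)
print(album["title"], album["artist"], album["tracks"])
```

The receiving side never needs to know how Company A stores albums internally; it only needs the agreed tag names and hierarchy.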
While to a human the data looks virtually identical, digital systems can process data written in a common tongue much more easily. XML allows digital music services to transmit and synchronize massive amounts of content information between incompatible systems easily and accurately.
How does MediaNet use XML?
As shown above, XML is a critical part of digital music data delivery to MediaNet. Without XML feeds, content data being added to or updated in our library would require large databases or spreadsheets. Each addition (or batch of additions) would require a human to package it and send it to us, where another human would then integrate it into our system.
Thanks to XML, this is not necessary. Every time a new piece of data needs to be added to the MediaNet catalog by our Content Partners, it is simply added to the feed and picked up by our system.
MediaNet maintains its own sophisticated XML content ingestion schema. It consists of over 30 top-level elements, expanded into hundreds of metadata sub-values indicating data points such as rights, territory, currency, and usage for every track, album, artist, and composer. All told, a typical album can consist of more than 3,000 lines of XML markup, information that ensures data is ingested into our systems accurately and completely.
What challenges does XML pose?
While the beauty of XML is its dead-simple nature, that doesn’t mean there aren’t challenges in using it. Beyond the inherent possibility of coding errors, the most common XML challenge is also one of its most useful features: automation.
Using XML with custom software platforms means the schema must be agreed upon at both ends of the feed. Both the tags and the acceptable values inside these tags are built into the automated software that receives them. This automation can easily be tripped up if the tags are used incorrectly or unacceptable values are entered.
Most systems require XML information to be two things:
- Well-formed – meaning that it adheres to the syntax rules of the XML spec itself
- Valid – meaning that it properly follows the agreed schema
Errors in either of these categories can cause XML data transfers to stutter or fail entirely.
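The two checks can be sketched as follows. Well-formedness is tested by simply parsing the text. Validity is normally checked against a formal schema language such as XSD or a DTD, which Python's standard library does not validate, so the hand-written rule here (an `album` root that must contain `title` and `artist`) is only a hypothetical stand-in for a real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema rule: every album needs these child tags.
REQUIRED_TAGS = ("title", "artist")

def check_album(xml_text):
    """Return 'not well-formed', 'invalid', or 'ok' for one document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # The text breaks XML syntax itself (e.g. an unclosed tag).
        return "not well-formed"
    if root.tag != "album":
        return "invalid"
    missing = [tag for tag in REQUIRED_TAGS if root.find(tag) is None]
    # Syntactically fine, but it does not follow the agreed structure.
    return "invalid" if missing else "ok"

print(check_album("<album><title>A</title></album"))
print(check_album("<album><title>A</title></album>"))
print(check_album("<album><title>A</title><artist>B</artist></album>"))
```

A document has to pass the first check before the second can even be attempted, which is why the well-formedness test comes first.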
How does MediaNet help solve those challenges?
Such automation and data input errors can, but don’t have to, halt or destroy XML data transport. MediaNet uses a three-part system to resolve feed errors, enhance data, and keep transports running smoothly:
- Erroneous data is identified and filtered from the feed into a separate error queue, leaving the rest of the feed free to ingest into our system.
- Our Content Operations Team prioritizes and manages our error queue daily to avoid long-stay metadata errors, and resolves many in the process.
- Our Rights Management Team manually uncovers and verifies additional information and data to ensure rich, accurate entries.
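The first step, filtering bad records into an error queue while the rest of the feed continues, can be sketched roughly as below. This is an illustration only and not MediaNet's actual pipeline: the function name `split_feed`, the record format, and the single validity rule (a record must contain a `title` tag) are all assumptions.

```python
import xml.etree.ElementTree as ET

def split_feed(records):
    """Separate a feed into accepted records and an error queue."""
    accepted, error_queue = [], []
    for xml_text in records:
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError as exc:
            # Not well-formed: set aside with the parser's message.
            error_queue.append((xml_text, f"not well-formed: {exc}"))
            continue
        if root.find("title") is None:
            # Hypothetical validity rule standing in for a full schema.
            error_queue.append((xml_text, "invalid: missing <title>"))
            continue
        accepted.append(root)
    return accepted, error_queue

feed = [
    "<album><title>A</title></album>",
    "<album><title>B</album>",   # mismatched closing tag
    "<album/>",                  # well-formed but missing <title>
]
accepted, error_queue = split_feed(feed)
print(len(accepted), "accepted;", len(error_queue), "queued for review")
```

Because each record is handled independently, one bad entry does not block the others; the queued entries keep their reason string so a person can review and correct them later.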
While our systems are automated by using the most effective language for collaborative data exchange, ensuring the highest possible quality of data still requires a human touch.
MediaNet’s data is the cleanest, most accurate data in the music industry because our database is built on an unbeatable combination of enterprise technology and human intuition. We are almost always able to resolve XML feed errors without any need for Content Partner involvement.
XML is the humble technology that drives many of the most successful digital music technologies. Digital music infrastructure costs would rise dramatically without it. The degree of difficulty for integrating systems would increase.
Most importantly, the ability of digital music systems to flex and change with need would disappear.