When it comes to email, co-registration/hosted lead generation, incentive promotion sites, and search arbitrage, arguably no other group collectively knows more than those operating in our space. One of those specialties, co-registration/hosted lead generation, revolves around the transfer of data. Most often, the data passed from one party to another in this manner follows a strict and rigid format. Advertisers specify exactly which fields they desire and the format for each field. Those familiar with the process, whether technologically inclined or not, know instinctively that they can send leads in real time or in batch, and that real-time delivery tends to occur by HTTP POST. Most business people rarely give other methods for sending data a second thought, but chances are the astute ones have started to hear about XML and its related technologies, RSS and podcasting, more and more in non-techie environments.
Those familiar with HTML will recognize the “ML” in XML, which stands for Markup Language. As opposed to Hypertext, the “HT” in HTML, the “X” stands for Extensible, i.e. flexible. Similar to HTML, XML is a language. Unlike Perl or C, XML is not a programming language. It describes data for another program to interpret rather than spelling out behavior. To better understand XML and its increasing popularity, it helps to take a step back and approach data transfer from the beginning. Almost as soon as computers started becoming mainstream tools for governments, universities, and ultimately businesses, those using them had a desire to send information from one computer to another. Governments, for example, published sets of standards on ways to communicate with them. They might dictate, for instance, that the first eight characters would stand for the first name and the next ten for the last name. Were your name “Mary,” you would send over MARY accompanied by four spaces because, no matter what, the first name must be eight characters. Were one’s name too long, the standard dictated that it be truncated to fit within the eight-character limit. Such a fixed-width format is obviously limited.
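The fixed-width rule described above can be sketched in a few lines of Python. This is only an illustration: the widths of eight and ten characters come from the example in the text, while the function names and the uppercasing of names are assumptions made for the sketch.

```python
# Hypothetical layout taken from the example above:
# 8 characters for the first name, 10 for the last name.
FIRST_WIDTH = 8
LAST_WIDTH = 10


def to_fixed_width(first, last):
    """Truncate or pad each name with spaces so it fills its exact width."""
    first_field = first.upper()[:FIRST_WIDTH].ljust(FIRST_WIDTH)
    last_field = last.upper()[:LAST_WIDTH].ljust(LAST_WIDTH)
    return first_field + last_field


def from_fixed_width(record):
    """Slice the record at the agreed positions and strip the padding."""
    first = record[:FIRST_WIDTH].rstrip()
    last = record[FIRST_WIDTH:FIRST_WIDTH + LAST_WIDTH].rstrip()
    return first, last


record = to_fixed_width("Mary", "Smith")             # short name gets padded
long_record = to_fixed_width("Alexandria", "Smith")  # long name gets truncated
```

Note that the receiving side can never recover the truncated letters of a long name, which is the limitation the paragraph points to.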
Those in the lead generation space will recognize another common type of data transfer, comma-separated values, or CSV. Unlike fixed width, CSV accepts a variable number of characters per field. Whenever a field ends, simply put a comma. This represents a vast improvement over fixed width, but it too has limitations. Were one to use a central lead system for all advertisers, those limitations would start to appear. With CSV, all fields must be specified in advance, and there is no room to add or remove fields dynamically. If one lead buyer wants 25 fields and the other wants 6, the CSV file will need to have 25 fields, most of them left blank for the second buyer. It can work so long as you know all fields ahead of time. The format starts to show its weaknesses once you do not know all fields, which is especially true with hierarchical information.
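A minimal sketch of that limitation, using Python's standard csv module. The field names and the two leads are made up for illustration; the point is only that every row must carry every column, so a lead with fewer fields ends up as a row of commas and blanks.

```python
import csv
import io

# Hypothetical master list of columns, agreed on in advance.
ALL_FIELDS = ["first_name", "last_name", "email", "zip", "phone"]

# Made-up leads: the second one only has two of the five fields.
leads = [
    {"first_name": "Mary", "last_name": "Smith",
     "email": "mary@example.com", "zip": "94105", "phone": "555-0100"},
    {"first_name": "John", "email": "john@example.com"},
]


def leads_to_csv(rows):
    """Write every lead with the full column list, blank where data is missing."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=ALL_FIELDS,
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = leads_to_csv(leads)
lines = csv_text.splitlines()
```

Adding a sixth field later means changing ALL_FIELDS and having every party that reads the file agree to the new layout.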
To better understand the distinction, take an example of transmitting from one machine to another three generations of data for ten families. With less flexible data transfer methods such as CSV, this file would contain up to a hundred fields to accommodate all the possible variations. One family might have ten kids, each with five kids. Another might have just one in each generation. The result will look like a patchwork of names and blanks. With XML, this data file would simply describe only those that existed in each family. As mentioned, XML is a markup language and looks visually similar to HTML. Information is contained within tags; each set of tags is formally called an element and is referred to here as an object. In HTML, for example, you will find a <body></body> tag. You might also find a <table></table> tag, which requires other tags to display correctly, such as the row tag <tr></tr> and the cell tag <td></td>. The same applies with XML, except that, as a rule, a tag can be anything you want so long as the program parsing it knows what each value means.
Take the family example from above. The main object might be <family></family>. Each family might have an id associated with it, which would be a property of the main object, e.g. <id>1</id>. Another property might be <generation>. It too would be an object that belonged to the main object <family>. Each person in the generation might have a property of his or her own, and so on, until one big hierarchy existed underneath the main family. As there are ten families in this example, when all is said and done ten family tags would exist. Each could be expanded to reveal the sub-properties. A format such as this makes it very easy for the other computer to receive this data and know what to do with it, quickly and repeatedly. Just as interesting, almost anyone can create his or her own language in XML. The family example, if executed, would be one such language.
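The family example might be sketched as follows, using Python's standard xml.etree.ElementTree module. The tag names beyond <family> and <id>, the nesting of <person> tags to show who descends from whom, and the sample names are all assumptions made for the illustration, not part of any standard.

```python
import xml.etree.ElementTree as ET

# Made-up data: each family lists only the people it actually has.
families = [
    {"id": 1, "parent": "Ann",
     "children": {"Bob": ["Cara", "Dan"], "Eve": []}},
    {"id": 2, "parent": "Finn",
     "children": {"Gus": ["Hal"]}},
]


def build_document(data):
    """Build one <family> object per family, nesting each generation."""
    root = ET.Element("families")
    for fam in data:
        family = ET.SubElement(root, "family")
        ET.SubElement(family, "id").text = str(fam["id"])
        parent = ET.SubElement(family, "person", name=fam["parent"])
        for child_name, grandchildren in fam["children"].items():
            child = ET.SubElement(parent, "person", name=child_name)
            for grandchild_name in grandchildren:
                ET.SubElement(child, "person", name=grandchild_name)
    return root


def count_people(family):
    """Count every <person> at any depth underneath one <family>."""
    return len(list(family.iter("person")))


# The sending side serializes; the receiving side parses and walks the tree.
xml_text = ET.tostring(build_document(families), encoding="unicode")
parsed = ET.fromstring(xml_text)
counts = {fam.findtext("id"): count_people(fam)
          for fam in parsed.findall("family")}
```

Unlike the CSV version, a family with one child and a family with ten use the same format, and neither carries any blank placeholders.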
Given that data represented using XML can be read, produced and parsed quickly, and that XML provides an easy platform for the creation of unique languages, this leads the discussion to RSS, or Really Simple Syndication. Those in our space most likely have more familiarity with RSS as an upcoming advertising medium than they do as readers of it. RSS is, however, one of many XML languages. Its main object is the tag <rss>, which wraps a <channel> tag, versus the <family> tag in our example. The <channel> tag contains several sub-objects that include title, link, description, and item: all the essentials for listing articles, each article in this case being an <item>.
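To make the structure concrete, here is a small sketch of what an RSS reader does with such a document, assuming the RSS 2.0 layout in which an <rss> tag wraps a <channel> that holds the <item> tags. The feed content and URLs below are invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up feed following the RSS 2.0 layout.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Technology Column</title>
    <link>http://www.example.com/tech</link>
    <description>A made-up feed for illustration.</description>
    <item>
      <title>First article</title>
      <link>http://www.example.com/tech/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>http://www.example.com/tech/2</link>
    </item>
  </channel>
</rss>"""


def list_articles(feed_xml):
    """Return the feed title and a (title, link) pair for each <item>."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    feed_title = channel.findtext("title")
    articles = [(item.findtext("title"), item.findtext("link"))
                for item in channel.findall("item")]
    return feed_title, articles


feed_title, articles = list_articles(SAMPLE_FEED)
```

A real reader would fetch the document from the feed's address on a schedule and show only the items it has not seen before.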
What RSS provides is a way to subscribe to web content without having to enter any personal information. More than that, it allows you to take in much more information in less time by letting you view it all from a unified location: your preferred RSS reader, one of them being Ask Jeeves’ Bloglines. Subscribe to an RSS feed for the technology section of MarketWatch.com, and what you receive is the address of an XML document from which your RSS reader grabs data. Do the same for fifty different columns if desired, and view them all in one place. Almost anything that gets published serially makes for a good candidate, from comic strips to music files. It’s the latter that brings us to podcasting. While complex in name, it refers to an RSS feed that contains one thing that a column’s feed would not: a link to a music file. Just as web browsers can handle both HTML and JavaScript, many RSS readers can follow the link to the music file and download it for you. The Mac contains a built-in RSS reader that sends these music files to iTunes so that they can be downloaded to the iPod, hence “pod” casting.
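As a rough sketch of the podcasting case: in RSS 2.0 the link to the audio file is carried in an <enclosure> tag inside the <item>, with url, length, and type attributes. The feed below is invented, and the helper name is hypothetical; a real reader would go on to download each URL it finds.

```python
import xml.etree.ElementTree as ET

# Made-up podcast feed: one item carries an audio file, one is text only.
SAMPLE_PODCAST = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <link>http://www.example.com/podcast</link>
    <description>A made-up podcast feed.</description>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://www.example.com/audio/ep1.mp3"
                 length="12345678" type="audio/mpeg"/>
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>"""


def find_audio_links(feed_xml):
    """Return (title, url) for every item whose enclosure is an audio file."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("type", "").startswith("audio/"):
            links.append((item.findtext("title"), enclosure.get("url")))
    return links


audio_links = find_audio_links(SAMPLE_PODCAST)
```

Items without an enclosure are simply skipped, which is why the same feed format works for both columns and audio.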
XML matters because a fundamental shift is occurring in the way information is published and consumed. Those in our space have excelled not just by finding inefficiencies in price but also by understanding how information changes hands. Talk to recent graduates and they do not know of the Internet bubble. They do not know about much of what drove our success. They don’t enter their names for freebies. They do read blogs. They also write them. They are the super users, ones who publish data and consume data in ways that did not exist as little as two years back. They are not yet the majority, but they will be, and we need to know how to reach them.