Cash From Chaos

Most database records can be ranked alphabetically or numerically by information in their fields. Related data, such as ZIP codes, can be clustered for targeted marketing efforts. Data found in “comments” fields, however, is difficult to quantify or code.

PerformanceData, the information services unit of Chicago-based consumer information supplier Trans Union, recently added structure to the chaos of comments fields on 14 million records. As a result, it wound up with a completely new product to sell without having to purchase any additional data.

When PerformanceData approached Trillium Software, a division of Billerica, MA-based Harte-Hanks, in September 1997, it had already chosen a rival company to clean 28 million names from its 163-million-name consumer database. But due diligence revealed that the rival, while strong on name and address correction, left something to be desired on less-quantifiable data.

“Our intention was to get vehicle data,” says Steve Olson, a programmer/research analyst with PerformanceData. While other sources, such as Polk, could provide some of the automotive information PerformanceData was looking for, several states have placed limitations on the type of data available from their motor vehicle departments. Besides, the data was already in PerformanceData’s computers. It just had to be recast into a usable format.

Comments fields had been compiled primarily from banks and lending institutions. There had not been a standard method of recording vehicle information. A Ford Explorer, for instance, might be referred to as FRD, FRD EXP, 89FRD (for a model year of 1989) or even by a vehicle identification number.

“Through context-sensitive programming [involving scanning, mining and pattern recognition] and having prebuilt [basic vehicle code] tables, we were able to provide PerformanceData with a solution immediately,” says Trillium director of marketing Leonard DuBois.
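To make the idea concrete, here is a minimal sketch in Python of how free-text entries such as FRD, FRD EXP or 89FRD might be resolved against prebuilt code tables. It is an illustration only, not Trillium's method: the tables, the token pattern, the function name parse_vehicle_comment and the rule that a two-digit prefix means a 1900s model year are all assumptions made for this example, and vehicle identification numbers are simply skipped.

```python
import re

# Hypothetical lookup tables, invented for this sketch. The contents of
# Trillium's actual vehicle code tables are not given in the article.
MAKES = {"FRD": "Ford", "FORD": "Ford"}
MODELS = {("Ford", "EXP"): "Explorer", ("Ford", "EXPLORER"): "Explorer"}

# A token is an optional two-digit year prefix followed by letters, e.g. "89FRD".
TOKEN = re.compile(r"^(\d{2})?([A-Z]+)$")


def parse_vehicle_comment(comment):
    """Pull make, model and year out of a free-text comment, where possible."""
    result = {"make": None, "model": None, "year": None}
    for raw in comment.upper().split():
        match = TOKEN.match(raw)
        if match is None:
            continue  # e.g. a VIN or other mixed text; not handled here
        year_prefix, word = match.groups()
        if word in MAKES:
            result["make"] = MAKES[word]
            if year_prefix is not None:
                # Assumption: two-digit years belong to the 1900s.
                result["year"] = 1900 + int(year_prefix)
        elif result["make"] is not None and year_prefix is None:
            # Context-sensitive step: a model code is only interpreted
            # relative to the make already found in the same comment.
            model = MODELS.get((result["make"], word))
            if model is not None:
                result["model"] = model
    return result
```

Under these assumptions, parse_vehicle_comment("89FRD EXP") yields a Ford Explorer with a model year of 1989, while a comment containing no recognized make code leaves all three fields empty.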

In its initial scenarios, PerformanceData had hoped to recover such information on 9 million records, at best. When Trillium’s work was finished, make, model or year data was discernible for 14 million of the 27 million records submitted.

Trillium’s cleaning allowed new fields, with more detailed information, to be set up. “We didn’t have any of it on the records before. We had tags [on many of the records] that would say whether or not you had an auto loan, or had leased your vehicle. Nothing on make, model or year of the vehicle,” says Olson.

While PerformanceData currently offers only automobile data, the company eventually intends to supply data on motorcycles, boats, campers and recreational vehicles.