List Hygiene

Toward Superior Merge/Purge

A superior merge/purge operation can save business-to-business mailers money and even improve the profitability of their prospecting efforts. The challenges involved with such a program are much greater than those of a consumer merge. For example, no single field can be relied upon to link duplicates, whereas in consumer merges the address is a good link for combining records with disparate names.

Many businesses have post office boxes and street addresses for one location, so address elements alone are not enough to identify matches. Further, businesses often have several operating names, so company name is not enough of a link to identify all duplicates. Only by considering these and many additional elements is a superior B-to-B merge/purge created.
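As a rough illustration of weighing several fields together, consider the minimal Python sketch below. The field names, the weights and the use of a generic string-similarity measure are all assumptions made for this example; they do not describe any bureau's actual software.

```python
from difflib import SequenceMatcher

# Hypothetical weights; a real system would tune these against audited duplicate sets.
WEIGHTS = {"company": 0.4, "street": 0.3, "zip": 0.2, "phone": 0.1}


def similarity(a, b):
    """Return a 0..1 similarity between two strings, ignoring case and outer spaces."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0  # a missing field contributes no evidence either way
    return SequenceMatcher(None, a, b).ratio()


def match_score(rec_a, rec_b):
    """Combine per-field similarities into one weighted score between 0 and 1."""
    return sum(
        weight * similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


# Same business, once at its post office box and once at its street address.
po_box = {"company": "Acme Tool Co", "street": "PO Box 12", "zip": "06810", "phone": "2035550100"}
street = {"company": "Acme Tool Co", "street": "45 Mill Rd", "zip": "06810", "phone": "2035550100"}
print(round(match_score(po_box, street), 2))
```

In this sketch the two records still score highly even though the address lines disagree, because the company name, ZIP code and phone number agree. Where to set the cutoff between a duplicate and a unique record is a separate tuning decision.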

We evaluated nine B-to-B service bureaus. Each was provided with identical input files and instructions. The bureaus were asked to perform an employee-level merge and a one-per-company merge.

In addition to standard reports, we received a complete listing of the net output records and duplicate sets at both the employee and company level in specified geographies. These reports were used for intensive manual review and auditing.

An important element of merge/purge is list conversion. That’s because the quality of data coming from list rental fulfillment files is highly variable. This variability becomes more of an issue when rental orders are fulfilled directly from a list owner’s in-house fulfillment system.

To test conversion capabilities, some fields were scrambled on the input files: employee names were placed in the company name field, titles in the employee name field, and so on. We found that inadequate file conversion generally resulted in poorly formatted address information. In several cases, suite numbers ran into street numbers.

Merge/purge software can usually identify a large percentage of duplicates, even when suite number information is combined with street information. However, problems parsing suite numbers can reduce the number of records coded with ZIP+4. This lessens a mailer’s ability to achieve postal discounts.
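To make the parsing step concrete, here is a minimal Python sketch that separates a trailing suite designator from the street line. The keyword list and the pattern are assumptions chosen for illustration; production address parsers handle far more variations than this.

```python
import re

# Hypothetical pattern: a trailing "Suite 200", "Ste. 4", "Unit B", "Rm 12" or "#4B".
SUITE_RE = re.compile(
    r"[\s,]*(?:\b(?:suite|ste|unit|room|rm)\b\.?|#)\s*([A-Za-z0-9-]+)\s*$",
    re.IGNORECASE,
)


def split_suite(address_line):
    """Return (street, suite); suite is an empty string when none is found."""
    match = SUITE_RE.search(address_line)
    if match is None:
        return address_line.strip(), ""
    return address_line[: match.start()].strip(), match.group(1)


print(split_suite("100 Main St, Suite 200"))
print(split_suite("100 Main St #4B"))
print(split_suite("45 Mill Rd"))
```

A record whose suite cannot be separated is returned unchanged rather than discarded, so it can be flagged for review instead of being silently lost.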

Some systems that were unable to parse information correctly dropped the records altogether. Others dropped few or no records, even though they were unable to correctly parse data. Both occurrences are undesirable.

We found less variability in the quality of conversion than in the merge/purge itself. This differs from previous tests, which showed as much variability in conversion capabilities as in merge/purge.

One bureau had to rerun the test because, on its first attempt, it consistently left job title information in the individual name field. This common problem can cause many unique records to appear as duplicates because they have the same title in the name field.
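One simple safeguard is to flag name fields that look like job titles before matching begins. The Python sketch below assumes a small, hypothetical word list; it is an illustration only, and a real conversion routine would need a much larger vocabulary and a way to handle surnames that happen to be title words.

```python
# Hypothetical vocabulary of words that usually signal a job title, not a person.
TITLE_WORDS = {
    "president", "manager", "director", "buyer", "owner", "engineer",
    "purchasing", "chief", "officer", "supervisor", "vice",
}


def looks_like_title(name_field):
    """Return True if any word in the name field is a known title word."""
    words = [word.strip(".,").lower() for word in name_field.split()]
    return any(word in TITLE_WORDS for word in words)


print(looks_like_title("Purchasing Manager"))
print(looks_like_title("Jane Smith"))
```

Records flagged this way could be routed back through conversion so the title is moved to its own field, rather than being matched on a shared title.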

The merge/purge analysis showed large differences among the service bureaus. The better software identified up to 62% more duplicates at the individual level than that of other participants. As a result, the net output file from the poorer merge was 8% larger than that from the best merge. Few mailers can afford a campaign in which 8% of the mail is waste.

It is important to note that the merge that identified the most duplicates was not the best. Many of the duplicates in the merge with the lowest net output were records that were incorrectly matched. This is overkill. An example is: