Getting Close to Proximity

My company is currently involved in a database project for an automotive client that needs to select customers and prospects who live within a given radius of one of their dealers. This requirement is also common for retail businesses that want to solicit people living within a certain distance of a store. These kinds of selections are known as “proximity” or “radiating” selects.

The way most people run proximity selects is to simply pull everyone who lives within either the same ZIP code as the designated site or an adjacent one. The drawback to doing it this way is that it’s a bit of a pain finding and entering the ZIP codes you want, and it’s not all that effective when dealing with very large or very tiny ZIPs. (While a ZIP can cover an area as small as a single building, I found one for Burns, Oregon that takes in 1,169 square miles!)

Wouldn’t it be so much easier if you could just input something like: Dealer = X AND Distance < 10 to pull everyone within 10 miles of dealer X? Well, thanks to the Scarecrow and yours truly, now you can! The Scarecrow, of course, is none other than the original inventor of the Pythagorean theorem, which he first uttered in the unforgettable line from the movie “The Wizard of Oz”: “The square of the hypotenuse of an isosceles triangle is equal to the sum of the squares of the other two sides.” (Note: Mathematical purists will point out that the actual formula calls for a right triangle, not an isosceles one, but “isosceles” sounds snappier.)

You may be wondering what “The Wizard of Oz” has to do with radiating selections. Well, according to my beloved 7th grade algebra teacher Mr. McAlpine, the Pythagorean theorem allows us to employ something called the “distance formula” to calculate the distance between any two points.

Here’s what the formula means: If we have the latitude/longitude coordinates of our dealers as well as our customers, we can just crank them through the distance formula to figure out how far apart each customer is from a given dealer, right? Well, almost. You see, lat/long coordinates are generally given in decimal degrees, which is 1/360th of a circle. Unfortunately, we need an answer in miles.

When originally confronted with this problem, I adopted my customary never-say-die attitude and devoted a full two minutes of fierce concentration before giving up. Luckily, however, there are many smart people out there who’ve figured out the solution. A search online of “distance formula latitude” will find you many sites with the formula. Believe it or not, you can actually find a formula that takes two sets of latitude/longitude coordinates and returns the distance between them in miles.

How do you get lat/long coordinates if you don’t have them? Fear not, there are loads of service bureaus that will gladly take your money and append them to your file by matching on street address. If you don’t need extreme accuracy, the post office has something called the Tiger (Topological Integrated Geographic Encoded Referencing) file that you can use to append lat/longs at a ZIP and ZIP+4 level. Basically this means that everyone inside the same ZIP+4 will share the same lat/long. The accuracy isn’t as good, but at least you don’t have to find and input lists of ZIP codes whenever you run your selects.

Now that we’ve put latitude/longitude coordinates on our file and learned a formula to compute distances between them, how do we build a database query that pulls everyone within 10 miles of a dealer? Simple. Rather than trying to key in that square root/degree conversion thingamajiggy, let’s delegate the problem to our friendly database administrator, whose merits have so often been lauded in this column, and have her create something called a view, which is something that looks like a table but isn’t. (More on this subject in a bit.)

When asked politely, our DBA will cheerfully create a view called Customer_Dealer in just a few minutes that (a) pulls customer data along with dealer data, and (b) embeds our lat/long distance formula so that we don’t have to mess around with it. To do so, in addition to pulling the data from these two tables, she’ll simply include our formula as an “expression” and give it a title of Distance. Once this is done, we can enter Dealer = ‘X’ and Distance < 10 to run our select against this view.

Views provide a pretty cool way of dealing with messy problems like this. Basically, think of a view as an SQL query that’s assigned a name such as Customer_Dealer. Views look and act just like database tables, except that they “execute” (run) when you reference them. I like to use views to hide marketers from the yucky complexities of database design. In this example, our view retrieves data from two different tables, and also embeds a calculation that uses lat/long points from each. Other common and convenient uses for views include:

  • Applying standard suppressions from the DMA pander file, the telemarketing suppress file and the deceased persons file.

  • Linking together commonly used combinations of tables.

  • Embedding other formulas, such as age calculations based on the difference between the current date and a given date in the file. For instance, a publisher might calculate the days until a subscription expires by subtracting today’s date from the expiration date, returning the difference in days. This allows it to run “renewal” campaigns to select everyone whose subscription expires within 90 days.

  • Enforcing database security. For example, you might restrict a dealer or location such that they only see data that belongs to them.

    You also might consider using the distance formula if you’re trying to:

  • Create a “home” dealer field and store it on the base customer table. This is simply the geographically closest dealer to each customer’s address. This field makes it fast and easy for dealers to analyze and market to customers who are “assigned” to them.

  • Calculate and store the distance in miles to each customer’s home dealer. Doing this improves the performance of your radiating selects drastically, since it doesn’t have to be recomputed at run time.

  • Examine instances where people live much closer to one location yet prefer to do business with another. This might help you locate trouble spots (also known as “slackers”) in your sales and service departments.

Impress your boss by reciting the Pythagorean theorem at the next company meeting. Just make sure he hasn’t watched “The Wizard of Oz” recently.

Jeff Fowler is president and founder of Decision Software Inc., Lanham, MD.