[ic] Donation: Geocode and proximity usertags (GIS functions)
Christopher Wenham
cwenham at synesmedia.com
Wed Dec 17 21:47:15 EST 2003
I recently had to develop some GIS-ish functions for a couple of
projects. The following usertags are a result, which I've released
under the GPL.
Geocode is a wrapper for a third-party service at
http://www.geocoding.com/, which will take any street address in the
US and convert it to Earth coordinates (Latitude and Longitude, like
on a GPS).
Proximity makes use of a table full of geocoded street addresses to
find stations within a search radius centered on a Zip code or another
pair of coordinates. EG: "Find all stores within 10 miles of zip code
11756". Since it searches by Earth coordinates, it will find stations
that may not be in the same zip or state, but still be within the
given radius.
With the two you can create a classic "store finder" quickly.
The third-party geocoding service isn't expensive, running at about
$90 for 3,000 geocoding credits. To make zip-code searches work with
the proximity tag you'll need a table of zip codes and their
corresponding Earth coordinates. The US Census Bureau provides such a
table for free on their web site (vintage 1990), but I've packaged a
version ready for use with Interchange on my web site:
http://www.synesmedia.com/twiki/bin/view/Cyc/Proximity
UserTag geocode Order address city state zip
UserTag geocode addAttr
UserTag geocode Documentation <<EOD
By Chris Wenham of Synesmedia, Inc. - www.synesmedia.com
This software is distributed under the terms of the GNU Public License.
Version 1.0, December 15, 2003.
Geocode a US street address, using a third-party online service at
http://www.geocode.com/ (Tele Atlas). Returns Earth coordinates in
latitude and longitude. Useful for building 'store finder' utilities,
and for saving you the chore of hitting the streets with a GPS receiver.
Installation
1) Go to http://www.geocode.com/ and sign up for their Eagle Test Drive.
You will get a username and password and should also get a few free
geocoding credits for testing with.
2) Go to their Download area and download the "EZ-Locate Sample Source
Code v1.xx". When you compile this on your Linux/Unix machine it should
produce an executable called "rie".
3) Add the following variables to your Interchange catalog's variables.txt
GEOCODE_PATH /path/to/rie Geocoding
GEOCODE_LOGIN your_login Geocoding
GEOCODE_PASS your_pass Geocoding
4) Put geocode.tag in your Interchange's code/UserTag directory. Restart.
5) Pay the nice people at geocode.com for some more geocoding credits.
Synopsis
<p>The Empire State Building is at lat./lng. [geocode
address="350 5th Ave"
city="New York"
state="NY"
zip="10118"
] on your GPS unit</p>
Will print:
The Empire State Building is at lat./lng. 40.748318,-073.985223 on
your GPS unit.
Will return a comma separated latitude and longitude. This can be parsed
and stored in separate lat/lng fields in your addresses table so you can
use it later with the [proximity] tag. It could also be passed untouched
into the origin="" parameter of the proximity tag for use when doing a
radius search, like this:
[proximity origin="[geocode address='...' city='...' state='..' zip='..']"]
<p>{ADDRESS}, {CITY}, {STATE} {ZIP}</p>
[/proximity]
At the time of writing, Tele Atlas provides about 3,000 geocoding credits
for $90, and it's cheaper per coding if you buy more. However, since
buildings don't generally pick up and move anywhere, it might make sense
to cache the results or even store them permanently so you don't waste
credits.
Side effects
If you don't feel like parsing the results, the lat and long are also set
in the following temporary scratch variables.
gc_latitude
gc_longitude
The latitude in longitude, in 15-digit precision.
gc_popdens
Tele Atlas gets a lot of its data from the US Census Bureau, and
they've seen fit to include some bits and pieces in the results for
each address. I have no use for most of this, but the population
density rating looked interesting, so I decided to set it as a
side-effect variable just in case it became useful.
The possible values are:
U = Urban
R = Rural
Blank = unspecified
Future improvements
This tag should probably also parse and store the Postal Standardized
version of the adddress, plus the match type.
EOD
UserTag geocode Routine <<EOR
sub {
my ($address,$city,$state,$zip,$opt) = @_;
#Debug("Geocoder inputs: $address, $city, $state, $zip, $opt");
my $cmd = $Variable->{GEOCODE_PATH};
my @args;
push @args, '-l';
push @args, '-u';
push @args, $Variable->{GEOCODE_LOGIN};
push @args, '-p';
push @args, $Variable->{GEOCODE_PASS};
push @args, "-g $address|$city|$state|$zip";
#Debug("Args: $cmd ". join(' ', @args));
my $out = '';
open(GEO, '-|') || exec ($cmd, @args);
while (<GEO>) {
$out .= $_
}
my ($count,$header,$data) = split /\n/,$out;
my @results = split /\|/,$data;
if (@results) {
my ($lat,$lng,$pop) = ($results[7],$results[8],$results[21]);
$Tag->tmp('gc_popdens',$pop);
$Tag->tmp('gc_latitude',$lat);
$Tag->tmp('gc_longitude',$lng);
#Debug("About to return");
return "$lat,$lng";
} else {
return;
}
}
EOR
UserTag proximity Order zip
UserTag proximity addAttr
UserTag proximity hasEndTag
UserTag proximity Documentation <<EOD
By Chris Wenham of Synesmedia, Inc. - www.synesmedia.com
This software is distributed under the terms of the GNU Public License.
Version 1.0, December 15, 2003.
Find locations within a search radius of an origin, specified either as
a zip code or a latitude/longitude pair. Format the results with a user
supplied template.
This tag and others can be found at
http://www.synesmedia.com/twiki/bin/view/Cyc/Proximity
Requirements
* Zip code lookup table (may have come with this tag, otherwise, check
the US Census Bureau's web site to get a free one:
http://www.census.gov/geo/www/gazetteer/places.html )
* A table with geocoded locations. (Suggesion: add numeric "lat" and
"lng" fields to any table of street addresses, then use a geocoding
service such as http://www.geocode.com/, or a handy GPS unit and a
lot of walking, to find their Earth coordinates)
Store and pass coordinates expressed in degrees with double precision
(15 digits), where zero on each plane is the Prime Meridian (longitude)
and Equator (lattitude).
Synopsis
[proximity zip="90210"]
<p>{NAME} <br>{ADDRESS}, {CITY}, {STATE} {ZIP} is only {DISTANCE} miles from you.</p>
[/proximity]
Arguments
zip="90210"
Any zip or postal code that will match up against an entry in your
Zip lookup table. This tag will expect to find GPS coordinates in
that table's "lat" and "lng" columns.
If you provide a US zip code, this tag will first attempt a direct
match, but if it can't find the zip in your lookup table, it will
search up and down until it does find one.
If you provide a zip code, you don't need to use the origin argument
below.
origin="-73.2412348,140.33458734"
GPS coordinates to use instead of looking up a Zip code. Specified
as lat/lng. Accepts the same format as the output of the [geocode]
tag.
max_radius="10"
Maximum radius, in miles, to search within. Defaults to 10.
min_radius="0"
Minimum radius, in miles. Defaults to 0.
max_results="10"
Maximum number of results to return. Defaults to 10.
offset_results="0"
Skip the first n results. This can be used with max_results to do
basic pagination. It's up to you whether you want to do pagination
based on min/max radius, or offsets.
See the Side Effects section for scratch variables set by this tag
with the last/least distant station's distance.
query="SELECT * FROM addresses WHERE (lat < {LATHIGH} AND lat > {LATLOW}) AND (lng < {LNGHIGH} AND lng > {LNGLOW})"
A SQL query that will return a list of geocoded addresses to search
for proximity to the origin. If your database does not have any GIS
features, such as MySQL, then this query should make use of
the {LATHIGH}, {LATLOW}, {LNGHIGH} and {LNGLOW} placeholders. These
are substituted for the coordinates of a bounding rectangle. All
these do is restrict the search space for the "proper" radius search
that comes later. It'll still work if you pass
"SELECT * FROM addresses", but performance will suffer on large
tables.
origin_query="SELECT * FROM zips WHERE zip = '{ZIP}'"
The proximity tag assumes you have a table called "zips" with three
columns: zip, lat, lng. If you pass a zip code to search from, then
the tag will look for it here. If, however, you have a different
table name and structure, you can override the whole SQL query here.
NOTE: If it doesn't find a match on the zip code that's initially
passed, the tag will search up and down numerically until it does.
Therefore, you need to keep the '{ZIP}' so the tag can substitue for
whatever zip code it's searching for at the time. DO NOT try to
pass the zip code this way. Use the zip="" parameter instead.
lat_field="latitude"
lng_field="longitude"
origin_lat_field="zip_lat"
origin_lng_field="zip_long"
The proximity tag also assumes the column names for latitude and
longitude are called "lat" and "lng" in your tables, respectively.
If this isn't the case, set their actual names with these parameters.
header="<p>The following were found within 10 miles of {CITY}, {STATE}</p>"
footer="<p>Distances calculated from approximate center of {ZIP}</p>"
An optional header and footer displayed only if results were found.
Any field returned by the origin_query can be addressed with
"{FIELDNAME}" template placemarkers.
noresults="<p>Sorry, there were no results for {ZIP} in {CITY}, {STATE}</p>"
Optional template to display a message when there are no results.
Optional configuration settings
The following variables can be set in your catalog's variable.txt
PROX_QUERY
PROX_ORIGIN_QUERY
PROX_LAT_FIELD
PROX_LNG_FIELD
PROX_ORIGIN_LAT_FIELD
PROX_ORIGIN_LNG_FIELD
These all correspond to the parameters of the same name, described
above.
Side effects
This tag will set the following temporary scratch variables before exit.
prox_furthest_dist
The distiance, in miles, that the furthest "station" was found and
included in the results. This may not be the furthest station in the
database, just the furthest that still came within max_radius.
prox_nearest_dist
Like the above, but the distance to the nearest found station.
NOTE: Both prox_furthest_dist and prox_nearest_dist are saved with
20-digit precision. This is so you can pass them as arguments to
min_/max_radius as accurate starting/ending points. You might want
to format these numbers before you display them as user info.
prox_display_count
The number of stations displayed in the results. Use this to find
out if there were fewer than max_results stations found.
EOD
UserTag proximity Routine <<EOR
use Math::Trig;
sub distance {
my ($lat11,$long1,$lat21,$long2) = @_;
my $lat1 = deg2rad($lat11);
my $lat2 = deg2rad($lat21);
my $deltalat = deg2rad($lat1)-deg2rad($lat2);
my $deltalong = deg2rad($long1)-deg2rad($long2);
my $dist = sprintf "%05.15f", 1.1515*60*rad2deg(acos (sin($lat1) *sin($lat2) + cos($lat1) * cos($lat2) * cos($deltalong)));
}
sub {
# Options gathering ----------------------------------------------
my ($zipcode,$opt,$tpl) = @_;
my $ref = ::database_exists_ref('zips');
$opt->{origin_query} ||= $Variable->{PROX_ORIGIN_QUERY} || "SELECT * FROM zips WHERE zip = '{ZIP}'";
$opt->{query} ||= $Variable->{PROX_QUERY} || "SELECT * FROM affiliate WHERE (lat < {LATHIGH} AND lat > {LATLOW}) AND (lng < {LNGHIGH} AND lng > {LNGLOW})";
$opt->{max_radius} ||= 10;
$opt->{max_results} ||= 10;
$opt->{max_zip_tries} ||= 10;
$opt->{lat_field} ||= $Variable->{PROX_LAT_FIELD} || 'lat';
$opt->{lng_field} ||= $Variable->{PROX_LNG_FIELD} || 'lng';
$opt->{origin_lat_field} ||= $Variable->{PROX_ORIGIN_LAT_FIELD} || 'lat';
$opt->{origin_lng_field} ||= $Variable->{PROX_ORIGIN_LNG_FIELD} || 'lng';
my $origin = {};
if ($opt->{origin}) {
($origin->{lat}, $origin->{lng}) = split ',',$opt->{origin};
}
# ------------------------------------------------------------- ##
# Find nearest zip-code centroid ---------------------------------
if (!$origin->{lat}) {
$zipcode =~ s/^\s+//;
$zipcode =~ s/\s+$//;
$zipcode = substr($zipcode,0,5);
my $higherzip = my $lowerzip = my $tryzip = $zipcode;
$higherzip = sprintf '%05d',++$higherzip;
# Find nearest recognized zip code
for (my $i = 0; $i < $opt->{max_zip_tries}; $i++) {
my $query = $opt->{origin_query};
$query =~ s/{ZIP}/$tryzip/;
$origin = $ref->query({ sql => $query, hashref => 'origin' });
last if $origin->[0]->{zip};
if ($tryzip eq $lowerzip) {
$tryzip = $higherzip;
$higherzip = sprintf '%05d',++$higherzip;
}
else {
$lowerzip = $tryzip = sprintf '%05d',--$lowerzip;
}
}
$origin = $origin->[0];
}
# ------------------------------------------------------------- ##
# Get a rough bounding rectangle. This is only so we can work with a
# smaller slice of the database, and deliberately errs on being too
# big. It is _NOT_ how we select the final results.
my ($latlow,$lathigh,$lnglow,$lnghigh) = (-180,180,-180,180);
$latlow = $origin->{lat} - ($opt->{max_radius} / 47.5);
$lathigh = $origin->{lat} + ($opt->{max_radius} / 47.5);
$lnglow = $origin->{lng} - ($opt->{max_radius} / 47.5);
$lnghigh = $origin->{lng} + ($opt->{max_radius} / 47.5);
my $query = $opt->{query};
$query =~ s/{LATLOW}/$latlow/g;
$query =~ s/{LATHIGH}/$lathigh/g;
$query =~ s/{LNGLOW}/$lnglow/g;
$query =~ s/{LNGHIGH}/$lnghigh/g;
$query =~ s/{ORIGLAT}/$origin->{lat}/g;
$query =~ s/{ORIGLNG}/$origin->{lng}/g;
$query =~ s/{MAX_RADIUS}/$opt->{max_radius}/g;
my $stations = $ref->query({ sql => $query, hashref => 'stations' });
# ------------------------------------------------------------- ##
# Fill a hash with all addresses, keyed by distance --------------
my %closest = ();
my $n;
foreach my $station (@{$stations}) {
$n++;
my $dist = &distance($station->{$opt->{lat_field}},$station->{$opt->{lng_field}},$origin->{$opt->{origin_lat_field}},$origin->{$opt->{origin_lng_field}});
$station->{distance} = sprintf "%5.1f",$dist;
my $dkey = sprintf "%03d",$n;
$closest{"$dist$dkey"} = $station;
}
# ------------------------------------------------------------- ##
# Select results -------------------------------------------------
$n = 0;
my ($nearest,$furthest) = ($opt->{max_radius},0);
my $out;
foreach my $dist (sort {$a <=> $b} keys %closest) {
next if $dist < $opt->{min_radius};
last if $dist > $opt->{max_radius};
last if $n > $opt->{max_results};
if (!$opt->{offset_results} or ($n > $opt->{offset_results})) {
$out .= $Tag->uc_attr_list($closest{$dist}, $tpl);
$nearest = $dist if $dist < $nearest;
$furthest = $dist;
}
$n++;
}
if (!$n) {
$Tag->tmp('prox_furthest_dist',0);
$Tag->tmp('prox_nearest_dist',0);
$Tag->tmp('prox_display_count',0);
if ($opt->{noresults}) {
return $Tag->uc_attr_list($origin, $opt->{noresults})
} else {
return;
}
}
$out = $Tag->uc_attr_list($origin, $opt->{header}) . $out if $opt->{header};
$out .= $Tag->uc_attr_list($origin, $opt->{footer}) if $opt->{footer};
# ------------------------------------------------------------- ##
# Set side-effects -----------------------------------------------
$Tag->tmp('prox_furthest_dist',$furthest);
$Tag->tmp('prox_nearest_dist',$nearest);
$Tag->tmp('prox_display_count',$n);
# ------------------------------------------------------------- ##
return $out;
}
EOR
--
Chris Wenham - Synesmedia, Inc.
http://www.synesmedia.com
516-620-4110 / 1-888-255-7573
Fax: 516-908-7824
More information about the interchange-users
mailing list