Geographic Database Search Interfaces and the Equatorial Cylindrical Equidistant Projection

Ross S. Swick and Kenneth W. Knowles
University of Colorado
Cooperative Institute for Research in Environmental Science
   National Snow and Ice Data Center
   Campus Box 449
Boulder, CO 80309-0449
 

DRAFT  DRAFT  DRAFT  DRAFT  DRAFT

Please send comments, questions, corrections, etc... to: swick@chukchi.colorado.edu

Abstract

In the first stages of development geographic database search interfaces are commonly implemented using only the Equatorial Cylindrical Equidistant (ECE) projection primarily because it is a terribly easy projection to use.  This ease of use allows the developers to make several simplifying assumptions that cause problems at later stages of the development process when other projections are added to the mix.  Because the (lat, lon) coordinates of the ECE projection are just scalar multiples of the (x, y) coordinates the two often get conflated, resulting in an interface that is completely dependent on the ECE projection.  Either because they don't recognize this dependence, or because they feel it is too late to change, the developers often seek convoluted solutions to the problems "caused" by the new projections as they arise or are noticed.  The more direct approach is to be explicit about what projection is being used in every part of the interface, recognize the (x, y) coordinates for what they are, and deal with the true nature of this rather difficult task.  Once all parts of the system are aware that projections other than the ECE exist the problems become much less intractable, though no less confusing.  Motives, methods, and strategies for explicitly dealing with the projection are discussed.

Introduction

As databases become more sophisticated, Graphical User Interfaces (GUI) become easier to implement, and the sheer volume of data being made available becomes almost overwhelming, use of geographic keywords (e.g. "North Sea", "Kansas", and "Ross Ice Shelf") to search for geographic data is becoming less common. Increasingly, GUI tools used to search geographic databases offer a way to select the geographic area itself. Unless the data are stored with reference to the same map the user uses to select their geographic area of interest some error is inevitable.  To mitigate the effects of this error it is important to first correctly identify the causes.

Maps Background

When we speak of a map projection we are talking about some procedure or mathematical formula to transform a curved surface into a plane. A map projection is a one to one mapping (in the mathematical sense) of points on the curved surface to points on the plane. The curved surface is usually the surface of the Earth and the plane is what we call a "map." 1
For the purposes of this paper the important thing to keep in mind is that anytime a three dimensional object (e.g. the Earth) is represented as a two dimensional object (e.g. a map, or an image) the two dimensional object is a projection of some kind.

There is no one projection that is best for all purposes.  When projecting a map, image, or other data, there are certain tradeoffs.  Users of polar data may prefer a polar projection while users of tropical data may prefer an equatorial projection.  If the area of certain features is deemed important (e.g. comparing the area covered by sea ice to open water area) users may want an equal-area projection.  If the shape of certain features is deemed more important (e.g. for navigation) users may want a conformal projection.

Database GUI Background

Historically databases have been rather difficult to access.  Menu driven interfaces based on 4GL or QBF were marginally better than straight command line SQL but still not terribly user friendly and usually required a trained "operator".  Having an operator was a mixed blessing; while the operator could compensate for certain limitations in the database search tools, the mere fact that the user of the data had no direct access to the database limited usage.  Obtaining data was generally accomplished via the phone or the mails.  The operator would enter the search criteria and then serve as a post search filter to weed out erroneous results.  The whole process was time consuming and expensive with the end result being that access to the data was effectively restricted to the rich or the persistent.  Even user friendly X and windows interfaces often had to be run physically close to the database thus restricting access to the near.  The data providers were no happier about this than the users but such was the state of the technology.

The recent explosion of the web has made it possible for data providers to relieve some of their workload while also increasing access to the data.  Languages and Application Programming Interfaces like HTML, CGI, Perl, Java, and JDBC have made it possible to develop user friendly tools that can be used by virtually anyone to search and order the data from virtually anywhere in the world.  More users receiving more data is obviously the goal of any data provider, but the downside is that the new "operators" of the tools lack the training to compensate for any limitations the tools may have.  Consequently the tools themselves have to be more robust.

Because of their complex nature web based search and order tools are usually designed and implemented by a team.  At a minimum this team usually includes a Database Administrator and a GUI programmer implementing the "back end" (the server and database) and the "front end" (the interface itself or "client") of the interface respectively.  When dealing with geographical data a good interface will include some kind of spatial selection screen that allows the user to select their geographical area of interest.  Since it is often the case that no one on the design team has any particular expertise in the area of map projections, and the data are rarely in the same projection as the map used to describe the user's area of interest, this piece of the interface is rarely as good as it could be.

Developing the interface

In general database search and order tools designed for geographic data will include a geographic area selection screen or module. Many times this takes the form of text fields into which the user is expected to enter the lat/lon extremes of their area of interest. In more sophisticated tools a map is presented on which the user can draw a rectangle or other shape.2 Usually this map is an Equatorial Cylindrical Equidistant (ECE) projection so drawing a rectangle on it is functionally equivalent to entering lat/lon extremes into text fields labeled "North Lat.", "South Lat.", "West Lon.", and "East Lon".  Indeed these text fields often appear on the screen somewhere near the map (Figures 1-3).
 
Master Environmental Library - Spatial Selection
Figure 1:  The Spatial tab of the Master Environmental Library java applet (http://www-mel.nrlmry.navy.mil)
 
IMSWWW - Geographic Coverage
Figure 2:  The Geographic Coverage section of v1.6 of the IMSWWW Gateway (http://harp.gsfc.nasa.gov/~imswww/pub/imswelcome/imswwwsites.html). This interface actually has several maps ("Coastlines", "Atlas", and "Elevation") all of which are in the ECE projection. Later releases (v1.7 and above) include Polar Stereographic projections and (v1.9 and above) Orthographic projections on which user defined rectangles are converted to corner points and sent to the servers as polygons instead of lat/lon extremes.  Most of the servers then convert the polygon to a set of lat/lon extremes, but at least they have options.
 
JEST - Spatial Specification
Figure 3:  The Java Earth Science Tool (JEST) Spatial Specification applet (http://epsun.gsfc.nasa.gov:3000/jest/cgi-bin/ClWbJtStart). This tool has not been released yet and polar projections will be included by the time it is released. Note, however, that polar projections are being added late in the development process - so the development history has still followed the same basic outline described below, just without any interim releases. 
 
The ECE projection is of little practical value but its use in GUIs is prevalent because the transformation functions from (x, y) to (lat, lon) and back are by far the easiest. The ECE projection is the only projection in which all the trigonometry disappears (things like cos(0) and sin(90)) and the transformation functions become straightforward and linear.  Using a left handed coordinate system (with the origin in the upper left corner, y positive downward, x positive to the right), which is the pixel coordinate system of the computer screen, each transformation is just a scale and an offset: longitude is a linear function of x alone and latitude is a linear function of y alone.

This is all fine and good. The problem is that developers often do not understand the limitations of the ECE projection. While the simplicity of these transformations makes this projection attractive to programmers with limited cartographic experience the projection itself offers little utility apart from ease of use. The ECE projection is neither conformal nor equal-area so neither angles nor areas are proportional in this projection. And, while distances are proportional along the meridians and the equator they are not proportional along any diagonal nor any parallel other than the equator.
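As a minimal sketch of how simple these transformations are, the Python below assumes a whole-globe ECE map that is `width` by `height` pixels, centered on the Greenwich meridian, with the pixel origin in the upper left corner. The function names and the whole-globe assumption are illustrative and are not taken from any of the interfaces described here.

```python
def pixel_to_latlon(x, y, width, height):
    """Convert a pixel (x, y) on a whole-globe ECE map to (lat, lon) in degrees.

    Assumes the origin is the upper left corner, x grows to the right,
    y grows downward, and the map spans lon -180..180 and lat 90..-90.
    """
    lon = x * 360.0 / width - 180.0
    lat = 90.0 - y * 180.0 / height
    return lat, lon


def latlon_to_pixel(lat, lon, width, height):
    """Inverse of pixel_to_latlon under the same assumptions."""
    x = (lon + 180.0) * width / 360.0
    y = (90.0 - lat) * height / 180.0
    return x, y


# The center of a 720 x 360 pixel map is the intersection of the
# equator and the Greenwich meridian.
center = pixel_to_latlon(360, 180, 720, 360)
```

Because each output depends on only one input, an axis-aligned rectangle in pixel space is always an axis-aligned rectangle in (lat, lon), which is exactly the property that makes the two coordinate systems so easy to confuse.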
The projection originated probably with Eratosthenes (275?-195? B.C.), the scientist and geographer noted for his fairly accurate measure of the size of the Earth.  Claudius Ptolemy credited Marinus of Tyre with the invention about A.D. 100 stating that, while Marinus had previously evaluated existing projections, the latter had chosen "a manner of representing the distances which gives the worst results of all." 3

Sources of Error

Data are archived in a wide variety of formats from raw data to geolocated/geocorrected "scenes" to extensively processed data stored in "grids."  Most of these data are two dimensional images of a three dimensional Earth and consequently are "projected".  Even data that have not been remotely sensed (e.g., in-situ station data) are often converted from tabular format to gridded format as a visualization aid.

And just as the data formats vary greatly so do the projections the data are stored in.  The choice of projection is usually driven by the data.  For Global Circulation Models (GCMs) the ECE projection is often the projection of choice precisely because the transformations are so simple and there is enough to worry about just modeling a chaotic system in the first place.  For sea ice data, however, a polar projection is more appropriate.  And some satellite data is archived in a "satellite view" projection that approximates what the Earth really looks like from the satellite.

In other cases the choice of projection is driven by the user.  For example, a scientist studying a specific weather anomaly may want satellite data in an ECE projection because that is the projection the GCM is in.  Or the captain of an icebreaker may want sea ice data in a Mercator projection to overlay on his navigational charts. And once these data are used for their intended purpose they are archived and made available for unintended purposes.

The point, of course, is that data are stored in many different projections while the user is restricted to just the ECE.  If the data don't happen to be in the ECE projection the search algorithm that compares the user selected area of interest and the data coverage area is handicapped. The algorithm must look for overlap between these two areas with no common frame of reference. The easiest solution, from a programming perspective, is to convert the coverage information for the data into an orthonormal rectangle (lat/lon extremes) on an ECE projection and do the comparison there.
 

Types of Error

Programmers are attracted to the ECE projection because the transformations are so simple. But because of this simplicity the development team often forgets the distinction between (x, y) and (lat, lon).  Any orthonormal rectangle drawn on the ECE projection can be specified using only the lat/lon extremes (four numbers) instead of the corner points (eight numbers), and a user defined (orthonormal) rectangle on an ECE projection is always equivalent to a set of lat/lon extremes.  This leads to the concept of the lat/lon bounding box as the method for describing the user's area of interest.  And since it is the lat/lon extremes of the user defined area of interest that are being compared to the data coverage area it is commonly the lat/lon extremes of the data that get stored in the database as coverage information.

The data, however, are rarely well behaved.  Except for data that have been specifically processed to be in the ECE projection the actual coverage of the data will rarely be the same as the area described by the lat/lon extremes.  Similarly the user's area of interest, if drawn on a projection other than the ECE projection, will rarely be the same as the area described by the lat/lon extremes.

To make matters worse it is often the case that the lat/lon "extremes" of the bounding box are determined by checking just the corner points of the target area. While the extremal lats and extremal lons are guaranteed to be at a corner point of any orthonormal rectangle drawn on an ECE projection there is no such guarantee for any other projection.4  On any other projection the extremes could be at a corner point, along any edge, or even within the rectangle.  Thus if only the corner points are checked the resulting lat/lon bounding box could easily end up being completely inaccurate when projections other than the ECE projection are used.  This means that if the user's area of interest intersects a portion of the data coverage area that is not part of the inaccurate lat/lon bounding box (or vice versa) the data will not be returned.  This is an exclusion error or "false negative" (Figure 4).

On the other hand, if the lat/lon bounding box is truly extreme, it will always be as large or larger than the area it delimits.  This means that if the user's area of interest intersects a portion of the lat/lon bounding box that the data do not actually cover (or vice versa) the data will be returned anyway.  This is an inclusion error or "false positive" (Figure 5).

False Negatives

False negatives are considered the worse of the two types of error.  The reason is that if the user is presented with too much data they can weed out what they don't want on their own, but if they are not presented with the data in the first place their options are limited.  False negatives are also usually the first type of error the developers detect because the effects can be rather dramatic.  Fortunately eliminating false negatives is relatively easy, but unfortunately eliminating false negatives exacerbates the problem of false positives.
 
North Polar Stereographic - orthonormal rectangle
Equatorial Cylindrical Equidistant - orthonormal lat/lon bounding box
Figure 4:  An orthonormal rectangle (bright green) on a North Polar Stereographic projection (left) is neither orthonormal nor a rectangle on an ECE projection (right).  If the lat/lon extremes of the bounding box (dark green) are determined by checking just the corner points of the target area it can easily be completely inaccurate and not a "bounding" box at all.  This can result in false negatives because any data that are inside the area outlined in bright green but outside the area outlined in dark green (e.g., the polar region) will not be returned by the search algorithm.  On the other hand data that are inside the area outlined in dark green but outside the area outlined in bright green (e.g., the U.K. and northern Europe) will be returned as a false positive.
 
To eliminate false negatives the developers merely have to be sure their lat/lon bounding box is truly extreme.  This means that in projections other than the ECE they have to check more than just the corner points of the area.  On the data side the actual coverage of the data is usually fairly well known.  For example in the National Snow and Ice Data Center's Northern Hemisphere EASE-Grid Weekly Snow Cover and Sea Ice Extent data set every granule of the data set is in the same grid and has the same coverage (Lat: {0, 90}, Lon: {-180, 180}) so making sure the lat/lon bounding box is extreme only has to be done once for the entire data set.

For other, less processed, data sets the coverage may not be so well known.  For example satellite "scenes" are often defined by the latitude extremes at nadir and while the coverage is roughly rectangular on the surface of the Earth it is neither orthonormal nor a rectangle on an ECE projection and the actual lat extremes are not the lat extremes at nadir.  Aircraft, ship, and submarine data is even worse since it tends to be a long, somewhat erratic,5 track with no predictable coverage area.  In those cases the data provider does have to do some work to maximize the lat/lon bounding box.  But since this is happening while the data is being ingested into the database, and there is no user waiting for their search results in the interim, the developers have the luxury of being able to check every point in the data.  This method is time consuming and CPU intensive but since it can be done overnight it isn't considered a terrible burden.

It isn't until the developers start to add new projections to the interface, and start trying to convert a user defined rectangle on one of the new projections into a lat/lon bounding box on an ECE projection, while the user is waiting for their search results, that speed becomes an issue.  At this point the developers realize that while the extremes aren't necessarily at the corners in other projections there are still some guarantees.  For the commonly used projections the lat/lon extremes will always be on an edge unless the extreme in question is the international date line or one of the poles.6   This realization greatly reduces the number of points that have to be checked and false negatives can be completely eliminated by maximizing the lat/lon bounding box for both areas.
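The difference between checking only the corners and checking the edges can be sketched as follows. This is an illustration under stated assumptions, not the method of any particular interface: it assumes a spherical Earth of unit radius, a North Polar Stereographic plane with the pole at the origin and the Greenwich meridian running along the negative y axis, and a rectangle given as plane coordinates. All function names are hypothetical.

```python
import math


def inverse_nps(x, y):
    """Plane (x, y) in Earth radii -> (lat, lon) in degrees on a unit sphere."""
    rho = math.hypot(x, y)
    lat = 90.0 - 2.0 * math.degrees(math.atan(rho / 2.0))
    lon = math.degrees(math.atan2(x, -y))
    return lat, lon


def latlon_extremes(points):
    """Return (south, north, west, east) over a list of plane points."""
    lats, lons = zip(*(inverse_nps(x, y) for x, y in points))
    return min(lats), max(lats), min(lons), max(lons)


def bbox_corners_only(xmin, ymin, xmax, ymax):
    """The shortcut: look at the four corner points and nothing else."""
    corners = [(xmin, ymin), (xmin, ymax), (xmax, ymin), (xmax, ymax)]
    return latlon_extremes(corners)


def bbox_maximized(xmin, ymin, xmax, ymax, n=50):
    """Sample every edge, then handle the pole and the date line explicitly."""
    points = []
    for i in range(n + 1):
        t = i / n
        x = xmin + t * (xmax - xmin)
        y = ymin + t * (ymax - ymin)
        points += [(x, ymin), (x, ymax), (xmin, y), (xmax, y)]
    south, north, west, east = latlon_extremes(points)

    contains_pole = xmin <= 0.0 <= xmax and ymin <= 0.0 <= ymax
    # With this orientation the date line is the positive y axis.
    crosses_date_line = xmin <= 0.0 <= xmax and ymax > 0.0
    if contains_pole:
        north = 90.0
    if contains_pole or crosses_date_line:
        west, east = -180.0, 180.0
    return south, north, west, east


# A square centered on the pole: the corner check misses the pole entirely.
corner_box = bbox_corners_only(-0.5, -0.5, 0.5, 0.5)
full_box = bbox_maximized(-0.5, -0.5, 0.5, 0.5)
```

For the pole-centered square all four corners sit at the same latitude, so the corner-only "box" collapses to a single parallel, while the maximized box reaches the pole and spans every longitude. The number of edge samples `n` is an arbitrary choice here; a real implementation would pick it from the resolution it needs.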
 

False Positives

As mentioned above, eliminating false negatives by maximizing the lat/lon bounding box exacerbates the problem of false positives.  While the lat/lon bounding box grows larger the area it bounds remains the same, which increases the amount of "empty space", which increases the likelihood of a false positive. Because there are two different areas involved for which a lat/lon bounding box is constructed (the user's area of interest and the data coverage area) there are two potential sources of empty space producing false positives.  There is no practical way to completely eliminate false positives but there are ways to minimize the effects.

Since a user defined orthonormal rectangle on an ECE projection is identical to the lat/lon bounding box there is no empty space.  When other projections begin to be added to the interface the user defined rectangle remains orthonormal relative to the window coordinates (x, y) but is no longer orthonormal relative to the map coordinates (lat, lon).  This means conversion to a lat/lon bounding box creates a search area that is larger, and in some cases much larger, than the area specified by the user (Figure 5).
 

Icelandic Low on a North Polar Stereographic
Icelandic Low on an ECE
Figure 5:  A user interested in cyclonic activity in the Icelandic Low and surrounding regions can get back results that are over 75 percent false positives because their area of interest (bright blue) drawn on a North Polar Stereographic projection (left) is converted to a lat/lon bounding box (dark blue) on an ECE projection (right) that is over four times larger.
 
What generally happens, as new projections are added, is the interface is modified to support the lat/lon bounding box paradigm. Initially areas drawn on the new projections will quietly be converted to lat/lon extremes before being sent to the server. Once the realization sets in that this area can be quite different from the area the user actually drew the interface is often modified to accurately draw the area that will be sent using rhumb-lines (lines of constant bearing) instead of straight lines.  Depending on the projection being used the user defined "rectangles" will then have curved edges, or odd angles, which makes it difficult for the user to define their area of interest until they get used to this new definition of a "rectangle" (Figure 6).
 
ECE Rectangle
Orthographic Rectangle
Polar Stereographic Rectangle
Mollweide Rectangle
Figure 6:  A rectangle (bright blue) drawn on an ECE projection (upper left) and the same "rectangle" on an Orthographic projection (upper right), a North Polar Stereographic projection (lower left), and a Mollweide projection (lower right).
 
Having gone through this progression over a number of months or years the lat/lon bounding box paradigm becomes second nature to the development team and they often have difficulty understanding the user's confusion. After all, in the rather moderate examples shown above the "rectangles" are almost rectangular. If the objection is just one of semantics then it's a simple matter to go through the documentation and change all occurrences of the word "rectangle" to the phrase "lat/lon bounding box". And if one objects to the word "box" that too can be changed.

Unfortunately the objection isn't merely semantic; the main problem is usability. The user doesn't necessarily know, or need to know, what a lat/lon bounding box is, so the fact that it's a different shape on each projection is confusing. Indeed, for most projections the lat/lon bounding box is a different shape depending on where it is on the map. This is something the users could get used to but many won't take the time.

There is also the matter of programming effort.  In most graphical languages it is a relatively simple matter to draw an orthonormal rectangle, but to draw these odd shaped lat/lon bounding boxes requires some extra effort. Ironically the reason the lat/lon bounding box paradigm is adopted in the first place is because it makes the development effort simpler. Now, however, the developers find themselves putting in a lot of extra effort in order to maintain that simplicity without actually solving the problem.

While the developers are able to dictate that the user's area of interest be a lat/lon bounding box, they cannot enforce a similar restriction on the data coverage area. So when the data coverage area is converted to a lat/lon bounding box there will be empty space and, in some cases, quite a lot of it. Consider, for example, a polar orbiting satellite for which the data have been processed into eighth-orbit scenes. The latitude extremes for an eighth orbit that includes the north pole will be (about) 43 and 90. And because the scene includes the pole the longitude extremes will be -180 and 180. Consequently the coverage area, when converted to a lat/lon bounding box, will be Lat: {43, 90}, Lon: {-180, 180} or simply "everything north of 43 degrees". Even a sensor with a fairly wide swath will cover at most a fourth of this area which means a user looking for data in the area of the Great Lakes will get back data that covers northern Russia (Figure 7).

 

Russian Eighth
Figure 7:  A user searching for data covering the Great Lakes (bright blue) gets back an eighth orbit covering northern Russia (bright red) because the coverage information for the eighth orbit is converted to lat/lon extremes (dark red).  The full orbit is represented by the three black lines and as you can see the descending pass of the same orbit does cover the Great Lakes.
 
The example shown in Figure 7 is an eighth-orbit of the OLS sensor on board the DMSP F14 satellite. This satellite completes 14 orbits per day, resulting in 28 eighth-orbit scenes which touch the north pole, only 2-4 of which cover the Great Lakes region. However every eighth-orbit that touches the north pole will have the same lat/lon bounding box. So for this search 85-92 percent of the results will be false positives.  While it's true that the user can, at this point, eliminate the false positives on their own they are much more likely to just give up once they realize the search results are so poor.  The end result is a frustrated user who still doesn't have the data they want.
 
A common objection by those who have succumbed to the lat/lon bounding box paradigm is that (lat, lon) coordinates are "on the Earth" so by using lat/lon extremes to describe the two areas the developers feel they are being more general; using the Earth's coordinate system means they are not bound to any specific projection. This might be true if the area comparisons were then done on a sphere but usually the comparisons are done using this set of boolean expressions on the lat/lon extremes of the two areas (Area Of Interest vs. Data Coverage Area):
if(AOISouth > DCANorth) eliminate
else if(AOINorth < DCASouth) eliminate
else if(AOIWest > DCAEast) eliminate
else if(AOIEast < DCAWest) eliminate
else accept
While attractive for its simplicity this only works because the ECE projection is cartesian.  Moreover, by referring to these numbers as "North", "South", "East", and "West" the developers mask what's really going on and confuse themselves. These are actually (x, y) coordinates and only appear to be (lat, lon) coordinates because the ECE projection is cartesian.  The real comparison being carried out is a comparison of x/y extremes.
if(AOIMinimumY > DCAMaximumY) eliminate
else if(AOIMaximumY < DCAMinimumY) eliminate
else if(AOIMinimumX > DCAMaximumX) eliminate
else if(AOIMaximumX < DCAMinimumX) eliminate
else accept
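The comparison above can be written as a small runnable function. The sketch below is one possible rendering in Python; the `Box` type and the function name are illustrative, and the Great Lakes extremes are rough values chosen only for the example.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned extremes in whatever cartesian (x, y) system is in use."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def boxes_overlap(aoi: Box, dca: Box) -> bool:
    """True unless one box lies entirely to one side of the other."""
    if aoi.min_y > dca.max_y:
        return False
    if aoi.max_y < dca.min_y:
        return False
    if aoi.min_x > dca.max_x:
        return False
    if aoi.max_x < dca.min_x:
        return False
    return True


# On an ECE map x is longitude and y is latitude, so the lat/lon extremes
# can be dropped straight in. The eighth-orbit box is the one from the text;
# the Great Lakes box is approximate.
great_lakes = Box(min_x=-93.0, min_y=41.0, max_x=-76.0, max_y=49.0)
polar_eighth_orbit = Box(min_x=-180.0, min_y=43.0, max_x=180.0, max_y=90.0)
accepted = boxes_overlap(great_lakes, polar_eighth_orbit)
```

Nothing in `boxes_overlap` knows or cares that the numbers are latitudes and longitudes. It accepts the polar eighth orbit for a Great Lakes search simply because the two rectangles intersect on the ECE plane.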

Possible solutions

Developers are initially drawn to the ECE projection because the transformation functions are so simple.  And they are initially drawn to the lat/lon bounding box as the preferred method of describing a geographic area because it makes comparing two areas relatively quick and easy.  Both the simplicity of the transformations and the ease of the area comparisons are a result of the fact that the ECE projection is cartesian.  The insidious thing about this method is it works fairly well for much of the data in the tropics and the mid-latitudes. Even in the eighth orbit example given above, where the search returns 28 results that touch the pole for every 2-4 valid ones, the search eliminates some 84 eighth orbit scenes that are completely invalid.  So 75 percent of the data gets eliminated relatively quickly.  Consequently solving the problem probably doesn't require changing the lat/lon bounding box methodology but augmenting it.
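The percentages quoted for the eighth-orbit example follow from a few lines of arithmetic. The sketch below assumes eight scenes per orbit and two pole-touching scenes per orbit (one arriving at the pole and one leaving it), which is what 14 orbits and 28 polar scenes per day imply; the variable names are illustrative.

```python
ORBITS_PER_DAY = 14
EIGHTHS_PER_ORBIT = 8
POLAR_EIGHTHS_PER_ORBIT = 2  # assumption: one scene into the pole, one out

total_scenes = ORBITS_PER_DAY * EIGHTHS_PER_ORBIT      # 112 scenes per day
returned = ORBITS_PER_DAY * POLAR_EIGHTHS_PER_ORBIT    # 28 share the polar box
eliminated = total_scenes - returned                   # 84 rejected cheaply
eliminated_fraction = eliminated / total_scenes        # 0.75


def false_positive_fraction(valid_scenes):
    """Share of the returned scenes that do not cover the area of interest."""
    return (returned - valid_scenes) / returned


best_case = false_positive_fraction(4)    # 24 of 28 returned scenes are wrong
worst_case = false_positive_fraction(2)   # 26 of 28 returned scenes are wrong
```

The two fractions come out at a little under 0.86 and a little under 0.93, in line with the 85-92 percent range quoted earlier, while the cheap comparison still discards three quarters of the day's scenes.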

The really bad cases are at, or near, the poles which, unless they work someplace like the National Snow and Ice Data Center, the developers probably aren't interested in.  Indeed these problems often go unnoticed until the data provider starts working with satellite data from a polar orbiting satellite.  At that point the data provider is usually only interested in the tropical data gathered by the satellite but feels they may as well archive everything.  But when things start to go awry the solution is often to simply cut off the coverage at some arbitrary latitude (like plus or minus 60 degrees).  This is all fine and good if the data provider's user community doesn't care.  And while the problems outlined above are still present they aren't so severe in the lower latitudes.

Currently, however, there is a trend toward attempting to build a single interface to search geographic data archived by multiple data providers.7  The idea is to make access to the data easier for everybody while also encouraging new interdisciplinary research by making a wider variety of data available to researchers who may find an interesting tie-in.  It's a laudable goal but it does mean the user community is greatly expanded and at least part of that community is likely to be interested in polar data.

These larger efforts tend to be guided by the experience of the people involved who have developed similar interfaces in the past, and the vast majority of these people have experience using the ECE projection and the lat/lon bounding box.  But because the user community has expanded to include the polar community the lat/lon bounding box paradigm is no longer "good enough" and the problems outlined above have to be addressed.

Unfortunately these larger efforts are even more motivated to stay with the lat/lon bounding box paradigm.  Apart from the ease of the programming involved the lat/lon bounding box paradigm has the unique advantage of not requiring the server to know anything about map projections whatsoever.  Since these larger efforts tend to focus on building a single client to interface with multiple pre-existing servers and databases the costs associated with modifying the server(s) are multiplied. It would be nice if these pre-existing servers and databases could remain as they are so these multiple costs could be avoided.  While in certain cases this may be possible via a bit of creative programming on the client side, for the reasons outlined above it is not generally a good idea.

In breaking from the lat/lon bounding box paradigm, while still trying to avoid modifying the servers, exotic solutions are often sought. Methods commonly explored include: quad trees, tiling, orbit models, and point by point comparisons. While these are all good ideas, and in some cases may be warranted, the only one that actually addresses the root of the problem is point by point comparison and that method is impractical.

There are really two closely related questions here which often get conflated:

  1. How are the two areas described?
  2. How are the two areas compared?

Orbit models will only work for orbit data and even then they are data set, or at least satellite, specific; so many models will be required.  Quad tree and tiling schemes only address the second question in that they are simply faster methods of comparing the two areas.  But if the areas are poorly described in the first place (e.g. described on an ECE projection no matter what projection they are actually on) they will simply be faster methods of producing the same poor results.  And point by point comparisons are completely impractical in that one can't actually send every point in the user's area of interest to the server, and one can't actually store every point covered by every data granule (smallest archived piece of data) in the database.

The more moderate solution is to reexamine the paradigm itself.  What happens when the area comparisons are done using the first set of boolean expressions noted above is that both areas (the user's area of interest and the data coverage area) are tacitly converted to the ECE projection.  Since both lat/lon bounding boxes have been maximized to prevent false negatives this results in two sources of error producing false positives: error caused by converting the user's area of interest to the ECE projection and error caused by converting the data coverage area to the ECE projection.  At least one of these sources of error can be reduced significantly, or entirely eliminated, by not doing the conversion.
 
The ECE projection is attractive because the (lat, lon) coordinates are cartesian.  But in every projection the (x, y) coordinates are cartesian, so the second set of boolean expressions noted above will work in any projection.  This is a fundamental shift in the paradigm that has important ramifications; instead of the lat/lon bounding box consider using the x/y bounding box.  In the ECE projection the two bounding boxes are identical but on every other projection they are different.  By shifting the paradigm to the x/y bounding box the developer becomes free to do the area comparisons on any projection whatsoever.  This means that instead of converting both areas to the ECE projection the developers can convert just one area to the projection the other area is already on.

At this point the developers have three options:

  1. Convert the data coverage area to the user's preferred projection and do the area comparisons there.
  2. Convert the user's area of interest to the projection the data is in and do the area comparisons there.
  3. Do something else.

Costs

Before examining these options it is important to be clear about what is involved.  Converting either area to the projection the other area is in can be costly.  Finding the x/y extremes of an orthonormal rectangle on a given projection is relatively easy since the corner points are just {(maxX, maxY), (minX, maxY), (minX, minY), (maxX, minY)}.  But finding the x/y extremes for the same area in another projection is quite costly since the area becomes warped in the new projection.  Consequently one has to check every point along every edge of the area by converting (x, y) in the first projection to (lat, lon), converting that (lat, lon) coordinate to (x, y) in the second projection, and adjusting the x/y extremes appropriately.  For even a moderately sized area this can involve several hundred computations per data granule and if the user requests many granules per data set the performance of the algorithm can end up being rather poor.
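The edge walk described above might look like the following sketch.  It assumes a unit sphere, a spherical North Polar Stereographic projection with central longitude 0, and an ECE projection whose (x, y) are simply (lon, lat) in degrees; all function names are hypothetical and a real server would use whatever map transformation code it already has.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def ece_inverse(x: float, y: float) -> Point:
    """ECE (x, y) -> (lat, lon): x is lon and y is lat, in degrees."""
    return y, x


def nps_forward(lat: float, lon: float) -> Point:
    """(lat, lon) in degrees -> spherical North Polar Stereographic (x, y)."""
    k = 2.0 * math.tan(math.pi / 4.0 - math.radians(lat) / 2.0)
    lam = math.radians(lon)
    return k * math.sin(lam), -k * math.cos(lam)


def reproject_box(box: Box,
                  inverse_a: Callable[[float, float], Point],
                  forward_b: Callable[[float, float], Point],
                  points_per_edge: int = 100) -> Box:
    """Walk every edge of a box on projection A and return the x/y
    extremes of the warped outline on projection B."""
    min_x, min_y, max_x, max_y = box
    corners = [(max_x, max_y), (min_x, max_y), (min_x, min_y), (max_x, min_y)]
    out = [math.inf, math.inf, -math.inf, -math.inf]
    for i in range(4):
        (x0, y0), (x1, y1) = corners[i], corners[(i + 1) % 4]
        for step in range(points_per_edge + 1):
            t = step / points_per_edge
            # (x, y) on A -> (lat, lon) -> (x, y) on B
            lat, lon = inverse_a(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            x, y = forward_b(lat, lon)
            out[0], out[1] = min(out[0], x), min(out[1], y)
            out[2], out[3] = max(out[2], x), max(out[3], y)
    return out[0], out[1], out[2], out[3]
```

For an ECE box spanning longitudes -90 to 90 and latitudes 60 to 80, checking only the four corners (points_per_edge=1) misses the southernmost y value entirely, because that extreme falls in the middle of the lat 60 edge rather than at a corner.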

"Poor", of course, is a relative term.  The extra computations happen on the server side so the developers control what machine they run on.  Typically data providers will put the database and the server on a fairly fast unix box like an SGI Challenge, Sun SparcStation, or a DEC Alpha, and on that class of computer initializing the map and performing five hundred (x, y) to (lat, lon) computations typically takes less than two hundredths of a second.  So if the search has to check five hundred granules it will take ten seconds longer.  Whether this additional time is a major performance hit depends on how long the rest of the search algorithm typically takes.
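The arithmetic behind that estimate is simple enough to write down.  The function name is hypothetical and the 0.02 second figure is just the upper bound quoted above, so the result is an estimate and not a measurement.

```python
def extra_search_seconds(granules: int,
                         seconds_per_granule: float = 0.02) -> float:
    """Added search time if every granule needs one x/y bounding box
    conversion costing seconds_per_granule (upper bound quoted above)."""
    return granules * seconds_per_granule


# 500 granules at 0.02 s each is about 10 extra seconds.
print(extra_search_seconds(500))
```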

One factor that helps speed things up a bit is that we haven't eliminated the original lat/lon bounding box methodology which is fairly cheap and fairly effective much of the time.  In cases where the lat/lon bounding box is effective the only data coverage areas that get checked by this new portion of the search algorithm are areas for data that will be accepted, so the additional cost is minimized.  And even in cases where the lat/lon bounding box is inefficient it still manages to eliminate a large number of candidates, so the expensive part of the algorithm isn't wasting a lot of time on ridiculously mismatched data.

Still, using the x/y bounding box will slow down the search.  Since there are cases where the lat/lon bounding box methodology is completely inadequate some kind of secondary filter is necessary.  But since there are also cases where the lat/lon bounding box methodology is "good enough" the second, slower, filter is not always necessary.  It might be prudent, therefore, to let the user decide if they need the second filter or not.  A pair of toggles on the interface labeled "fast" (default) and "accurate" should suffice.  Users who are interested in tropical data, an area in which the lat/lon bounding box works quite well, can use the "fast" option and get their results quickly.  But when users interested in polar data get back incredibly poor results using the "fast" option they can activate the secondary filter by clicking the "accurate" toggle and go read the paper while the search is performed.
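A two-stage search with such a toggle could be arranged along these lines.  This is a sketch only: the function and parameter names are hypothetical, and the two tests are passed in as callables because their details depend on the projections involved.

```python
from typing import Callable, Iterable, List, TypeVar

Area = TypeVar("Area")
Granule = TypeVar("Granule")


def search(granules: Iterable[Granule],
           aoi: Area,
           latlon_test: Callable[[Area, Granule], bool],
           xy_test: Callable[[Area, Granule], bool],
           accurate: bool = False) -> List[Granule]:
    """Cheap lat/lon bounding box filter first; the slower x/y filter
    runs only on the survivors, and only when "accurate" is selected."""
    candidates = [g for g in granules if latlon_test(aoi, g)]
    if not accurate:
        return candidates          # "fast" (default)
    return [g for g in candidates if xy_test(aoi, g)]
```

Because the second filter only ever sees granules that passed the first, its cost scales with the number of candidates rather than the size of the data set.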

Optionally one might let the user decide how "fast" or "accurate" they want their search to be by changing the pair of toggles to a slider with those labels on either end.  Determining the extremes of the bounding box would be faster if only every other point were checked.  But checking every other point is equivalent to checking every point on a map at half the resolution.  So the slider could simply determine the resolution of the map(s) used to compare the two areas.  The fastest searches only check the corner points while the most accurate searches check every point along every edge on a map at the highest practical resolution.  With a slider incorporated into the interface each user can find their own happy medium.
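One way to turn a slider position into a sampling density is sketched below.  The geometric mapping, the 512 interval ceiling, and both function names are assumptions made for illustration; any monotonic mapping from "corners only" to "highest practical resolution" would serve.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def intervals_per_edge(slider: float, max_intervals: int = 512) -> int:
    """Map a slider in [0.0 ("fast"), 1.0 ("accurate")] to the number of
    sampling intervals per edge.  0.0 gives 1 interval (corner points
    only); 1.0 gives max_intervals.  The scale is geometric, so equal
    slider steps multiply the resolution by equal factors."""
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0.0 and 1.0")
    return max(1, round(max_intervals ** slider))


def sample_edge(p0: Point, p1: Point, intervals: int) -> List[Point]:
    """Evenly spaced points from p0 to p1 inclusive."""
    return [(p0[0] + (p1[0] - p0[0]) * i / intervals,
             p0[1] + (p1[1] - p0[1]) * i / intervals)
            for i in range(intervals + 1)]
```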

There is also some programming cost involved.  The interface itself has only to be modified slightly to send the server some additional information.  Instead of just the lat/lon extremes of the user's area of interest (four numbers) the server will now need to know the corner points (eight numbers) and some information about the projection the user chose (eight more numbers and a string).  The interface already knows all this information and just needs to be tweaked a bit to send it to the server.
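The additional information might be packaged as in the sketch below.  The class and field names are hypothetical; the meaning of the eight projection numbers is left open here, so they are carried as an opaque tuple.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class AreaOfInterestRequest:
    """What the interface sends to the server for one search."""
    corners_xy: Tuple[Point, Point, Point, Point]  # four corners, eight numbers
    projection_name: str                           # the string
    projection_params: Tuple[float, ...]           # eight more numbers

    def __post_init__(self) -> None:
        if len(self.corners_xy) != 4:
            raise ValueError("expected four (x, y) corner points")
        if len(self.projection_params) != 8:
            raise ValueError("expected eight projection parameters")
```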

On the server side the programming effort required is a bit more than mere "tweaking".  Previously the server didn't need to know anything about maps because the information it got from both the interface and the database was already converted to the ECE projection.  Now, however, the server has to be able to do the area comparisons on a variety of projections so it needs to be able to do the conversions itself.   Consequently this approach requires incorporating all the map projections into the server and, as noted above, there may be multiple servers involved.  Fortunately, however, incorporating the map projections into the server(s) is not difficult.

While the transformation functions can be quite complex the code itself has already been written, and only needs to be integrated into the server.  Indeed, if the interface is already able to handle multiple projections, that transformation code is already in the interface and can be copied over to the server verbatim.  Otherwise there are a number of map transformation libraries available in a variety of languages.8  Consequently the only new code required in the server is the code that tells the area comparison algorithm how to perform this second step.  The exact nature of that second step in the area comparison algorithm remains to be explored.

Converting the DCA to the AOI

The main advantages of doing the area comparisons on the projection the user chose, rather than the projection the data are in, are twofold.  Firstly, there is the matter of preserving the user's area of interest.  Because it is not converted it remains exactly as the user specified it.  Since  the developers have little control over this part of the search this advantage is actually more important than it sounds.  Because the user's area of interest is unchanged rectangles remain rectangles and the user can draw any rectangle they want without adversely affecting the search (e.g.: a rectangle that includes the pole, or a long thin rectangle, or a long thin rectangle that includes the pole, etc.).

Secondly, the user chose that projection for some reason.  Possibly that choice was made just because it was the default projection, but savvy users will likely choose a projection based on the type of data they are interested in.  Consequently the data they are interested in is likely to be in the same projection or a similar projection.  For example a user interested in satellite data covering the Icelandic Low may choose a North Polar Stereographic projection while the data itself may be in an Orthographic projection centered at or near the north pole.  The difference between the two is slight which means the conversions will introduce only small amounts of error.

As a further example consider the eighth orbit scene over northern Russia discussed above.  Using only the lat/lon bounding box that scene got through as a false positive for a search on the Great Lakes region.  But if the user draws their area of interest on a more appropriate projection, and the area comparisons are done on that projection, the Russian eighth orbit doesn't pass (Figure 8).
 

[Figure 8 images: Russian Eighth on a North Polar Azimuthal (left); Russian Eighth on an Orthographic (right)]
Figure 8:  On an Azimuthal projection (left) centered on (90, -90) the user's area of interest (bright blue) and the x/y bounding box (dark red) for an eighth orbit scene covering northern Russia (bright red) aren't even close.  Similarly, on an Orthographic projection (right) centered near the user's area of interest, the two areas are quite a distance apart.
 
The disadvantages have primarily to do with what happens when the user's choice of projection is poor.  For example, if the user chooses the ECE projection just because it's the default, the second check adds nothing.  It has already been established that on the ECE projection the lat/lon bounding box and the x/y bounding box are the same box.  So for the eighth orbit example given above (Figure 7) where the data coverage area was converted to lat/lon extremes on the ECE projection, converting the data coverage area to x/y extremes on the ECE projection will describe exactly the same area and exactly the same incredibly poor search results will be returned.  That is, a user searching for satellite data covering the Great Lakes, and using the ECE projection to specify their area of interest, will still get back an eighth orbit covering northern Russia and extending to the North Pole because the x/y bounding box for that eighth orbit, on the ECE projection, intersects their area of interest.

One might chalk this up to user naiveté.  After all, had they used a polar projection the results would have been much better, so they shouldn't be using the ECE projection to search for polar data.  That might be convincing except for three things: First, the Great Lakes aren't usually considered "polar".  Second, the fact that the user gets different results depending on what projection they use is itself a problem.  The user isn't likely to understand why this happens and is more likely to just assume the search algorithm is unstable and untrustworthy and consequently stop using it.  And third, this is a graphical user interface - it is designed for the user.  The developers' job is to make it work, not to make excuses.

The more constructive approach is to try to provide some guidance.  For data that touches, or even comes close to, the pole a polar projection is a better bet and once the user's area of interest crosses 45 degrees latitude they are, sometimes unknowingly, in the polar region.  The documentation could and should mention that fact but unfortunately, in the age of user friendly interfaces, users rarely read the documentation.  So some visual clue that the user is straying into dangerous territory might be in order.  For example one might color the map green, yellow, and red to indicate the areas where the search algorithm will perform relatively well, moderately well, or not so well for that projection.  That, or one might just chuck the whole thing and go with option two.

Converting the AOI to the DCA

The main advantages of doing the area comparisons on the projection the data are in, rather than the projection the user chose, are twofold.  Firstly, there may be an opportunity to improve the performance of the search algorithm.  It is often the case that an entire data set is in the same projection and only the coverage area of the individual granules varies.  Indeed, in some cases the coverage area doesn't vary either.  In those cases the conversion of the area of interest into an x/y bounding box on the new projection would only have to be done once for an entire data set.

Secondly, even if the user is capricious about their choice of projection the data producers and data providers usually are not.  Generally the data are in the projection they're in because that projection is, in some way, most appropriate.  As mentioned above the choice of projection is sometimes driven by the users, but usually it is driven by the data.  Consequently the data coverage area is usually most well behaved (compact and tidy) in the data's "native" projection.

The main disadvantage is the data can't be relied on to fit nicely (orthonormally) even in the most appropriate projection.  In an Orthographic projection centered on the center of the data (Figure 9) the data coverage area for the Russian eighth orbit is indeed nearly a nice rectangle.  Unfortunately it's tilted because the y-axis of the projection is along the longitude of the center point of the data instead of along the satellite track.  Consequently an x/y bounding box has to be constructed even for the unconverted data coverage area.  If this is done during the search it will double the number of computations involved, but it could be done just once when the data are ingested into the database and the results stored in the database for the search algorithm to retrieve.

Either way the x/y bounding box will be larger than the data coverage area which increases the amount of "empty space" and the chances of a false positive.  Note, however, that in spite of this the Russian eighth orbit is still rejected by this second part of the algorithm.  Unfortunately there is still a rather large area the user might have been interested in for which this eighth orbit would have made it through as a false positive.  In other words the results will be better, but still not great.
 

[Figure 9 images: Russian eighth on an Orthographic projection (left); Russian eighth on a North Polar Stereographic projection (right)]
Figure 9:  On an Orthographic projection centered at the center of the data (left) the eighth orbit scene over northern Russia (bright red) doesn't fit nicely, so an x/y bounding box (dark red) is still needed.  In spite of the excess empty space the x/y bounding box (dark blue) for the user's area of interest (bright blue) still doesn't overlap.  If the map is rotated by changing the center longitude to the longitude that most nearly parallels the orbit track (right) the data coverage area is more nearly orthonormal and much of the excess empty space in the x/y bounding box is eliminated, thereby minimizing the chances of a false positive.
 
On the other hand, the data can be relied on to fit nicely (orthonormally) in the most appropriate projection if  "the most appropriate projection" is defined as "the projection in which the data fits nicely".  For the Russian eighth orbit example the data fit rather nicely on an Azimuthal projection (Figure 8) centered on (90, -90).  Alternatively it also fits rather nicely on an Orthographic projection if the center is adjusted slightly (Figure 9).
 

Doing Something Else

Another possibility is the developers could just "cheat" to make the lat/lon bounding box methodology work a bit better.  After all, the only reason the lat/lon bounding box for the Russian eighth orbit scene (Figure 7) is so wide on the ECE projection is it includes the pole on one edge of the data.  One might just declare that one point an "outlier" and adjust the bounding box accordingly.  With that adjustment the longitude range becomes approximately {-10, 170} and the data do not get returned as a false positive when the user's area of interest covers the Great Lakes.

This is actually not such a bad idea in this particular case.  Technically the new lat/lon bounding box no longer completely "bounds" the data coverage area which re-introduces the possibility of a false negative if the user's area of interest includes any of the points (90, {170, 180}) or (90, {-180, -10}) and no other points in the data coverage area. But this is a rather remote possibility.  And since the user is rather unlikely to be interested in just that tiny strip of data they probably wouldn't want it even if it were returned.  So the benefits, in this case, far outweigh the drawbacks.

But that's just in this case.  This "cheat" wouldn't work so well if the pole was deeper inside the data coverage area instead of just on the edge.  For example, consider a data set containing quarter orbits centered on the pole (Figure 10).  A user interested in the Great Lakes would get back every quarter orbit centered on the North Pole whether it covered their area or not.  So while in specific cases some "intelligent cheating" may significantly enhance the search algorithm, a more general method is still needed.
 

Quarter orbit centered on the North Pole
Figure 10:  A user searching for data covering the Great Lakes (bright blue) gets back a quarter orbit that doesn't include the Great Lakes (bright red) because the coverage information for the quarter orbit is converted to lat/lon extremes (dark red).  In this case there is no obvious way to "cheat" that doesn't significantly increase the risk of  false negatives.
 
The conversion from one projection to another causes false positives, but it will cause different false positives on different projections.  Most geographic area search algorithms already use the lat/lon bounding box methodology to do the area comparisons on the ECE projection.  The new part of the algorithm just adds a second check on some other projection using the x/y bounding box.  Perhaps it's not really necessary to key that second projection to either the user's area of interest or the data coverage area.  Perhaps the important thing is just that it be a different projection so false positives that pass the ECE part of the algorithm at least stand a chance of being eliminated during the second step.

For example one might key the second projection to the user's chosen projection unless the user chose the ECE.  Since the first step uses the lat/lon bounding box methodology to do the area comparison on the ECE projection, and the x/y bounding box on that projection is the exact same box, there's no benefit in doing it again.  So if the user chooses the ECE projection, and wants an "accurate" search, the second part of the algorithm could just pick some other projection.  This could be the projection most appropriate for the data or just an orthographic centered on their area of interest.
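That selection rule can be written down in a few lines.  The projection names are plain strings here and the function name is hypothetical; the default fallback stands in for "an orthographic centered on their area of interest".

```python
def second_projection(user_projection: str,
                      data_projection: str,
                      fallback: str = "Orthographic") -> str:
    """Pick the projection for the second (x/y bounding box) check.

    The first check is already an ECE comparison, so the second check
    is only useful on some projection other than the ECE.
    """
    if user_projection != "ECE":
        return user_projection      # key to the user's choice
    if data_projection != "ECE":
        return data_projection      # else use the data's own projection
    return fallback                 # else anything that is not the ECE
```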

Alternatively one might suspect the user chose the ECE projection not because it was the default, but because it is one of the few projections that shows the entire Earth instead of just one hemisphere.  One might even check to see if the user's area will fit on a hemispheric map or not.  If not then a "world view" projection is still needed for the second step of the area comparison algorithm, but it should be a "world view" projection that is quite different from the ECE projection.  A Sinusoidal or Mollweide projection should suffice.

 

[Figure 11 images, top row: Russian eighth on a Sinusoidal projection centered at (0, 0) (left); Russian eighth on a Mollweide projection centered at (0, 0) (right)]
[Figure 11 images, bottom row: Russian eighth on a Sinusoidal projection centered at (0, 87) (left); Russian eighth on a Mollweide projection centered on (0, 87) (right)]
Figure 11:  The Sinusoidal (left) and Mollweide (right) projections cover both hemispheres.  Here the same eighth orbit covering northern Russia (bright red) is illustrated along with its x/y bounding box (dark red), the user's area of interest over the Great Lakes (bright blue), and its x/y bounding box (dark blue).  In both cases there is less empty space in the x/y bounding box for the data coverage area, but more empty space in the x/y bounding box for the user's area of interest, when the projection is centered on the center longitude of the data (0, 68) (bottom) instead of (0, 0) (top) because the projections are less distorted nearer the center.
 
The main point is that results that pass through as false positives when tested on one projection may not pass on a second projection.  Similarly results that still pass through as false positives on a second projection may not pass on a third or fourth projection.  Another option, then, is to pre-compute the x/y bounding boxes for each data granule on a variety of different projections, store that information in the database, and do the area comparison multiple times.  Pre-computing the bounding boxes helps speed up the search algorithm, but it also increases the size of the database and at some point the law of diminishing returns kicks in, so the developers have to decide where their resources should be allocated.
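A multi-projection check over pre-computed boxes might be sketched as follows.  The dictionary layout, keyed by projection name with (min_x, min_y, max_x, max_y) tuples as values, is an assumption for illustration, as are the function names.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def boxes_overlap(a: Box, b: Box) -> bool:
    """Cartesian overlap test; both boxes on the same projection."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def passes_every_projection(aoi_boxes: Dict[str, Box],
                            granule_boxes: Dict[str, Box]) -> bool:
    """Keep a granule only if its box overlaps the area of interest on
    every projection for which both boxes are available.

    Each box is maximized, so a miss on any one projection proves the
    areas are disjoint and rejecting cannot cause a false negative.
    With no shared projection there is no evidence, so the granule is
    kept.
    """
    shared = aoi_boxes.keys() & granule_boxes.keys()
    return all(boxes_overlap(aoi_boxes[p], granule_boxes[p]) for p in shared)
```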
 

One Common Variant

One variation on the lat/lon bounding box strategy deserves a bit more discussion because it is often adopted as a "solution" to many of the problems discussed in this paper.  Realizing that lat/lon extremes describing an orthonormal rectangle on an ECE projection can sometimes cause excessive false positives developers will start using the (lat, lon) corner points to describe a polygon. Using a polygon instead of an orthonormal rectangle produces a bounding area with much less "empty space" and consequently reduces the number of false positives returned by the search.  Implemented properly this could be a step in the right direction but it doesn't go far enough.  Implemented improperly this could be a disaster.

The polygon strategy is usually implemented as a (lat, lon) polygon as a way to avoid having to be explicit about the projection.  But by neglecting the projection the developers perpetuate their dependence on the ECE projection and the problems caused by that dependence remain.

It has already been noted that using just the corner points (of either area) to determine the lat/lon extremes of the bounding box can have disastrous results.  Similarly using just the corner points of the area to determine the (lat, lon) corner points of a bounding polygon can end up describing a completely erroneous area (Figure 12).  Since the corner points of the bounding polygon cannot just be the corner points of the area some other set of points must be used.  But unlike the lat/lon extremes of the bounding box there is no clear way to decide what points to use as corner points of the bounding polygon.
 

[Figure 12 images: Corner point quadrilateral - eighth orbit (left); Corner point quadrilateral - quarter orbit (right)]
Figure 12:  Using only the corner points of the data to specify a quadrilateral on the ECE projection is, almost, sufficient for the eighth orbit example (left).  But for the quarter orbit example (right) the corner points alone describe a quadrilateral on an ECE projection that is completely inadequate as a representation of the data coverage area.
 
The easiest way to decide is to just use them all. In other words use every point on every edge of the target area.  Depending on the resolution of the map this would be hundreds, or even thousands, of points.  This would make the bounding polygon incredibly accurate but it isn't terribly practical to send hundreds of points from the client to the server or to store hundreds of points in the database for each and every data granule.  It might be practical, however, to create a bounding octagon using the corner points of the target area and the midpoints of every edge.   This produces somewhat better results for the examples discussed so far but still not terribly good results.   For the quarter orbit example the coverage of the octagon is especially poor in that it doesn't cover the pole, which is nearly the center of the area, and isn't even an octagon. Instead of an octagon one ends up with two triangles and a quadrilateral because some of the "sides" of the octagon cross (Figure 13).  To compensate for this one might produce a decagon which would take care of the pole, but still not cover the entire area, which may lead to false negatives.  So one might then produce a dodecagon to make sure the international dateline is covered.  And this will work for this example.

In general one has to make sure the "bounding" polygon is maximized so it truly is bounding. To do this one has to compare the area covered by the candidate polygon to the area it is meant to bound and make sure the latter is enclosed by the former. In other words one has to compare the two areas in order to determine if one of them is a good enough representation of the other in order to compare it to a third area which has similarly been determined to be a good enough representation of yet a fourth area.  This is absurd.
 

[Figure 13 images, top row: Quarter orbit - attempted octagon (left); Quarter orbit - decagon (right)]
[Figure 13 images, bottom row: Quarter orbit - dodecagon (left); Corner point quadrilateral - North Polar Stereographic (right)]
Figure 13:  For the quarter orbit example an attempt to construct an octagon (upper left) using the corner points and the midpoints of every side results in weirdness.  A decagon (upper right) specially constructed to include the pole still misses some of the data coverage area.  A dodecagon (lower left) constructed to include the international dateline covers the entire area but is concave, which is awkward.  Meanwhile a simple quadrilateral on a North Polar Stereographic projection (lower right) is convex, which is nice, and covers the entire area with little empty space, which is also nice.
 
As mentioned above the bounding polygon strategy is a step in the right direction but it doesn't go far enough.  By failing to be explicit about the projection the strategy is still tacitly relying on the ECE projection which causes problems.  Being explicit about the projection, however, makes the bounding polygon strategy less problematic because it is now an (x, y) bounding polygon.  Free to use any projection whatsoever the developers can pick the projection on which the target area is most compact and tidy which results in a bounding polygon that is easier to define and easier to manipulate.  For the user's area of interest the obvious choice is simply the projection the user chose and already drew a nice polygon on.  For the data coverage area the choice may not be so obvious but the best choice is probably the projection the data are already in.
 

Polygon Comparisons

Of course when using bounding polygons instead of bounding boxes the area comparisons can no longer be accomplished via the simple boolean expressions used for the bounding boxes.  Moreover the bounding polygons for the user's area of interest and the data coverage area are likely to be in different projections.  Well developed methods exist for checking if two polygons overlap but those methods only work if the two polygons are in the same coordinate system.  Finding a common projection on which to do the area comparisons could be a problem since the same (lat, lon) corner points on a different projection can define a much different polygon.

Fortunately this is not much of a problem.  One way to handle the area comparisons is to use every point along every edge of the area to be converted as a corner point in the new projection.  This will result in a new polygon in the new projection with hundreds of corners, a strategy that was dismissed above as impractical.  The concerns above, however, were that it is impractical to send hundreds of points describing the user's area of interest from the interface to the server, and it is impractical to store hundreds of points describing the data coverage area for each and every data granule in the database.  Now, however, information about the projection is being sent from the interface to the server, and stored in the database for the server to retrieve.  With this information the server itself can compute the points it needs on the fly and use some polygon comparison heuristic to compare the two areas, which are now terribly well defined, on a common projection.  Since these hundreds of points are neither sent nor stored the objections above don't apply.  Accuracy can be increased arbitrarily by increasing the resolution of both maps and accepting the consequent performance hit.
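The on-the-fly computation could be organized as in the sketch below.  All names are hypothetical; the projection functions are passed in as callables, and the overlap test is one simple heuristic (edge crossing plus vertex containment) chosen for brevity, not necessarily the method a production server would use.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def densify(polygon: Sequence[Point], per_edge: int) -> List[Point]:
    """Add evenly spaced points along every edge (per_edge intervals)."""
    out: List[Point] = []
    n = len(polygon)
    for i in range(n):
        (x0, y0), (x1, y1) = polygon[i], polygon[(i + 1) % n]
        for step in range(per_edge):
            t = step / per_edge
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def reproject(polygon: Sequence[Point],
              inverse_a: Callable[[float, float], Point],
              forward_b: Callable[[float, float], Point]) -> List[Point]:
    """(x, y) on projection A -> (lat, lon) -> (x, y) on projection B."""
    return [forward_b(*inverse_a(x, y)) for x, y in polygon]


def _orient(a: Point, b: Point, c: Point) -> float:
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _cross(p: Point, q: Point, r: Point, s: Point) -> bool:
    """Proper crossing of segments pq and rs (collinear touching ignored)."""
    return (_orient(r, s, p) * _orient(r, s, q) < 0 and
            _orient(p, q, r) * _orient(p, q, s) < 0)


def _contains(poly: Sequence[Point], pt: Point) -> bool:
    """Ray casting point-in-polygon test."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % n]
        if (y0 > y) != (y1 > y) and x < x0 + (y - y0) * (x1 - x0) / (y1 - y0):
            inside = not inside
    return inside


def polygons_overlap(a: Sequence[Point], b: Sequence[Point]) -> bool:
    """True if any edges cross or one polygon contains the other."""
    na, nb = len(a), len(b)
    for i in range(na):
        for j in range(nb):
            if _cross(a[i], a[(i + 1) % na], b[j], b[(j + 1) % nb]):
                return True
    return _contains(a, b[0]) or _contains(b, a[0])
```

The server would densify one polygon on its own projection, reproject it onto the projection of the other, and then call polygons_overlap; the per_edge argument plays the role of the map resolution.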

Another good way to handle the area comparisons is to employ a tiling scheme.  In this scheme the Earth is sliced up into "tiles" and the search algorithm determines the tiles covered by each area.  The two lists are then compared.  Since it is comparing lists, not areas, this scheme doesn't care if the two areas are not on a common projection.  Accuracy can be increased arbitrarily by making the tiles smaller and accepting the consequent performance hit.

Unfortunately determining the tiles covered by a given area on a given projection is often just as difficult as converting it into the corresponding area on a new projection.  Instead of converting from (x, y) on the original projection to (lat, lon) on the earth to (x, y) on the new projection one must convert from (x, y) on the original projection to (lat, lon) on the earth to some tile identifier.  The tile identifier could be anything because the tiles themselves could be completely arbitrary.  Indeed one way to tile the earth is to draw random closed curves on a globe and give all the areas of intersection a unique identifier.  The transformations from (lat, lon) to tile identifier can then be accomplished via a simple lookup table.
 
Obviously that's not a terribly good way to tile the earth if one wishes to control the accuracy of the search. Because the lookup table is static the only way to increase the accuracy of the search is to create another tiling, with smaller tiles, and another lookup table, with more entries. In order to provide a range of accuracy levels a "complete" set of tilings and lookup tables would be required which can quickly become cumbersome.

What's needed, then, is a methodical, extensible, tiling.  Instead of random closed curves on a sphere some extensible method of drawing non-random closed curves is needed.  Quad trees, mentioned above, are one such method. Quad trees use an iterative approach to slice the sphere into four tiles, then sixteen, then sixty-four, and so on.9  Because it's iterative the quad tree method can make the tiles as small and numerous as desired on the fly.  Additionally the quad tree tiling scheme has the advantage of being relatively fast in that it can eliminate up to three fourths of the tiles, because they do not overlap the target area, during any given iteration and ignore them in future iterations.
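A quad tree tiling might be sketched as below.  How the sphere is sliced is not specified here, so this sketch assumes the simplest possible choice, recursively quartering the (lat, lon) rectangle; the function names and the digit-string tile identifiers are hypothetical.

```python
from typing import Callable, List, Tuple

Cell = Tuple[float, float, float, float]  # (south, west, north, east), degrees


def quadtree_tiles(overlaps: Callable[[Cell], bool],
                   depth: int,
                   cell: Cell = (-90.0, -180.0, 90.0, 180.0),
                   tile_id: str = "") -> List[str]:
    """Return identifiers of the depth-level tiles that overlap a target.

    A tile that misses the target is pruned immediately, so none of its
    descendants are examined in later iterations.
    """
    if not overlaps(cell):
        return []
    if depth == 0:
        return [tile_id]
    south, west, north, east = cell
    mid_lat, mid_lon = (south + north) / 2.0, (west + east) / 2.0
    children = [(south, west, mid_lat, mid_lon), (south, mid_lon, mid_lat, east),
                (mid_lat, west, north, mid_lon), (mid_lat, mid_lon, north, east)]
    tiles: List[str] = []
    for digit, child in enumerate(children):
        tiles.extend(quadtree_tiles(overlaps, depth - 1, child, tile_id + str(digit)))
    return tiles


def latlon_box_test(target: Cell) -> Callable[[Cell], bool]:
    """Overlap predicate for a target given as a (lat, lon) box."""
    return lambda c: (c[0] < target[2] and target[0] < c[2] and
                      c[1] < target[3] and target[1] < c[3])
```

Two areas are then compared by intersecting their lists of tile identifiers.  In practice the overlap predicate would test the target area's converted edge points rather than a simple (lat, lon) box.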

Unfortunately the code required to do the transformation from (lat, lon) to tile identifier using quad trees is not as readily available as the map transformation code.  Consequently adopting the quad tree scheme probably entails learning about quad trees and writing the code from scratch.  Moreover, even if the code is available to cannibalize, there is still some extra cost associated with incorporating more new code into the server.  In other words the quad tree scheme requires that the server know about map projections and that it know about quad trees.

An alternative extensible method of drawing non-random closed curves on the sphere is the familiar (lat, lon) coordinate system.  Using (lat, lon) grid cells as the tiles the tiles can be made as small and numerous as desired (limited to the accuracy of the transformation functions) by controlling the round off of the coordinates.  Moreover, if the tile identifier is just the (lat, lon) coordinate of the center, the (lat, lon) to tile identifier transformation is completely unnecessary.  Furthermore, (lat, lon) grid cells are smaller and more numerous the nearer they are to the pole which means this particular tiling scheme is more accurate in the very region where the area comparisons are most problematic.
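The (lat, lon) tiling reduces to rounding, as the following sketch shows.  The function names are hypothetical, and the tile identifier is stored as a pair of integer grid indices (the cell center divided by the cell size) to avoid comparing floating point values; multiplying by the cell size recovers the center coordinate.

```python
from typing import Callable, Iterable, Set, Tuple

Point = Tuple[float, float]
Tile = Tuple[int, int]


def latlon_tile(lat: float, lon: float, cell_deg: float = 1.0) -> Tile:
    """Round a (lat, lon) coordinate to the index of its grid cell."""
    return round(lat / cell_deg), round(lon / cell_deg)


def tiles_for(points_xy: Iterable[Point],
              inverse: Callable[[float, float], Point],
              cell_deg: float = 1.0) -> Set[Tile]:
    """Tiles touched by a set of (x, y) points on some projection.

    inverse converts (x, y) on that projection to (lat, lon); every
    point inside the area, not only the edge, has to be supplied.
    """
    return {latlon_tile(*inverse(x, y), cell_deg) for x, y in points_xy}


def areas_overlap(tiles_a: Set[Tile], tiles_b: Set[Tile]) -> bool:
    """The areas overlap if they share at least one tile."""
    return not tiles_a.isdisjoint(tiles_b)
```

Shrinking cell_deg makes the tiles smaller and the comparison more accurate, at the cost of more points to convert and larger sets to compare.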

The primary disadvantage of this tiling scheme is performance may be poor.  Unlike the quad tree scheme, which only has to perform the (x, y) to (lat, lon) transformation on the corner and edge points of the target area in order to translate the area to the sphere, the (lat, lon) tiling scheme has to perform the (x, y) to (lat, lon) transformation on every point within the target area as well.  So while the (lat, lon) to tile identifier step is eliminated the number of points that must be converted from (x, y) to (lat, lon) is increased by an order of magnitude.  But, as noted above, "poor" is a relative term.  If the server is run on a sufficiently fast machine this performance hit may mean a single area comparison takes a millisecond instead of a microsecond.
 

Conclusion

The recurring theme throughout this paper has been that being explicit about the projection is not a panacea, but it helps.  The problems discussed above arise because the lat/lon bounding box (and lat/lon polygon) terminology masks the importance of the projection.  Using the terms "lat" and "lon" to describe the (x, y) coordinates of the ECE projection causes an unhealthy, and often unrecognized,  reliance on the ECE projection which cripples the development effort from the beginning.  The x/y bounding box has the singular advantage of making it obvious what's going on.  Once the paradigm has shifted from the lat/lon bounding box to the x/y bounding box the importance of the projection becomes clear.  Often this is interpreted as adding to the confusion unnecessarily but in fact the x/y bounding box merely points out the source of the confusion.

The source of the confusion is the perception, common among developers with little cartographic experience, that by using (lat, lon) coordinates they are using "Earth" coordinates and consequently are "projection neutral".  This unrecognized reliance on the ECE projection is worse than if it were just explicitly stated up front.  When problems arise, exotic, or sometimes capricious, "solutions" are implemented because the cause of the problem is misidentified.  And since the cause is misidentified, the "solutions" are merely work-arounds that tend to cause more problems later on.

False negatives can be completely eliminated by making sure the area used to represent the target area (the user's area of interest or the data coverage area) is always larger than the target area itself.  Unfortunately no similarly absolute method exists for eliminating false positives.  False positives can be minimized, however, if the interface sends information about the user's chosen projection to the server and information about the projection the data are in is stored in the database for the server to retrieve.  The server can then be made to use this information to perform more accurate area comparisons in a variety of ways:

Unfortunately, all these options require making the entire system, and especially the server, aware of map projections.  Consequently there will be some additional costs incurred in terms of both programming time and search time.  We live on a three-dimensional planet, but the maps the users draw their area of interest on, and the data themselves, are two-dimensional.  So both the search criteria and the search results are projected.  Ignore the projection at your own risk.

Notes

  1. Knowles, Kenneth W., 19?? "Points, Pixels, Grids, and Cells; A Mapping and Gridding Primer", National Snow and Ice Data Center, http://www-nsidc.colorado.edu/NASA/GUIDE/docs/reference_documents/ppgc.html.
  2. E.g.: a polygon, a circle, etc.  This paper will only discuss rectangles since, mutatis mutandis, the same arguments apply to other shapes.
  3. Snyder, John P. 1987. Map Projections, A Working Manual. U. S. Geological Survey Professional Paper 1395. Department of the Interior.  Washington, D.C. p. 89.
  4. Except for the Mercator projection, with which the ECE projection is often confused, and other cylindrical projections.
  5. Depending on what the interests of the scientists involved are, and what the needs of the Captain are.  For example the scientists may want the plane to fly a straight line but weather conditions may force the Captain to alter course.  Alternatively the scientists may be interested in some particular feature of the terrain which would dictate a non-linear flight path.
  6. This rule is true for Mercator, Sinusoidal, Azimuthal, Orthographic, and Polar Stereographic projections among others.  In these projections the graticule lines are all continuous so any line inside the rectangle must cross an edge unless the rectangle completely encloses that graticule.  But if a rectangle completely encloses a given graticule, say 85 degrees north, the pole must also be inside the rectangle because 85 degrees north surrounds the pole.  Some odd projections for which this rule is not true are: Goode's Interrupted Homolosine and the Interrupted Sinusoidal, both of which are discontinuous.
  7. E.g.: The eight Distributed Active Archive Centers involved in NASA's Earth Observing System - Data Inventory System, The eight data providers hosting the Naval Research Laboratory's Master Environmental Library, The twelve international partners involved in the CINTEX project, and the many Data Centers involved in the NOAA-Server project.
  8. Some good map transformation libraries are: Proj4 and GCTP, both available from the U.S. Geological Survey, and mapx, available from the National Snow and Ice Data Center.
  9. For a tiling scheme that results in square tiles of equal area (save for those tiles that touch the pole) see:  Tobler, Waldo and Zi-tan Chen, 1986, "A Quadtree for Global Information Storage." Geographical Analysis, Vol. 18 No. 4 (October), Ohio State University Press, pp. 360-371.