GEOWeb: Hybrid InterProcess Communication


Scope

The Canada Centre for Remote Sensing recently completed and published the Ground Control DataBase (GCDB), a collection of over 37000 geo-corrected high-accuracy images of the Canadian landmass, derived from aerial photography.

The CEONet system was chosen as the web-based infrastructure to provide nation-wide user-level access and search/browse functionality to the dataset. CEONet provides a geo-spatial Internet gateway of data, tools and expertise.

Initial concerns from the Geomatics community included: Given this feedback, the objectives were clear in order to improve GCDB's applicability and usability:

The objective of this study is to design access mechanisms for point elevation data, so that users can download the data, or access and view it quickly, in large quantities if desired. XML will also be explored as a means of effective data transfer and dissemination.

Existing Layout / Data Access

The CEONet implementation of GCDB, designed for light online browsing, lacks the bulk download capacity of other GIS data depot-type websites.

The basic layout of the GCDB dataset on CEONet is as follows:
Data Type:    Geo-corrected aerial photos with ground control points
Data Format:  GeoTIFF image files with corresponding textual geo-positional information; JPEG files for web viewing
Access Nodes: CEONet: viewable over Web API, downloadable by chip ID

A full FGDC metadata record is included in the CEONet dataset description of GCDB, with further details on accuracy, precision, origination, distribution, etc. The metadata is based on the FGDC Content Standard for Digital Geospatial Metadata.

Data Storage

The size of the entire GCDB dataset, including imagery and attribute files, is over 10GB. This quantity may be unmanageable for the average user, and loading all the imagery data into a GIS environment may be awkward. CEONet helps display the data geospatially, but it is not a true GIS environment. The figure below represents the physical access nodes of GCDB:

[Existing GCDB Access Nodes]
Figure 1 - Existing GCDB Access Nodes

As the diagram indicates, there is no forward communication between a user desktop and the various access nodes of GCDB. The user can search, browse, and access data from the nodes, but cannot make requests from their internal GIS environment unless they download the entire dataset, including point info and imagery. Also, the 'single product' request illustrates the lack of efficient viewing and downloading of multiple data products.

What is needed, given the existing infrastructure, is a user-access mechanism that lets users work with the GCDB in a variety of application environments and make use of the data in a variety of ways.

Enter GeoGratis - Integration

GeoGratis is CCRS' web and file transfer site that distributes geo-spatial data in large quantities. Users can download data through HTTP or FTP protocols.

Data on this website is free of charge, which makes it a perfect home for the public-domain GCDB. Taking into account the open-ended design of the site, figure 2 illustrates the architecture prototype. This design, utilizing GeoGratis, is more flexible for the end-user of the dataset: it provides options to download data in large volumes while still maintaining linkages to the CEONet and GCDB servers.

[GCDB Design Architecture]
Figure 2 - GCDB Design Architecture

Acceptance of this design enabled a download area page with the following entries:

All data is available for both the entire dataset and by province, in the form of point locations. The next step is the actual data reformatting, structuring and dissemination.

Setting up GCDB on GeoGratis - Data Layout

The GCDB attribute records were delivered to CEONet as a comma-delimited text file. CEONet takes the textual information and calls a CGI program to display the data online with the text records as input parameters. Since some fields were CEONet-specific, it was decided to filter the input file to make a cleaner output file.

Designing the original output data format as well as the CGI program for CEONet gave a head start on designing output products for GeoGratis.

Here's the layout of the GCDB data from the input file, as a classic C struct:

typedef struct {         /* DESCRIPTION                      */
/*-----------------------------------------------------------*/
  char  *Image_id;       /* filename; unique identifier      */
  char  *ac_date;        /* acquisition date of raw photo    */
  char  *Image_nts;      /* corresponding NTS mapsheet       */
  int   no_lines;        /* number of rows in image          */
  int   no_pixels;       /* number of columns in image       */
  float UL_lat;          /* Upper left latitude              */
  float UL_long;         /* Upper left longitude             */
  float UR_lat;          /* Upper right latitude             */
  float UR_long;         /* Upper right longitude            */
  float LR_lat;          /* Lower right latitude             */
  float LR_long;         /* Lower right longitude            */
  float LL_lat;          /* Lower left latitude              */
  float LL_long;         /* Lower left longitude             */
  int   GCP_line;        /* GCP row location                 */
  int   GCP_pixel;       /* GCP column location              */
  float GCP_lat;         /* GCP latitude                     */
  float GCP_long;        /* GCP longitude                    */
  int   GCP_elev;        /* GCP elevation                    */
  float GCP_errX;        /* GCP x error                      */
  float GCP_errY;        /* GCP y error                      */
  float GCP_errZ;        /* GCP z error                      */
  float LL_UTM_northing; /* Lower left northing              */
  float LL_UTM_easting;  /* Lower left easting               */
  int   utm_zone;        /* corresponding UTM zone for chip  */
} GCDB_chip_rec;

At this point, all the information is cleanly laid out to parse as desired.
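
As a rough illustration of how such a record might be parsed, here is a minimal JavaScript sketch that splits one comma-delimited line into named fields, in the order given by the layout above. The function name parseChipRecord is hypothetical, and the corner and GCP latitude/longitude values in the sample line are made up for illustration; only the other values are taken from the example URL shown later in this paper.

```javascript
// Field order follows the GCDB_chip_rec layout.
const GCDB_FIELDS = [
  "Image_id", "ac_date", "Image_nts", "no_lines", "no_pixels",
  "UL_lat", "UL_long", "UR_lat", "UR_long", "LR_lat", "LR_long",
  "LL_lat", "LL_long", "GCP_line", "GCP_pixel", "GCP_lat", "GCP_long",
  "GCP_elev", "GCP_errX", "GCP_errY", "GCP_errZ",
  "LL_UTM_northing", "LL_UTM_easting", "utm_zone"
];

// These three fields are text; everything else is numeric.
const STRING_FIELDS = new Set(["Image_id", "ac_date", "Image_nts"]);

// Hypothetical helper: turn one input line into an object keyed by field name.
function parseChipRecord(line) {
  const values = line.trim().split(",");
  if (values.length !== GCDB_FIELDS.length) {
    throw new Error("expected " + GCDB_FIELDS.length +
                    " fields, got " + values.length);
  }
  const rec = {};
  GCDB_FIELDS.forEach(function (name, i) {
    rec[name] = STRING_FIELDS.has(name) ? values[i] : Number(values[i]);
  });
  return rec;
}

// Sample line: corner and GCP lat/long values are illustrative only.
const sampleLine =
  "N50d42mW109d60m_1,1991-06-10,72K12,397,415," +
  "50.70,-110.04,50.70,-109.96,50.66,-109.96,50.66,-110.04," +
  "198,199,50.68,-110.00,729,0.38,0.58,0.23,5612420.0,568110.0,12";

const sampleRecord = parseChipRecord(sampleLine);
console.log(sampleRecord.Image_id, sampleRecord.GCP_elev);
```

Checking the field count up front means a truncated or malformed line fails loudly instead of silently shifting values into the wrong fields.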

Perl to the Rescue

Perl is a beautiful programming language specializing in text processing and reporting, and its powerful text parsing functionality is a perfect fit for reformatting the GCDB records. The applicable fields were extracted to create a filtered output table. At the same time, a new field was added to each record, holding a URL built from the record's values to match the CGI program's API requirements for displaying the chip information correctly. This field's usefulness will become more apparent later in the paper.

Here's the Perl script to reformat the data for GeoGratis:

 1  #!/usr/bin/perl -w
 2
 3  use strict;
 4
 5  my $server_string = "http://lambert.ccrs.nrcan.gc.ca";
 6
 7  open(IN,"out.filetext.2") or die "$!\n";
 8
 9  open(OUT,">gcdb_allpts.txt") or die "$!\n";
10
11  print OUT "ImageID,AcqDate,NTSSheet,LatDD,LongDD,Elev," .
12    "errX,errY,errZ,hotlink\n";
13
14  while (<IN>) {
15    chomp;
16    my($image_id,$acq_date,$nts_sheet,$no_lines,$no_pixels,
17    $UL_lat,$UL_long,$UR_lat,$UR_long,$LR_lat,$LR_long,
18    $LL_lat,$LL_long,$GCP_line,$GCP_pixel,$GCP_lat,$GCP_long,
19    $GCP_elev,$GCP_errX,$GCP_errY,$GCP_errZ,$LL_UTM_northing,
20    $LL_UTM_easting,$utm_zone) = split /,/, $_;
21
22    my $url = "$server_string/cgi-bin/gcdb/viewimage2.cgi?" .
23      "data=$image_id:$no_lines:$no_pixels:$GCP_line:$GCP_pixel:" .
24      "$GCP_elev:$GCP_errX:$GCP_errY:$GCP_errZ:$LL_UTM_northing:" .
25      "$LL_UTM_easting:$utm_zone:$acq_date:$nts_sheet";
26
27    print OUT "$image_id,$acq_date,$nts_sheet,$GCP_lat," .
28      "$GCP_long,$GCP_elev,$GCP_errX,$GCP_errY,$GCP_errZ,$url\n";
29
30  }
31
32  close(IN);
33  close(OUT);
  

Lines 7-12 open the input and output files and write out the field headers. Lines 14-20 read the input file line by line and split each record into variables. Lines 22-25 build a URL that corresponds to the GCDB API on the CCRS server. Lines 27-28 write each record to the output file.

At this point, we have a fully functional text file with all the records needed.
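
For readers more at home in JavaScript, the two string-building steps of the script can be sketched as follows. This is an illustration rather than the code used for GeoGratis: the function names buildHotlink and buildOutputRow are hypothetical, and the record is assumed to be an object of strings keyed by the field names of the GCDB layout (strings are kept so that values such as 5612420.0 retain their original formatting). The GCP_lat and GCP_long values in the example record are made up.

```javascript
const SERVER = "http://lambert.ccrs.nrcan.gc.ca";

// Build the colon-separated 'data' parameter in the order the Perl script uses.
function buildHotlink(rec) {
  const parts = [
    rec.Image_id, rec.no_lines, rec.no_pixels, rec.GCP_line, rec.GCP_pixel,
    rec.GCP_elev, rec.GCP_errX, rec.GCP_errY, rec.GCP_errZ,
    rec.LL_UTM_northing, rec.LL_UTM_easting, rec.utm_zone,
    rec.ac_date, rec.Image_nts
  ];
  return SERVER + "/cgi-bin/gcdb/viewimage2.cgi?data=" + parts.join(":");
}

// Build one comma-delimited output row, ending with the hotlink URL.
function buildOutputRow(rec) {
  return [
    rec.Image_id, rec.ac_date, rec.Image_nts, rec.GCP_lat, rec.GCP_long,
    rec.GCP_elev, rec.GCP_errX, rec.GCP_errY, rec.GCP_errZ,
    buildHotlink(rec)
  ].join(",");
}

// Example record; GCP_lat and GCP_long are illustrative values only.
const exampleRec = {
  Image_id: "N50d42mW109d60m_1", ac_date: "1991-06-10", Image_nts: "72K12",
  no_lines: "397", no_pixels: "415", GCP_line: "198", GCP_pixel: "199",
  GCP_lat: "50.68", GCP_long: "-110.00", GCP_elev: "729",
  GCP_errX: "0.38", GCP_errY: "0.58", GCP_errZ: "0.23",
  LL_UTM_northing: "5612420.0", LL_UTM_easting: "568110.0", utm_zone: "12"
};

console.log(buildOutputRow(exampleRec));
```

Because the URL uses colons rather than commas between values, it can sit in the last column of the comma-delimited file without breaking the row.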

ArcView

The file was imported into ArcView ( Add Table ), then into a View ( Theme | Add Event Theme ), to map the XY positions. The resulting coverage was then saved as a shapefile for good measure.

[GCDB in ArcView GIS]
Figure 3 - GCDB in ArcView GIS

To create various data profiles, the newly created shapefile was overlaid on base polygon shapefile layers of Canada, divided by province and territory (including Nunavut). The Theme | Select By Theme function was then used for each region to output regional point shapefiles. The resulting regional views were exported to image files for basic overview imagery.

GCDB Sample ArcView Scripting

The project file attached illustrates a sample GCDB environment, with the concepts of this document, along with some sample Avenue scripts to showcase the data. The GCDB point data for Ontario is included in the project.

The ArcView GIS environment supplies two sample scripts under the user-defined GCDB menu:

GCDB | Point Summary

This script is available through both the view menu and view popup dialogs. It takes the user selected point data and runs some basic statistical calculations on them to derive frequency and average values. The values returned help give a general overview of the terrain in the region of interest selected.

[Figure 4 - Point Summary Script]
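
The kind of calculation involved can be sketched in a few lines. The block below is not the Avenue script from the project file; it is a minimal JavaScript sketch, assuming each selected point is an object with a numeric Elev property (the column name used in the output text file), and the function name summarizePoints is hypothetical.

```javascript
// Hypothetical helper: count, mean, minimum and maximum elevation
// for a list of selected points.
function summarizePoints(points) {
  if (points.length === 0) {
    return { count: 0, meanElev: null, minElev: null, maxElev: null };
  }
  let sum = 0;
  let min = Infinity;
  let max = -Infinity;
  for (const p of points) {
    sum += p.Elev;
    if (p.Elev < min) min = p.Elev;
    if (p.Elev > max) max = p.Elev;
  }
  return {
    count: points.length,
    meanElev: sum / points.length,
    minElev: min,
    maxElev: max
  };
}

// Illustrative selection, not real GCDB values.
const selection = [{ Elev: 100 }, { Elev: 200 }, { Elev: 300 }];
console.log(summarizePoints(selection));
```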

GCDB | FindGCDBPoints

This dialog, constructed with the ESRI Dialog Designer, prompts the user for a detailed query specific to the GCDB dataset, such as custom elevation and RMS error field values extracted from the data. If a selection is made on the dataset in the view, the dialog will query the selection, else the dialog will query the entire set. The dialog returns the selected data (if any) to the view for hotlinking to the GCDB API.

[Figure 5 - FindGCDBPoints GUI]
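
The query logic described above can be approximated as follows. This is a sketch only, written in JavaScript rather than Avenue, and it makes two assumptions that the project file may not share: the query is expressed as an elevation range plus a maximum RMS error, and the RMS error of a point is taken as the root mean square of its errX, errY and errZ values. The names findGcdbPoints and rmsError are hypothetical.

```javascript
// Assumption: RMS error is the root mean square of the three error fields.
function rmsError(p) {
  return Math.sqrt((p.errX * p.errX + p.errY * p.errY + p.errZ * p.errZ) / 3);
}

// Query the current selection if there is one, otherwise the whole set.
function findGcdbPoints(allPoints, query, selection) {
  const pool = (selection && selection.length > 0) ? selection : allPoints;
  return pool.filter(function (p) {
    return p.Elev >= query.minElev &&
           p.Elev <= query.maxElev &&
           rmsError(p) <= query.maxRms;
  });
}

// Illustrative points, not real GCDB values.
const demoPoints = [
  { ImageID: "a", Elev: 100, errX: 0.1, errY: 0.1, errZ: 0.1 },
  { ImageID: "b", Elev: 500, errX: 0.1, errY: 0.1, errZ: 0.1 },
  { ImageID: "c", Elev: 150, errX: 5.0, errY: 5.0, errZ: 5.0 }
];
const demoQuery = { minElev: 50, maxElev: 200, maxRms: 1 };

console.log(findGcdbPoints(demoPoints, demoQuery, []));
```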

Of course, the data is open to numerous other possibilities, including grid / raster development (interpolation, hillshading, etc.), but these are beyond the scope of this paper.

As a result, the GeoGratis website was now complete with full and partial dataset records (textual attributes) of the GCDB, as per figure 2. With the data properly parsed and organized, users visiting the GeoGratis site can now quickly:

As illustrated, users can also import the data into their GIS environment relatively seamlessly. The data format and parsing is clean, concise and documented openly for efficient operations.

[GeoGratis Web Display]
Figure 4 - GeoGratis Web Display

All functionality is available for both the entire dataset and by province, in the form of point locations (note: as of this writing, the above page design was submitted as prototype, and may or may not be altered by CCRS' publishing team; functionality will remain consistent).

Although the data points are useful for a variety of applications, they were initially developed with corresponding imagery for orientation. For example, when trying to correlate a data point with map layers, it is helpful to have reference imagery that shows a known point visible to the user. The text used for the data download was derived from the metadata files. The average size of a browse image, in GeoTIFF format, is 172KB.

172KB/image * 37439 products = 6.43GB

The size of the data, including the imagery, proved too large for users to download in bulk, in terms of bandwidth and network traffic. However, the dataset is more functional when the imagery is integrated with the display of the point data. The GIS environment is configured; what is needed now is to link the imagery to the point data. ESRI ArcView GIS will be used to illustrate.

Web Integration

The design architecture was built such that the user can make dynamic requests to GCDB from their point shapefiles, to view the image data online, without downloading any image data. The diagram below shows the framework:

[User Desktop to GCDB Communication]
Figure 5 - User Desktop to GCDB Communication

The request is sent through an Avenue script within the ArcView environment. The URL written out earlier in the output text files (and subsequent shapefiles) represents a web address to the GCDB CGI program which displays the data in JPEG format to a web browser, along with positional and other ancillary information. All imagery data is stored on the CCRS GCDB data server.

What follows is a detailed look at how this happens:

ArcView Hotlink Setup

The first step is to set up the hotlink properties of the point theme so that they use the URL field we wrote out earlier. The URL field represents a web address to the GCDB CGI program.

[ArcView Hotlink Setup]
Figure 6 - ArcView Hotlink Setup

Now the hotlink button [AV hotlink button] is active whenever the point theme is active. Clicking on a point calls the Avenue script, which in turn calls the operating system's default web browser and opens the URL specified in the 'hotLink' field of the theme's attribute table.

Here's the Avenue script that triggers the CGI program:

' default window header

myHdr = "HotLink to the Internet"

' Test Operating system type, setup DLL's for appropriate system.
' Works with Win32 only

if (System.GetOSVariant = #SYSTEM_OSVARIANT_MSWNT) then
  dllShell32 = DLL.Make("C:\winnt\system32\shell32.dll".AsFileName)
  dllUser32  = DLL.Make("C:\winnt\system32\user32.dll".AsFileName)
elseif (System.GetOSVariant = #SYSTEM_OSVARIANT_MSW95) then
  dllShell32 = DLL.Make("C:\windows\system\shell32.dll".AsFileName)
  dllUser32  = DLL.Make("C:\windows\system\user32.dll".AsFileName)
end
if ((dllShell32 = nil) or (dllUser32 = nil)) then
  MsgBox.Error("Can't find required DLL's.  Check your setup.", myHdr)
  exit
end

' setup Win32API to talk with Avenue

dpGetActiveWindow = DLLProc.Make(dllUser32, "GetActiveWindow", #DLLPROC_TYPE_INT32, {})

dpShellExecute = DLLProc.Make (
                  dllShell32, "ShellExecuteA",
                  #DLLPROC_TYPE_INT32, {
                    #DLLPROC_TYPE_INT32,
                    #DLLPROC_TYPE_STR,
                    #DLLPROC_TYPE_STR,
                    #DLLPROC_TYPE_STR,
                    #DLLPROC_TYPE_STR,
                    #DLLPROC_TYPE_INT32
                  }
                 )

' Get window handle of ArcView window

activeWin = DLL.GetAVWindowHandle

' Get the URL off clicked area of theme

hotLink = SELF

' Send info to default browser

hotLinkToBrowser = dpShellExecute.Call({ActiveWin, "Open", hotLink, myHdr, FileName.GetCWD.AsString, 1})

if (hotLinkToBrowser <= 32) then
  MsgBox.Warning("Hotlink failed", myHdr)
end
  
GCDB API Setup

The GCDB CGI program takes in a strict ruleset of parameters to display the image. An example incoming URL looks like:

http://lambert.ccrs.nrcan.gc.ca/cgi-bin/gcdb/viewimage2.cgi?data=N50d42mW109d60m_1:397:415:198:199:729:0.38:0.58:0.23:5612420.0:568110.0:12:1991-06-10:72K12

..which decodes to:

  data=
  Image_id
  no_lines
  no_pixels
  GCP_line
  GCP_pixel
  GCP_elev
  GCP_errX
  GCP_errY
  GCP_errZ
  LL_UTM_northing
  LL_UTM_easting
  utm_zone
  ac_date
  Image_nts
  
..as per the GCDB data layout mentioned earlier.

The CGI program accepts a parameter named 'data' and uses the values to build the view window to display the chip correctly. Values are colon-separated, both to avoid clashing with the commas that ArcView's table text import uses as field delimiters and so that the GCDB API can split them easily.
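
To make the decoding step concrete, here is a minimal JavaScript sketch that pulls the 'data' parameter out of a hotlink URL and splits it into named values. It is not the CGI program itself; the function name decodeDataParam is hypothetical, and the field order is the one used by the Perl script that builds the URL. It relies on the standard URL class available in current browsers and Node.js.

```javascript
// Field order of the colon-separated 'data' parameter,
// as written by the Perl script that builds the hotlink URL.
const DATA_FIELDS = [
  "Image_id", "no_lines", "no_pixels", "GCP_line", "GCP_pixel",
  "GCP_elev", "GCP_errX", "GCP_errY", "GCP_errZ",
  "LL_UTM_northing", "LL_UTM_easting", "utm_zone",
  "ac_date", "Image_nts"
];

// Hypothetical helper: decode a hotlink URL into an object of strings.
function decodeDataParam(url) {
  const data = new URL(url).searchParams.get("data");
  if (data === null) {
    throw new Error("missing 'data' parameter");
  }
  const values = data.split(":");
  if (values.length !== DATA_FIELDS.length) {
    throw new Error("expected " + DATA_FIELDS.length +
                    " values, got " + values.length);
  }
  const out = {};
  DATA_FIELDS.forEach(function (name, i) {
    out[name] = values[i];
  });
  return out;
}

const exampleUrl =
  "http://lambert.ccrs.nrcan.gc.ca/cgi-bin/gcdb/viewimage2.cgi?data=" +
  "N50d42mW109d60m_1:397:415:198:199:729:0.38:0.58:0.23:" +
  "5612420.0:568110.0:12:1991-06-10:72K12";

console.log(decodeDataParam(exampleUrl));
```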

The program then sets its output content type to text/html and writes a DHTML document to the browser.

DHTML, CSS, JavaScript - Geographic Page Rendering

When the page initially loads, a browser check determines the resizing parameters needed to display the image and page information with the correct dimensions. The image width and height determine the placement of the image and the offsets of the other page elements, which are positioned relative to it.

The input boxes above the image display the geographic position of the area clicked or hovered over by the user's mouse. Clicking the image dynamically places a tiny layer that marks the click position and also updates the click input boxes. The layering is done with JavaScript, which builds dynamic layers when needed.

The actual position processing is interesting: the GCDB was processed in UTM, but the desired output coordinate format was geographic decimal degrees. Initially, it was planned to compute the four corners of the image, then derive the lat/long from the corners. However, the pixel spacing would become distorted with the shifting of coordinates unless the image itself was reprocessed.

It was decided to use the lower left UTM coordinates, passed from the preceding Avenue code, to first identify the position in UTM along the image:

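// Despite the names, tmpx holds the UTM northing and tmpy the UTM easting.
// Row 0 is the top of the image, so the northing decreases as y grows.
var tmpx =  ((LL_UTM_northing + (no_lines - 1) * res)) - (y * res);
var tmpy =  LL_UTM_easting + (x * res);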
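
The same calculation can be wrapped in a small function, which makes the northing/easting roles explicit. This is a sketch under assumptions: the function name pixelToUtm is hypothetical, and the pixel resolution of 10 metres used in the example is a placeholder, since the resolution of the chips is not stated here.

```javascript
// Hypothetical helper: convert an image pixel (x = column, y = row, with
// row 0 at the top) to UTM coordinates from the lower-left corner.
function pixelToUtm(x, y, img) {
  const northing =
    img.LL_UTM_northing + (img.no_lines - 1) * img.res - y * img.res;
  const easting = img.LL_UTM_easting + x * img.res;
  return { northing: northing, easting: easting };
}

// Values from the example URL; res (metres per pixel) is an assumed placeholder.
const chip = {
  LL_UTM_northing: 5612420.0,
  LL_UTM_easting: 568110.0,
  no_lines: 397,
  res: 10
};

console.log(pixelToUtm(0, 396, chip));   // bottom-left pixel
console.log(pixelToUtm(199, 198, chip)); // GCP pixel and line from the example
```

With these inputs, the bottom-left pixel maps back to the lower-left UTM corner itself, which is a quick sanity check on the row direction.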
  

The updated coordinates are then passed to a JavaScript function to convert UTM to geographic coordinates. Since JavaScript runs in the client's browser, the server does not incur the cost of this processing, resulting in a fast web application. Here's a look at the function call.

var GeoXY = new Array();
GeoXY = utm2ll(tmpx,tmpy,utm_zone);
  

...and the routine:

function utm2ll(UTMNorthing, UTMEasting, UTM_Zone) {
var PI = 3.1415926536;
var k0 = 0.9996;
var a = 6378137.0;
var eccSquared = 0.00669438;
var eccPrimeSquared;
var e1 = (1-Math.sqrt(1-eccSquared))/(1+Math.sqrt(1-eccSquared));
var N1, T1, C1, R1, D, M;
var LongOrigin;
var mu, phi1, phi1Rad;
var x, y;
var ZoneNumber;
var NorthernHemisphere; //1 for northern hemisphere, 0 for southern

x = UTMEasting - 500000.0; //remove 500,000 meter offset for longitude
y = UTMNorthing;

ZoneNumber = UTM_Zone;
NorthernHemisphere = 1;//point is in northern hemisphere

LongOrigin = (ZoneNumber - 1)*6 - 180 + 3;  //+3 puts origin in middle of zone

eccPrimeSquared = (eccSquared)/(1.0-eccSquared);

M = y / k0;
mu = M/(a*(1.0-eccSquared/4-3*eccSquared*eccSquared/64-5*eccSquared*eccSquared*
eccSquared/256));

phi1Rad = mu + (3*e1/2-27*e1*e1*e1/32)*Math.sin(2*mu) +
(21*e1*e1/16-55*e1*e1*e1*e1/32)*Math.sin(4*mu)+(151*e1*e1*e1/96)*Math.sin(6*mu);
phi1 = phi1Rad * (180 / PI);

N1 = a/Math.sqrt(1-eccSquared*Math.sin(phi1Rad)*Math.sin(phi1Rad));
T1 = Math.tan(phi1Rad)*Math.tan(phi1Rad);
C1 = eccPrimeSquared*Math.cos(phi1Rad)*Math.cos(phi1Rad);
R1 = a*(1-eccSquared)/Math.pow(1-eccSquared*Math.sin(phi1Rad)*Math.sin(phi1Rad), 1.5);
D = x/(N1*k0);

var GeoLat = phi1Rad - (N1*Math.tan(phi1Rad)/R1) *
(D*D/2-(5+3*T1+10*C1-4*C1*C1-9*eccPrimeSquared)*D*D*D*D/24+
(61+90*T1+298*C1+45*T1*T1-252*eccPrimeSquared-3*C1*C1)*D*D*D*D*D*D/720);

GeoLat = GeoLat * (180 / PI);

var GeoLong = (D-(1+2*T1+C1)*D*D*D/6+
(5-2*C1+28*T1-3*C1*C1+8*eccPrimeSquared+24*T1*T1)*D*D*D*D*D/120)/Math.cos(phi1Rad);

GeoLong = LongOrigin + (GeoLong) * (180 / PI);

var Coords = new Array(GeoLat,GeoLong);
return Coords;
}
  

The function is a JavaScript implementation of the algorithm in "Map Projections--A Working Manual", by John P. Snyder.

As a result, standard geographic coordinates are always returned to the user window.
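
One step of the routine that is easy to check by hand is the longitude origin, i.e. the central meridian of the UTM zone. The sketch below isolates that expression in a hypothetical helper, utmCentralMeridian, with a range check added for illustration.

```javascript
// Central meridian (degrees) of a UTM zone: zones are 6 degrees wide,
// zone 1 starts at 180 W, and +3 moves to the middle of the zone.
function utmCentralMeridian(zone) {
  if (!Number.isInteger(zone) || zone < 1 || zone > 60) {
    throw new RangeError("UTM zone must be an integer from 1 to 60");
  }
  return (zone - 1) * 6 - 180 + 3;
}

console.log(utmCentralMeridian(12)); // zone of the example chip
console.log(utmCentralMeridian(17));
```

For zone 12, the zone of the example chip, this gives 111 degrees west; the routine then adds the easting-derived offset to that origin.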

The online application also displays the ground control point on which the image (and the shapefile data) is based. The GCP line and pixel are passed in through the URL opened by the Avenue script, and the page then runs the same routine to derive the lat/long of the GCP. This function also displays a dynamic layer showing the coordinates, alongside the tiny layer that marks the selection. The user can toggle this point to display or hide it.

The image below shows this as an example in a live situation. For hometown's sake, I chose a GCP in the downtown Toronto area. Click on the image below to see the full view:

[GCDB in a Hybrid GIS/Web Environment]
Figure 7 - GCDB in a Hybrid GIS/Web Environment

This gives the user a more interactive GIS session, with the least amount of data redundancy and storage.

Hands-Off Total Network Data Connectivity

A fuller approach to data centralization and dynamic data exchange, using network and database protocols, involves a driver and database setup for the point data. I loaded the entire dataset into a MySQL database on a central server, then set up global user access with username/password privileges for data selection only. The central host server thus holds all the data, in lieu of users downloading it to their desktops. On the client side, I set up an ODBC connection with a MySQL database driver and added a user DSN with the hostname, database, username and password as parameters, so that a connection can be made through the ArcView Project | SQL Connect function. Below is the access mechanism one would encounter with this architecture.
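
Since users are granted selection privileges only, everything a client sends to the central database is a SELECT statement. As an illustration of the kind of request involved, here is a JavaScript sketch that builds a parameterized bounding-box query. The table name gcdb_points is hypothetical (the schema is not described here), the column names are assumed to match the header of the output text file, and the function name buildBoundingBoxQuery is likewise an assumption.

```javascript
// Hypothetical helper: build a parameterized SELECT for points inside a
// latitude/longitude box. Table name and columns are assumptions.
function buildBoundingBoxQuery(bbox) {
  const keys = ["minLat", "maxLat", "minLong", "maxLong"];
  for (const k of keys) {
    if (!Number.isFinite(bbox[k])) {
      throw new TypeError(k + " must be a finite number");
    }
  }
  return {
    sql: "SELECT ImageID, LatDD, LongDD, Elev, hotlink FROM gcdb_points " +
         "WHERE LatDD BETWEEN ? AND ? AND LongDD BETWEEN ? AND ?",
    params: [bbox.minLat, bbox.maxLat, bbox.minLong, bbox.maxLong]
  };
}

// Illustrative box, roughly around the example chip.
const boxQuery = buildBoundingBoxQuery({
  minLat: 50.5, maxLat: 51.0, minLong: -110.5, maxLong: -109.5
});
console.log(boxQuery.sql, boxQuery.params);
```

Keeping the values in a separate parameter list, rather than pasting them into the SQL text, leaves quoting to whichever database driver executes the query.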

[GIS / Web Integration via ODBC]
Figure 8 - GIS / Web Integration via ODBC

This approach takes absolutely *no* data to the user's desktop; all data is accessed through the central database. With a plain ArcView client, the GCDB is now fully accessible, just as if users had the data downloaded onto their own hard drives.

So, at this point we have successfully woven together ArcView GIS and the Web. ArcView can successfully display and project the point elevation data. The resulting theme is configured for hotlinkable objects, linking to an Avenue script, which calls the default web browser with the specified URL. Given the URL, a CGI script takes the parameters from the query string and displays and resizes the page according to the dimensions of the imagery. The parameters passed also aid in setting up the layering effects of the output webpage for dynamic geo-position information.

Embracing XML - Revisiting GCDB Metadata

The GCDB metadata record was compiled with input from various project members and project logs/information. Presently, the metadata is posted on the web as plain text. Using eXtensible Markup Language (XML), one can format the data in any way they wish, defining their own document type definitions. XML goes beyond HTML in that users can define their own tags, entities and attributes, creating 'self-describing data'. Geography Markup Language (GML) is built on XML, and encodes geographic entities (geo-objects) for flexible data interchange in GIS environments. More information on the GML specification is at www.opengis.org.

Here is an XML formatted document of the GCDB metadata information for the entire dataset. Only required fields were used, for brevity. XML is flexible enough to let you define your own documents with a Document Type Definition (DTD). This way, organizations can interchange data openly and clearly, validating it in accordance with the XML and DTD specifications. Again, with the open-ended architecture, the possibilities are endless.
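
To give a feel for how plain-text metadata fields might be turned into such a document, here is a minimal JavaScript sketch that writes name/value pairs as XML elements, escaping the characters XML reserves. The element names (metadata, title, originator) are hypothetical and are not taken from the FGDC standard or from the GCDB record; the function names are assumptions as well.

```javascript
// Escape the five characters that have special meaning in XML.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: serialize a flat object as child elements of a root.
function toXml(rootName, fields) {
  const lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<" + rootName + ">"];
  for (const [name, value] of Object.entries(fields)) {
    lines.push("  <" + name + ">" + escapeXml(value) + "</" + name + ">");
  }
  lines.push("</" + rootName + ">");
  return lines.join("\n");
}

// Element names are illustrative only.
const metadataXml = toXml("metadata", {
  title: "Ground Control DataBase (GCDB)",
  originator: "Canada Centre for Remote Sensing"
});
console.log(metadataXml);
```

A real record would be validated against its DTD; this sketch only shows the serialization step.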

Conclusion

This is a good example of how some programming and Web application development can work well with existing GIS environments, connecting to databases and spatial information using the Internet.

Technologies Used

The entire representation will be available at the GeoGratis site at http://geogratis.cgdi.gc.ca/


Tom Kralidis
December 2000