The Canada Centre for Remote Sensing recently completed and published the Ground Control DataBase (GCDB), a collection of over 37000 geo-corrected high-accuracy images of the Canadian landmass, derived from aerial photography.
The CEONet system was chosen as the web-based infrastructure to provide nation-wide user-level access and search/browse functionality to the dataset. CEONet provides a geo-spatial Internet gateway of data, tools and expertise.
Initial concerns from the Geomatics community included:
The objective of this study is to design access mechanisms for point elevation data, so that users can download the data, or access and view it quickly, in large quantities if desired. XML will also be explored as a means of effective data transfer and dissemination.
Existing Layout / Data Access
The CEONet implementation of GCDB, designed for light online browsing, lacks the bulk download capacity of other GIS data depot-type websites.
The basic layout of the GCDB dataset on CEONet is as follows:
Data Type | Geo-corrected aerial photos with ground control points
Data Format | GeoTIFF image files with corresponding textual geo-positional information; JPEG files for web viewing
Access Nodes | CEONet: viewable over Web API, downloadable by chip ID
The CEONet dataset description of GCDB includes a full FGDC metadata record, with further details on accuracy, precision, origination, distribution, etc. The metadata is based on the FGDC Content Standard for Digital Geospatial Metadata.
Data Storage
The size of the entire GCDB dataset, including imagery and attribute files, is over 10GB. This quantity may be unmanageable for the average user, and loading all the imagery into a GIS environment may be awkward. CEONet helps display the data geospatially, but it is not a true GIS environment. The figure below represents the physical access nodes of GCDB:
As the diagram indicates, there is no forward communication between a user desktop and the various access nodes of GCDB. The user can search, browse, and access data from the nodes, but cannot make requests from their internal GIS environment unless they download the entire dataset, including point info and imagery. Also, the 'single product' request illustrates the lack of efficient viewing and downloading of multiple data products.
What is needed, given the existing infrastructure, is a user-access mechanism that lets users work with the GCDB in a variety of application environments and make use of the data in a variety of ways.
Enter GeoGratis - Integration
GeoGratis is CCRS' web and file transfer site that distributes geo-spatial data in large quantities. Users can download data through HTTP or FTP protocols.
Data on this website is free of charge, which makes it a perfect home for the public domain GCDB. Taking into account the open-ended design of the site, figure 2 illustrates the architecture prototype. This design, utilizing GeoGratis, is more flexible for the end-user of the dataset, providing options to download data in large volumes while still maintaining linkages to the CEONet and GCDB servers.
Acceptance of this design enabled a download area page with the following entries:
All data is available for both the entire dataset and by province, in the form of point locations. The next step is the actual data reformatting, structuring and dissemination.
Setting up GCDB on GeoGratis - Data Layout
The GCDB attribute records were delivered to CEONet as a comma-delimited text file. CEONet takes the textual information and calls a CGI program to display the data online with the text records as input parameters. Since some fields were CEONet-specific, it was decided to filter the input file to make a cleaner output file.
Designing the original output data format as well as the CGI program for CEONet gave a head start on designing output products for GeoGratis.
Here's the layout of the GCDB data from the input file, as a classic C struct:
typedef struct {
    /* field */              /* DESCRIPTION */
    char  *Image_id;         /* filename; unique identifier */
    char  *ac_date;          /* acquisition date of raw photo */
    char  *Image_nts;        /* corresponding NTS mapsheet */
    int    no_lines;         /* number of rows in image */
    int    no_pixels;        /* number of columns in image */
    float  UL_lat;           /* Upper left latitude */
    float  UL_long;          /* Upper left longitude */
    float  UR_lat;           /* Upper right latitude */
    float  UR_long;          /* Upper right longitude */
    float  LR_lat;           /* Lower right latitude */
    float  LR_long;          /* Lower right longitude */
    float  LL_lat;           /* Lower left latitude */
    float  LL_long;          /* Lower left longitude */
    int    GCP_line;         /* GCP row location */
    int    GCP_pixel;        /* GCP column location */
    float  GCP_lat;          /* GCP latitude */
    float  GCP_long;         /* GCP longitude */
    int    GCP_elev;         /* GCP elevation */
    float  GCP_errX;         /* GCP x error */
    float  GCP_errY;         /* GCP y error */
    float  GCP_errZ;         /* GCP z error */
    float  LL_UTM_northing;  /* Lower left northing */
    float  LL_UTM_easting;   /* Lower left easting */
    int    utm_zone;         /* corresponding UTM zone for chip */
} GCDB_chip_rec;
At this point, all the information is cleanly laid out to parse as desired.
Perl to the Rescue
Perl is a beautiful programming language specializing in text processing and reporting. Perl's powerful text parsing functionality is a perfect fit to reformat the GCDB records. The applicable data were extracted to create a filtered output table. At the same time, the fields of each record were combined into a new field holding a URL that matches the CGI program's API requirements for displaying the chip information correctly. This field's usefulness will become more apparent later in the paper.
Here's the Perl script to reformat the data for GeoGratis:
 1 #!/usr/bin/perl -w
 2
 3 use strict;
 4
 5 my $server_string = "http://lambert.ccrs.nrcan.gc.ca";
 6
 7 open(IN,"out.filetext.2") or die "$!\n";
 8
 9 open(OUT,">gcdb_allpts.txt") or die "$!\n";
10
11 print OUT "ImageID,AcqDate,NTSSheet,LatDD,LongDD,Elev,
12 errX,errY,errZ,hotlink\n";
13
14 while(
Lines 7-11 open the input and output files and write out the field headers. Lines 14-16 read the input file line by line and extract the variables. Line 22 builds a URL that corresponds to the GCDB API on the CCRS server. Lines 27-28 print the records to file.
At this point, we have a fully functional text file with all the records needed.
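The reformatting step can also be sketched in JavaScript for readers who do not use Perl. This is a minimal sketch under stated assumptions, not the script used for GeoGratis: it assumes the input columns follow the order of the GCDB_chip_rec struct, and that the hotlink follows the colon-delimited layout of the example viewimage2.cgi URL given later in this paper. The function names (parseChipRecord, buildHotlink, toOutputRow) are hypothetical.

```javascript
// ASSUMPTION: input column order follows the GCDB_chip_rec struct.
const INPUT_FIELDS = [
  "Image_id", "ac_date", "Image_nts", "no_lines", "no_pixels",
  "UL_lat", "UL_long", "UR_lat", "UR_long",
  "LR_lat", "LR_long", "LL_lat", "LL_long",
  "GCP_line", "GCP_pixel", "GCP_lat", "GCP_long", "GCP_elev",
  "GCP_errX", "GCP_errY", "GCP_errZ",
  "LL_UTM_northing", "LL_UTM_easting", "utm_zone"
];

// ASSUMPTION: value order inside the "data" parameter, read off the example URL.
const HOTLINK_FIELDS = [
  "Image_id", "no_lines", "no_pixels", "GCP_line", "GCP_pixel",
  "GCP_elev", "GCP_errX", "GCP_errY", "GCP_errZ",
  "LL_UTM_northing", "LL_UTM_easting", "utm_zone", "ac_date", "Image_nts"
];

const SERVER_STRING = "http://lambert.ccrs.nrcan.gc.ca";
const CGI_PATH = "/cgi-bin/gcdb/viewimage2.cgi";

// Split one comma-delimited input line into a named record (values stay strings).
function parseChipRecord(line) {
  const values = line.trim().split(",");
  if (values.length !== INPUT_FIELDS.length) {
    throw new Error("expected " + INPUT_FIELDS.length + " fields, got " + values.length);
  }
  const rec = {};
  INPUT_FIELDS.forEach((name, i) => { rec[name] = values[i]; });
  return rec;
}

// Build the colon-delimited hotlink URL for one record.
function buildHotlink(rec) {
  const data = HOTLINK_FIELDS.map((name) => rec[name]).join(":");
  return SERVER_STRING + CGI_PATH + "?data=" + data;
}

// Produce one output row matching the header written by the Perl script:
// ImageID,AcqDate,NTSSheet,LatDD,LongDD,Elev,errX,errY,errZ,hotlink
function toOutputRow(rec) {
  return [
    rec.Image_id, rec.ac_date, rec.Image_nts,
    rec.GCP_lat, rec.GCP_long, rec.GCP_elev,
    rec.GCP_errX, rec.GCP_errY, rec.GCP_errZ,
    buildHotlink(rec)
  ].join(",");
}
```

Keeping every value as a string avoids any change to the numeric formatting of the source records (for example, 5612420.0 stays 5612420.0 in the URL).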
ArcView
The file was imported into ArcView (Add Table), then into a View (Theme | Add Event Theme), to map the XY positions. The resulting coverage was then saved as a shapefile for good measure.
To create various data profiles, the newly created shapefile was given base polygon shapefile layers of Canada, parsed by province and territory (including Nunavut). The Theme | Select By Theme function was then used for each region to output regional point shapefiles. Resulting regional views were exported to image files for basic overview imagery.
GCDB Sample ArcView Scripting
The attached project file illustrates a sample GCDB environment based on the concepts of this document, along with some sample Avenue scripts to showcase the data. The GCDB point data for Ontario is included in the project.
The ArcView GIS environment supplies two sample scripts under the user-defined GCDB menu:
GCDB | Point Summary
This script is available through both the view menu and view popup dialogs. It takes the user-selected point data and runs some basic statistical calculations on them to derive frequency and average values. The values returned help give a general overview of the terrain in the selected region of interest.
GCDB | FindGCDBPoints
This dialog, constructed with the ESRI Dialog Designer, prompts the user for a detailed query specific to the GCDB dataset, such as custom elevation and RMS error field values extracted from the data. If a selection is made on the dataset in the view, the dialog will query the selection; otherwise it will query the entire set. The dialog returns the selected data (if any) to the view for hotlinking to the GCDB API.
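The Avenue source of the Point Summary script is not reproduced here, so the following JavaScript is only a sketch of the kind of calculation described: a count, an average, and a frequency table of elevations for a set of selected points. The function name summarizePoints, the elev property, and the 100-unit bin width are all assumptions made for illustration.

```javascript
// Summarize the elevations of a set of selected points.
// ASSUMPTION: each point is an object with a numeric `elev` property.
// binWidth controls the elevation classes used for the frequency table.
function summarizePoints(points, binWidth = 100) {
  if (points.length === 0) {
    return { count: 0, mean: null, min: null, max: null, frequency: {} };
  }
  let sum = 0;
  let min = Infinity;
  let max = -Infinity;
  const frequency = {};
  for (const p of points) {
    sum += p.elev;
    if (p.elev < min) min = p.elev;
    if (p.elev > max) max = p.elev;
    // Key each class by its lower bound, e.g. 100 covers 100 to 199.99...
    const binStart = Math.floor(p.elev / binWidth) * binWidth;
    frequency[binStart] = (frequency[binStart] || 0) + 1;
  }
  return { count: points.length, mean: sum / points.length, min, max, frequency };
}

// Example with three made-up elevations:
const summary = summarizePoints([{ elev: 120 }, { elev: 180 }, { elev: 300 }]);
console.log(summary.count, summary.mean, summary.frequency);
```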
Of course, the data is open to numerous other possibilities, including grid / raster development (interpolation, hillshading, etc.), which are beyond the scope of this paper.
As a result, the GeoGratis website was now complete with full and partial dataset records (textual attributes) of the GCDB, as per figure 2. With the data properly parsed and organized, users visiting the GeoGratis site can now quickly:
As illustrated, users can also import the data into their GIS environment relatively seamlessly. The data format and parsing are clean, concise and documented openly for efficient operations.
All functionality is available for both the entire dataset and by province, in the form of point locations (note: as of this writing, the above page design was submitted as prototype, and may or may not be altered by CCRS' publishing team; functionality will remain consistent).
Although the data points are useful for a variety of applications, they were initially developed with corresponding imagery for orientation. For example, when trying to correlate a data point onto map layers, it is helpful to have reference imagery showing a known point visible to the user. The text used for the data download was derived from the metadata files. The average size of a browse image in GeoTIFF format is 172KB.
172KB/image * 37439 products = 6,439,508KB, or about 6.44GB
The size of the data, including the imagery, proved too large for users to download in bulk in terms of bandwidth and network traffic. However, the dataset is more functional when the imagery is integrated to display the point data. The GIS environment is now configured; what remains is to link the imagery to the point data. ESRI ArcView GIS will be used to illustrate.
Web Integration
The design architecture was built such that the user can make dynamic requests to GCDB from their point shapefiles, to view the image data online, without downloading any image data. The diagram below shows the framework:
The request is sent through an Avenue script within the ArcView environment. The URL written out earlier in the output text files (and subsequent shapefiles) represents a web address to the GCDB CGI program which displays the data in JPEG format to a web browser, along with positional and other ancillary information. All imagery data is stored on the CCRS GCDB data server.
What follows is a detailed look at how this happens:
ArcView Hotlink Setup
Set up and configure the hotlink properties of the point theme to enable the URL field we wrote out earlier. The URL field represents a web address to the GCDB CGI program.
Now, you can see the button is active when the point theme is active. Clicking on a point will call Avenue, which in turn calls the operating system's default web browser and opens the URL specified in the 'hotLink' field of the theme's attribute table.
Here's the Avenue script that triggers the CGI program:
' default window header

myHdr = "HotLink to the Internet"

' Test Operating system type, setup DLL's for appropriate system.
' Works with Win32 only

if (System.GetOSVariant = #SYSTEM_OSVARIANT_MSWNT) then
  dllShell32 = DLL.Make("C:\winnt\system32\shell32.dll".AsFileName)
  dllUser32 = DLL.Make("C:\winnt\system32\user32.dll".AsFileName)
elseif (System.GetOSVariant = #SYSTEM_OSVARIANT_MSW95) then
  dllShell32 = DLL.Make("C:\windows\system\shell32.dll".AsFileName)
  dllUser32 = DLL.Make("C:\windows\system\user32.dll".AsFileName)
end
if ((dllShell32 = nil) or (dllUser32 = nil)) then
  MsgBox.Error("Can't find required DLL's. Check your setup.", myHdr)
  exit
end

' setup Win32API to talk with Avenue

dpGetActiveWindow = DLLProc.Make(dllUser32, "GetActiveWindow", #DLLPROC_TYPE_INT32, {})

dpShellExecute = DLLProc.Make (
  dllShell32, "ShellExecuteA",
  #DLLPROC_TYPE_INT32, {
    #DLLPROC_TYPE_INT32,
    #DLLPROC_TYPE_STR,
    #DLLPROC_TYPE_STR,
    #DLLPROC_TYPE_STR,
    #DLLPROC_TYPE_STR,
    #DLLPROC_TYPE_INT32
  }
)

' Get window handle of ArcView window

activeWin = DLL.GetAVWindowHandle

' Get the URL off clicked area of theme

hotLink = SELF

' Send info to default browser

hotLinkToBrowser = dpShellExecute.Call({activeWin, "Open", hotLink, myHdr, FileName.GetCWD.AsString, 1})

if (hotLinkToBrowser <= 32) then
  MsgBox.Warning("Hotlink failed", myHdr)
end
The GCDB CGI program takes in a strict ruleset of parameters to display the image. An example incoming URL looks like:
http://lambert.ccrs.nrcan.gc.ca/cgi-bin/gcdb/viewimage2.cgi?data=N50d42mW109d60m_1:397:415:198:199:729:0.38:0.58:0.23:5612420.0:568110.0:12:1991-06-10:72K12
...which decodes to:
data=
*Image_id[]
no_lines
no_pixels
GCP_line
GCP_pixel
GCP_lat
GCP_long
GCP_elev
GCP_errX
GCP_errY
GCP_errZ
LL_UTM_northing
LL_UTM_easting
utm_zone
*ac_date[]
*Image_nts[]
The CGI program accepts a parameter named 'data' and uses the values to build the view window to display the chip correctly. Values are colon-separated so that they do not clash with the commas used by ArcView's table text import, and so that the GCDB API can split them.
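Splitting the 'data' parameter can be sketched as follows. This is an illustration in JavaScript, not the CGI program itself. Note one assumption in particular: the example URL above carries 14 colon-separated values, without GCP_lat and GCP_long, so the sketch uses that 14-value order. The names DATA_FIELDS and decodeHotlink are hypothetical.

```javascript
// ASSUMPTION: value order as it appears in the example URL (14 values;
// GCP_lat and GCP_long are not present in that example).
const DATA_FIELDS = [
  "Image_id", "no_lines", "no_pixels", "GCP_line", "GCP_pixel",
  "GCP_elev", "GCP_errX", "GCP_errY", "GCP_errZ",
  "LL_UTM_northing", "LL_UTM_easting", "utm_zone", "ac_date", "Image_nts"
];

// Extract the 'data' query parameter and split it into named string values.
function decodeHotlink(urlString) {
  const data = new URL(urlString).searchParams.get("data");
  if (data === null) {
    throw new Error("missing 'data' parameter");
  }
  const values = data.split(":");
  if (values.length !== DATA_FIELDS.length) {
    throw new Error("expected " + DATA_FIELDS.length + " values, got " + values.length);
  }
  const rec = {};
  DATA_FIELDS.forEach((name, i) => { rec[name] = values[i]; });
  return rec;
}

// Usage with the example URL from the text:
const chip = decodeHotlink(
  "http://lambert.ccrs.nrcan.gc.ca/cgi-bin/gcdb/viewimage2.cgi" +
  "?data=N50d42mW109d60m_1:397:415:198:199:729:0.38:0.58:0.23:" +
  "5612420.0:568110.0:12:1991-06-10:72K12"
);
console.log(chip.Image_id, chip.utm_zone, chip.Image_nts);
// prints "N50d42mW109d60m_1 12 72K12"
```

Because a comma never appears inside the parameter, the whole URL can sit in a single column of a comma-delimited table.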
The program then converts its output mode to text/html to write to the browser as a DHTML document.
DHTML, CSS, JavaScript - Geographic Page Rendering
When the page initially loads, a browser check determines the resizing parameters to display the image and page information with correct dimensions. The image width and height determine the placement of the image and the offsets of the other entities within the page, which are positioned relative to the image.
The input boxes above the image display the geographic position of the area clicked or hovered over by the user mouse. Clicking the image will dynamically place a tiny layer to display the click position, also updating the click input boxes. The dynamic layering is done with JavaScript, building dynamic layers when needed.
The actual position processing is interesting: the GCDB was processed in UTM, but the desired output coordinate format was geographic decimal degrees. Initially, it was planned to compute the four corners of the image, then derive the lat/long from the corners. However, the pixel spacing would become distorted with the shifting of coordinates without reprocessing the image itself.
It was decided to use the lower left UTM coordinates, passed from the preceding Avenue code, to first identify the position in UTM along the image:
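// (x, y) is the pixel (column, row) under the mouse; res is the pixel size.
// Note the naming: tmpx holds the UTM northing and tmpy the UTM easting.
var tmpx = ((LL_UTM_northing + (no_lines - 1) * res)) - (y * res);
var tmpy = LL_UTM_easting + (x * res);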
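The same pixel-to-UTM step can be written as a small self-contained function, which makes the row flip explicit: image rows count down from the top, while northings increase upward from the lower-left corner. The function name pixelToUtm and the resolution of 10 used in the example are assumptions for illustration; the actual pixel size of a chip is not stated here.

```javascript
// Convert an image position (x = column, y = row, both counted from the
// upper-left pixel) to UTM, given the chip's lower-left UTM corner.
// `res` is the ground size of one pixel in the same units as the UTM values.
function pixelToUtm(x, y, chip, res) {
  // Northing of the top row: lower-left northing plus the image height.
  const topNorthing = chip.LL_UTM_northing + (chip.no_lines - 1) * res;
  return {
    northing: topNorthing - y * res,          // rows run downward
    easting: chip.LL_UTM_easting + x * res    // columns run eastward
  };
}

// Example using values from the sample URL and a HYPOTHETICAL res of 10:
const sampleChip = { LL_UTM_northing: 5612420.0, LL_UTM_easting: 568110.0, no_lines: 397 };
console.log(pixelToUtm(0, 396, sampleChip, 10));   // bottom-left pixel: the LL corner itself
console.log(pixelToUtm(199, 198, sampleChip, 10)); // the GCP pixel of the sample chip
```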
The updated coordinates are then passed to a JavaScript function to convert UTM to Geographic. Since the JavaScript code is downloaded to and executed in the client's browser, the server does not incur the cost of this processing, resulting in a fast web application. Here's a look at the function call.
var GeoXY = new Array();
GeoXY = utm2ll(tmpx, tmpy, utm_zone);
...and the routine:
function utm2ll(UTMNorthing, UTMEasting, UTM_Zone)
{
  var PI = 3.1415926536;
  var k0 = 0.9996;
  var a = 6378137.0;
  var eccSquared = 0.00669438;
  var eccPrimeSquared;
  var e1 = (1-Math.sqrt(1-eccSquared))/(1+Math.sqrt(1-eccSquared));
  var N1, T1, C1, R1, D, M;
  var LongOrigin;
  var mu, phi1, phi1Rad;
  var x, y;
  var ZoneNumber;
  var NorthernHemisphere; //1 for northern hemisphere, 0 for southern

  x = UTMEasting - 500000.0; //remove 500,000 meter offset for longitude
  y = UTMNorthing;
  ZoneNumber = UTM_Zone;
  NorthernHemisphere = 1; //point is in northern hemisphere

  LongOrigin = (ZoneNumber - 1)*6 - 180 + 3; //+3 puts origin in middle of zone

  eccPrimeSquared = (eccSquared)/(1.0-eccSquared);

  M = y / k0;
  mu = M/(a*(1.0-eccSquared/4-3*eccSquared*eccSquared/64
       -5*eccSquared*eccSquared*eccSquared/256));

  phi1Rad = mu + (3*e1/2-27*e1*e1*e1/32)*Math.sin(2*mu)
            + (21*e1*e1/16-55*e1*e1*e1*e1/32)*Math.sin(4*mu)
            + (151*e1*e1*e1/96)*Math.sin(6*mu);
  phi1 = phi1Rad * (180 / PI);

  N1 = a/Math.sqrt(1-eccSquared*Math.sin(phi1Rad)*Math.sin(phi1Rad));
  T1 = Math.tan(phi1Rad)*Math.tan(phi1Rad);
  C1 = eccPrimeSquared*Math.cos(phi1Rad)*Math.cos(phi1Rad);
  R1 = a*(1-eccSquared)/Math.pow(1-eccSquared*Math.sin(phi1Rad)*Math.sin(phi1Rad), 1.5);
  D = x/(N1*k0);

  var GeoLat = phi1Rad - (N1*Math.tan(phi1Rad)/R1) *
               (D*D/2-(5+3*T1+10*C1-4*C1*C1-9*eccPrimeSquared)*D*D*D*D/24
               + (61+90*T1+298*C1+45*T1*T1-252*eccPrimeSquared-3*C1*C1)*D*D*D*D*D*D/720);
  GeoLat = GeoLat * (180 / PI);

  var GeoLong = (D-(1+2*T1+C1)*D*D*D/6
                + (5-2*C1+28*T1-3*C1*C1+8*eccPrimeSquared+24*T1*T1)*D*D*D*D*D/120)
                / Math.cos(phi1Rad);
  GeoLong = LongOrigin + (GeoLong) * (180 / PI);

  var Coords = new Array(GeoLat, GeoLong);
  return Coords;
}
The function is a JavaScript implementation of the algorithm in "Map Projections: A Working Manual", by John P. Snyder.
As a result, standard geographic coordinates are always returned to the user window.
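The zone arithmetic behind LongOrigin in utm2ll can be checked in isolation. The sketch below uses the same formula for the central meridian and adds the inverse lookup (longitude to zone). The function names are hypothetical, and the lookup uses the plain 6-degree grid only, ignoring the special zones around Norway and Svalbard, which do not affect Canada.

```javascript
// Central meridian (degrees) of a UTM zone: the same formula as LongOrigin.
function centralMeridian(zone) {
  return (zone - 1) * 6 - 180 + 3;
}

// UTM zone number for a longitude in degrees (-180 <= lon <= 180),
// using the plain 6-degree grid.
function zoneForLongitude(lon) {
  return Math.min(60, Math.floor((lon + 180) / 6) + 1);
}

console.log(centralMeridian(12));      // prints -111 (zone of the sample chip)
console.log(zoneForLongitude(-79.38)); // prints 17 (downtown Toronto, approx.)
console.log(centralMeridian(17));      // prints -81
```

Any longitude returned by utm2ll for a chip should fall within 3 degrees of the central meridian of that chip's zone, which gives a quick sanity check on the conversion.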
The online application also displays the ground control point on which the image (and the shapefile data) is based. The GCP line and pixel are passed in from the Avenue script, and the page then runs the same routine to derive the lat/long of the GCP. This function also displays a dynamic layer with the coordinates, alongside the tiny click-position layer. The user can toggle this point on or off.
The image below shows this as an example in a live situation. For hometown's sake, I chose a GCP in the downtown Toronto area.
This gives the user a more interactive GIS session, with the least amount of data redundancy and storage.
Hands-Off Total Network Data Connectivity
A full approach to data centralization and dynamic data exchange using network and database protocols involves a driver and database setup for the point data. I created a MySQL database of the entire dataset on a central server, then set up access permissions for users globally, with username/password privileges for data selection only. The central host server therefore holds all the data, in lieu of users downloading the data to their desktops. Next, I set up an ODBC connection with a MySQL database driver and added a user DSN, entering the hostname, database, username and password as parameters, so that a connection can be made through the ArcView Project | SQL Connect function. Below is the access mechanism one would encounter with this architecture.
This approach copies absolutely *no* data to the user's desktop; all data is accessed through the central database. With a plain ArcView client, the GCDB is now fully accessible, just as if the user had the data downloaded onto their own hard drive.
So, at this point we have successfully woven together ArcView GIS and the Web. ArcView can successfully display and project the point elevation data. The resulting theme is configured for hotlinkable objects, linking to an Avenue script, which calls the default web browser with the specified URL. Given the URL, a CGI script takes the parameters from the query string and displays and resizes the page according to the dimensions of the imagery. The parameters passed also aid in setting up the layering effects of the output webpage for dynamic geo-position information.
Embracing XML - Revisiting GCDB Metadata
The GCDB metadata record was compiled with input from various project members and project logs/information. Presently, the metadata is posted on the web as plain text. Using eXtensible Markup Language (XML), one can format the data in any way they wish, defining their own document type definitions. XML goes beyond HTML in that users can define their own tags, entities and attributes, creating 'self-describing data'. Geography Markup Language (GML) is built on XML, and defines geographic entities (geo-objects) for flexible data interchange in GIS environments. More information on the GML specification is at www.opengis.org.
Here is an XML formatted document of the GCDB metadata information for the entire dataset. Only required fields were used for brevity. XML is flexible for defining your own documents with a Document Type Definition (DTD). This way, organizations can interchange data openly and clearly, in accordance with the XML and DTD specifications for validity. Again, with the open-ended architecture, the possibilities are endless.
Conclusion
This is a good example of how some programming and Web application development can work well with existing GIS environments, connecting to databases and spatial information using the Internet.
Technologies Used
The entire representation will be available at the GeoGratis site at http://geogratis.cgdi.gc.ca/