Data Import


Scope
This lab covers data import, identifying problems in a dataset and possible solutions to them, and simple table manipulation and analysis.

Software
Vector Analysis: MapInfo
Operating System: Windows NT

Data Inputs
Region of Interest: Cape Breton, Nova Scotia
Projection: geographic
Image Data: capebret.dxf

Method
The flowchart above illustrates the process followed in this lab, in Gane-Sarson notation. An output map displaying nodes and a hardcopy of the MapInfo Browser window (i.e. table view) are attached at the end of this report.

Updating Columns in MapInfo / Simple SQL
MapInfo exposes a Structured Query Language (SQL) interface to its native data tables. Standard SQL commands, such as select and update, are available to the desktop user. A field of type float was added to the capebret table to hold the length of each boundary segment. In standard SQL, the following statement would have the same effect as the point-and-click interface:

SQL> alter table capebret add (boundLength float NOT NULL);

MapInfo adds spatial functionality on top of the standard SQL interface, with functions that derive common spatial measurements from features (such as distance, length, and area). MapInfo builds a spatial cartridge or layer on top of the existing SQL and holds each feature's spatial data and type information in the obj object. For example, to set the boundLength attribute of every feature in the capebret table to the length of its segment in metres, the following command can be issued:

SQL> update capebret set boundLength = ObjectLen(obj, "m");

Here obj is the graphic object attached to each row (i.e. the boundary segment), from which a length can be derived.
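To make concrete what a length in metres means for a table stored in a geographic (longitude / latitude) projection, the following is a minimal Python sketch that sums great-circle distances between consecutive nodes. It assumes a spherical Earth and the haversine formula; it is an illustration only, not a description of how ObjectLen is implemented. The function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (assumed spherical model)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def polyline_length_m(nodes):
    """Sum the great-circle lengths of each consecutive pair of nodes.

    nodes is a list of (lon, lat) tuples in degrees.
    """
    return sum(
        haversine_m(x1, y1, x2, y2)
        for (x1, y1), (x2, y2) in zip(nodes, nodes[1:])
    )


# Hypothetical three-node segment (coordinates are illustrative only).
segment = [(-60.20, 46.10), (-60.10, 46.10), (-60.00, 46.15)]
print(round(polyline_length_m(segment), 1))
```

Because the length is a sum over node pairs, extra nodes along a straight stretch add storage but essentially no length.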

To identify the longest boundary on the map (i.e. the record with the largest boundLength value), there is, as always, more than one way to do it.

Using SQL:

SQL> select max(boundLength) from capebret;

Or, one can view all records and sort:

SQL> select ID, boundLength from capebret order by boundLength desc;

Or, one can view this information without the need to store the boundLength field:

SQL> select max(ObjectLen(obj,"m")) from capebret;
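The difference between the first two queries can be tried outside MapInfo. The sketch below uses Python's built-in sqlite3 module with an in-memory stand-in for the capebret table; the rows are illustrative values, not data from the real table, and LIMIT is SQLite syntax that is not assumed to exist in MapInfo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE capebret (ID INTEGER PRIMARY KEY, boundLength REAL NOT NULL)")
# Illustrative rows only; these are not values from the real capebret table.
conn.executemany(
    "INSERT INTO capebret (ID, boundLength) VALUES (?, ?)",
    [(1, 1200.5), (2, 98000.0), (3, 430.25)],
)

# max() alone returns the length but not which boundary it belongs to.
(longest,) = conn.execute("SELECT max(boundLength) FROM capebret").fetchone()

# Sorting in descending order and keeping one row returns the ID with the length.
longest_row = conn.execute(
    "SELECT ID, boundLength FROM capebret ORDER BY boundLength DESC LIMIT 1"
).fetchone()

print(longest, longest_row)
```

The aggregate form is the shortest to type, but the sorted form is the one that answers "which boundary is longest" directly.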

Using MapInfo:

Choose Select (not Select SQL)
select * from capebret order by boundLength;
navigate to last record;

The ID and length of the longest boundary line in table capebret are:

ID |  boundLength (m)
-----------------------------------
10 |  268,005.144048904

Analysis

When viewing the capebret map, line nodes appear to be regularly spaced. Nodes also appear along stretches of a feature where the line does not change direction.

The likely cause is the data / topology creation process behind the DXF file. It appears that a dense snap or tolerance setting was applied during data creation. As a result, the nodes do not represent the shape of the digitized features but rather a systematic grid spacing, which may or may not produce desirable results.

This can happen when the digitizing application creates points at a set distance interval. In this case, a straight line is stored with multiple nodes when its two endpoints would suffice, resulting in bulky, redundant data that is harder to manage.
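The visual impression of regular spacing can be checked numerically. The following is a rough Python sketch, assuming each feature's nodes are available as a list of (x, y) tuples and treating the coordinates as planar; the function names and any threshold used to call spacing "regular" are hypothetical choices.

```python
import math
import statistics


def node_spacings(nodes):
    """Return the straight-line distance between each pair of consecutive nodes."""
    return [math.dist(a, b) for a, b in zip(nodes, nodes[1:])]


def spacing_variation(nodes):
    """Coefficient of variation (standard deviation / mean) of node spacing.

    Values near 0 mean the nodes are almost evenly spaced.
    """
    spacings = node_spacings(nodes)
    if len(spacings) < 2:
        raise ValueError("need at least three nodes")
    mean = statistics.fmean(spacings)
    if mean == 0:
        raise ValueError("all nodes coincide")
    return statistics.pstdev(spacings) / mean


# Evenly spaced nodes along a straight line versus unevenly placed nodes.
print(spacing_variation([(0, 0), (1, 0), (2, 0), (3, 0)]))
print(spacing_variation([(0, 0), (1, 0), (5, 0), (6, 0)]))
```

A value close to zero across many features would support the idea that a fixed distance setting, rather than the shape of the features, determined where nodes were placed.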

Below are some ways to potentially fix this issue.

Reprocess the source input data
The contributing agency can rerun the vectorizing process used for this data collection, modifying the topology creation distance tolerances while digitizing / processing, to eliminate such errors.

Remove nodes that are headed in the same direction
A small utility can solve this issue by removing every interior node at which the line does not change direction, so that each straight run is stored as just its two endpoints. The pseudo code below aids in conceptualizing:

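foreach feature
  foreach interior node (skip the first and last node)
    # prev is the last node kept, next is the node that follows
    # cross product of (prev -> node) and (node -> next); zero means no turn
    cross = (node->x - prev->x) * (next->y - node->y)
          - (node->y - prev->y) * (next->x - node->x)
    if cross == 0
      delete node
  done
done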
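One possible runnable version of such a utility is sketched below in Python. It assumes a feature's nodes are available as a list of (x, y) tuples and uses a cross-product test for collinearity, with a dot-product check so that a node where the line doubles back on itself is kept. The function name and the tolerance parameter are hypothetical.

```python
def remove_collinear_nodes(nodes, tolerance=0.0):
    """Drop interior nodes that lie on the straight line between their neighbours.

    nodes is a list of (x, y) tuples; the first and last nodes are always kept.
    """
    if len(nodes) < 3:
        return list(nodes)
    kept = [nodes[0]]
    for node, nxt in zip(nodes[1:], nodes[2:]):
        prev = kept[-1]
        ax, ay = node[0] - prev[0], node[1] - prev[1]  # vector prev -> node
        bx, by = nxt[0] - node[0], nxt[1] - node[1]    # vector node -> next
        cross = ax * by - ay * bx  # zero when the three nodes are collinear
        dot = ax * bx + ay * by    # negative when the line reverses direction
        if abs(cross) > tolerance or dot < 0:
            kept.append(node)
    kept.append(nodes[-1])
    return kept


# An L-shaped line with redundant nodes on both straight runs.
line = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
print(remove_collinear_nodes(line))
```

Unlike a comparison of x or y values alone, the cross-product test also handles diagonal runs and keeps the node at a corner.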
  

Using such a utility would transform the line topology as per the figures below.

[before] [after]

This can also be accomplished in MapInfo (when the layer is editable) by selecting the desired node and deleting it from the feature.


