New Open Source Geospatial Book
Check out “Open Source Approaches in Spatial Data Handling” by Hall, Leahy et al. (disclosure: I wrote chapter 1). It’s an interesting read covering many facets and tools of open source geospatial.
I’ve been researching tag clouds over the last few days. I think tag clouds can help geospatial search front ends by giving the user a “weighted list”, getting them to what they want more quickly and efficiently.
The following Python script takes a list of terms as input. Such a list can be derived from many sources: an existing taxonomy, an httpd log file analyzed for commonly used search terms, user votes, and so on. In this (simple) example, we use comma-separated input.
Building a dictionary of terms and counts sets up the anatomy of a tag cloud. From there, you can pass it on for web output (font sizes, colours, etc.). Here we output it as an APML document, which is often used to represent tag clouds. You can then use tools such as cluztr to generate tag clouds with ease.
Considerations:
#!/usr/bin/python

import sys
import fileinput
import datetime
from lxml import etree

# dictionary of term -> count
dTags = {}

tn = datetime.datetime.now().isoformat()

# read comma-separated terms and tally frequencies
for line in fileinput.input(sys.argv[1]):
    aTags = line.strip().split(",")
    for sTag in aTags:
        if sTag not in dTags:  # if term is not in list, add it
            dTags[sTag] = 1
        else:                  # else increment term count
            dTags[sTag] += 1

# output as an APML document
node = etree.Element('APML', nsmap={None: 'http://www.apml.org/apml-0.6'})
node.attrib['version'] = '0.6'

subnode = etree.Element('Body')
subnode.attrib['defaultprofile'] = 'owscat'

subsubnode = etree.Element('Profile')
subsubnode.attrib['name'] = 'Terms'  # APML Profile takes a name attribute

subsubsubnode = etree.Element('ImplicitData')
subsubsubsubnode = etree.Element('Concepts')

# one Concept per term, with the raw count scaled down for a 'value'
for term, count in sorted(dTags.items()):
    termnode = etree.Element('Concept')
    termnode.attrib['key'] = term
    termnode.attrib['value'] = str(count / 10.0)
    termnode.attrib['from'] = 'owscat'
    termnode.attrib['updated'] = str(tn)
    subsubsubsubnode.append(termnode)

subsubsubnode.append(subsubsubsubnode)
subsubnode.append(subsubsubnode)
subnode.append(subsubnode)
node.append(subnode)

print(etree.tostring(node, xml_declaration=True, encoding='UTF-8',
                     pretty_print=True).decode('utf-8'))
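From the same dictionary, a web front end can scale each term’s count into a font size. A minimal sketch (the linear scaling, pixel bounds, and sample counts here are my own assumptions, not part of the script above):

#!/usr/bin/python

# sample term counts; in practice this is the dTags dictionary built above
dTags = {'birds': 12, 'water': 5, 'lakes': 1}

def font_size(count, min_count, max_count, min_px=10, max_px=32):
    # linearly interpolate a font size from a term's count
    if max_count == min_count:
        return min_px
    ratio = (count - min_count) / float(max_count - min_count)
    return int(min_px + ratio * (max_px - min_px))

counts = list(dTags.values())
lo, hi = min(counts), max(counts)
for term, count in sorted(dTags.items()):
    print('<span style="font-size:%dpx">%s</span>'
          % (font_size(count, lo, hi), term))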
I’ve written my share of catalogues, Capabilities parsers, map clients, and context import/export tools, enough to know that having good example WMS instances is paramount for testing functionality and building features. I usually have a handy list of WMS servers which I constantly use when writing code.
Bird Studies Canada provides WMS access to their various bird distribution and abundance data. BSC has made every effort to:
This WMS is always at the top of my testing list, and it’s my first response when people ask to see an example of a well-constructed WMS; it serves catalogue and search demos very well indeed.
Kudos to BSC!
I embarked on a Google search to find information about polygon statistics, and lo and behold, it turned out I had posted this on my own website years ago.
Goodbye memory!
I’m starting to work on contributing SOS and OWS Common support in OWSLib, a groovy and regimented little GIS Python project.
So far so good; some initial implementations are done (hopefully committing soon; I’m writing tests around them). I think this will add value to the project, given that SOS 1.0 has been around long enough to start seeing implementations. And the OWS Common support will act as a baseline for all calling specs/code to leverage.
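To give a flavour of where this is headed, calling code might look something like the following (a sketch only: the module path, class name, and properties are my assumptions about the eventual interface, and the URL is a placeholder):

from owslib.sos import SensorObservationService  # hypothetical import path

# connect to an SOS 1.0 endpoint and parse its Capabilities
sos = SensorObservationService('http://example.com/sos')  # placeholder URL

# service metadata, surfaced through the OWS Common baseline
print(sos.identification.title)

# observation offerings advertised by the service
for name, offering in sos.contents.items():
    print(name, offering.description)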
And it’s been a nice journey in Python for me so far. Another thing I like about this project is the commitment to testing — awesome!
A piece of work I help out with involves the visualization and access of hydrometric monitoring data over the Web. Part of this involves the data management and publishing of voluminous databases of monitoring information.
We use Chameleon for basic visualization and query of the data. Behind the scenes, we run a slew of complex processes (shell scripts via cron) to output the data in a format that can be understood by MapServer (which we use to publish WMS layers). The processes work across many disparate database connections, so outputting the data to shapefiles and accessing them locally helps with performance in the web mapping apps. ogr2ogr is used exclusively and extensively for the access and format translation.
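As an illustration, a typical translation step in those scripts looks something like this (the connection string, table, and filenames are placeholders, not our actual setup):

$ ogr2ogr -f "ESRI Shapefile" stations.shp PG:"host=dbhost dbname=hydro user=webuser" -sql "SELECT * FROM stations"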
Well, today I found out that an effort had begun to write a bunch of scripts to additionally output OGC KML. Thank goodness things didn’t get very far, because the following addition to our processes:
$ ogr2ogr -f KML foo.kml bar.ovf -dsco NameField=NAME -dsco DescriptionField=COMMENT
…worked like a charm, and put a big smile on people’s faces!
So now, OGC KML is also supported for visualization in Earth browsers. Just like that.
Output styles are relatively simple; I’m thinking a -dsco like:
-dsco LayerStyle=LayerName,styles.kml#mystyle
…would point to a style ID within an existing (local or remote) KML style document via XPointer, i.e.:
<styleUrl>somefile.kml#mystyle</styleUrl>
Of course the default behaviour would be in place if this -dsco is not defined. I’ll see what the GDAL KML gurus think about this.
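The referenced document would just need to define a shared style under that ID; a hand-written sketch of what styles.kml might contain:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Style id="mystyle">
      <LineStyle>
        <color>ff0000ff</color>
        <width>2</width>
      </LineStyle>
      <PolyStyle>
        <color>7f00ff00</color>
      </PolyStyle>
    </Style>
  </Document>
</kml>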
At any rate, once again, thank you GDAL for being an uber-utility for day-to-day GIS tasks. Happy faces everywhere!
I needed to do some pre-processing of data which involved transposing column names to values. The condition was that the value for each respective column (a frequency count) had to be at least 1, i.e. the phenomenon had to have been measured at least once.
My input was a csv file, and my goal was an output csv file which would feed into a batch database import process.
ID,DA,NL,PHENOM1,PHENOM2,PHENOM3,PHENOM4
233,99,44,0.00,27.00,12.00,0.00
The other interesting bit was that only a range of columns applied to the condition; the other columns represented ancillary data.
Enter Python:
#!/usr/bin/python

import sys
import csv

# open file and read the header line
fPhenomenon = open("phenomenon.txt", "r")
sHeaders = fPhenomenon.readline().strip().replace('"', '')
aHeaders = sHeaders.split(",")

# feed the rest of the file to csv
csvIn = csv.reader(fPhenomenon)
csvOut = csv.writer(sys.stdout)

for sRowIn in csvIn:
    aRowOut = []
    aPhenomenon = []
    aRowOut.append(sRowIn[0])  # procedure ID
    aRowOut.append(sRowIn[1])  # major drainage area ID
    # scan the phenomenon columns (skips the first three ancillary
    # columns and the last column)
    for nIndexTupleVal, tupleVal in enumerate(sRowIn[3:-1]):
        if float(tupleVal) > 0:  # phenomenon measured at least once
            # add the phenomenon name to the list
            aPhenomenon.append(aHeaders[nIndexTupleVal + 3])
    # add the phenomenon list to the record
    aRowOut.append(",".join(aPhenomenon))
    csvOut.writerow(aRowOut)
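Running this against the sample input above: row 233 has nonzero counts for PHENOM2 and PHENOM3 only, so the output record comes out as:

233,99,"PHENOM2,PHENOM3"

(Note that the [3:-1] slice stops short of the last column; with the sample layout that skips PHENOM4, which happens not to matter here since its count is 0.00.)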
Notes
That’s my hack for the day. Have a good weekend!
UPDATE: ah, csv reader objects have a .next() method, which can be used instead of the shoemaker attempt I made above to regularize / split / store the header list.
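In other words, the header-reading block shrinks to something like this (a sketch; under Python 3 the idiom is the built-in next() rather than the .next() method):

import csv

csvIn = csv.reader(open("phenomenon.txt"))
aHeaders = next(csvIn)  # first row: the column names, unquoted by the csv module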
Inspired by the recent thread on FOSS4G history, I started an effort to document MapServer’s history, from its beginnings in the mid-1990s. Check out the progress we’ve made so far. If there’s anything missing, or in error, feel free to contribute!
Mateusz posted a link to an interesting topic on osgeo-discuss. I think it’s a great idea to document the history of geospatial and open source, and I echo Dave’s comments on how Wikipedia would be an ideal home for documentation and maintenance.
Perhaps the best way to go about this would be for the various projects on Wikipedia (MapServer, GDAL, GeoTools, GRASS, etc.) to document their respective histories, and allow the main Wikipedia OSGeo page to link to them accordingly.
Thoughts? Are there better alternatives to Wikipedia? Should projects document history on their own respective websites, which Wikipedia then references?
I was in a REST/Web 2.0 workshop, and someone asked how REST, given that it runs over HTTP (a stateless protocol), is any faster than other or previous approaches.
I’m not sure that REST itself does anything to speed up HTTP’s request/response mechanics; but using AJAX surely enhances the user experience with perceived responsiveness, since AJAX does things asynchronously.
Or is there more to it?