help wanted: baking a CSW server in Python

Seemingly buried in geospatial metadata and discovery, I’ve been developing my share of CSW/ISO/Dublin Core parsers, generators and clients.  OWSLib is able to interact with CSW servers, handling csw:Record, ISO 19139:2007, as well as DIF.  OWSLib is also the underlying library used by the NextGIS folks in developing a QGIS CSW Client (big thanks to Maxim and Alex for contributing the code back to qgcsw).  I’ve also used genshi to generate ISO 19139:2007 and North American Profile metadata.

Part of this adventure has involved testing these metadata within various OGC CSW server implementations.  What I quickly noticed is that many foss4g CSW servers are written in Java.  Wouldn’t it be great to have a trimmer CSW server in Python, one that can be used easily with an existing Apache install?

Enter pycsw.  I started with the following goals:

  • lightweight and easy to stand up: a standalone catalogue, no GUI or metadata editing front end, designed for the use case of exposing ready-to-go metadata (files or in an existing DB) through a CSW interface, with as little heavy lifting as possible.  Plug and play
  • extensible: the ability to add metadata formats and map them to a common information model and core / additional queryables
  • OGC compliant: against the CITE test assertions
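
To make the "extensible" goal a bit more concrete, here is a rough sketch of mapping two metadata formats onto a shared set of queryables.  This is not pycsw's actual code: the QUERYABLES table, the extract_queryables function and the sample records are all made up for illustration, and it uses the standard library's ElementTree so it runs without extra installs.  Only the namespace URIs and element paths come from the CSW 2.0.2, Dublin Core and ISO 19139 schemas.

```python
import xml.etree.ElementTree as ET

# Published namespace URIs; the prefixes are just local shorthand.
NAMESPACES = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical table: root element tag -> {common queryable name: path}.
# Supporting another metadata format would mean adding another entry here.
QUERYABLES = {
    "{http://www.opengis.net/cat/csw/2.0.2}Record": {
        "identifier": "dc:identifier",
        "title": "dc:title",
    },
    "{http://www.isotc211.org/2005/gmd}MD_Metadata": {
        "identifier": "gmd:fileIdentifier/gco:CharacterString",
        "title": ("gmd:identificationInfo/gmd:MD_DataIdentification/"
                  "gmd:citation/gmd:CI_Citation/gmd:title/"
                  "gco:CharacterString"),
    },
}


def extract_queryables(xml_text):
    """Return {queryable name: text or None} for a supported record."""
    root = ET.fromstring(xml_text)
    paths = QUERYABLES.get(root.tag)
    if paths is None:
        raise ValueError("unsupported metadata format: %s" % root.tag)
    return {
        name: root.findtext(path, default=None, namespaces=NAMESPACES)
        for name, path in paths.items()
    }


# Made-up sample records, one per format.
DC_RECORD = """<csw:Record xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>rec-1</dc:identifier>
  <dc:title>Sample record</dc:title>
</csw:Record>"""

ISO_RECORD = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>rec-2</gco:CharacterString>
  </gmd:fileIdentifier>
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:citation><gmd:CI_Citation>
      <gmd:title>
        <gco:CharacterString>Sample ISO record</gco:CharacterString>
      </gmd:title>
    </gmd:CI_Citation></gmd:citation>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""

if __name__ == "__main__":
    print(extract_queryables(DC_RECORD))
    print(extract_queryables(ISO_RECORD))
```

Both records come out as the same flat dictionary shape, which is the sort of thing a common information model could index and query regardless of the source format.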

Technology bits (thanks to Sean for the initial inspiration):

  • Python: code is written as CGI for now.  Ideas for WSGI, etc. are welcome
  • Database:  SQLite3 is used as the underlying database.  No reason why things couldn’t be abstracted enough to handle other DBs
  • DB API: SQLAlchemy makes it easy to bind database models to Python classes, and especially easy to do transparent queries
  • XML: lxml is used to parse requests, traverse XPath nodes and marshal responses.  lxml’s Schematron support should make validation easy for Harvest/Transaction operations
  • Spatial predicates: I originally supported ogc:BBOX, which is easy enough to code by hand.  Shapely gives access to the full suite of predicates, and will be the way forward
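
As an illustration of why ogc:BBOX is easy enough to code by hand, here is a minimal sketch of a bounding box test in plain Python.  The BBox tuple, parse_bbox and bbox_matches are names invented for this example and are not taken from pycsw.  It assumes the boxes are given as "minx,miny,maxx,maxy" in the same coordinate system, treats BBOX as "not disjoint" (my reading of the Filter Encoding spec), and ignores boxes that cross the antimeridian.

```python
from collections import namedtuple

# Hypothetical container for an axis-aligned bounding box.
BBox = namedtuple("BBox", "minx miny maxx maxy")


def parse_bbox(text):
    """Parse 'minx,miny,maxx,maxy' into a BBox, validating corner order."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated numbers")
    box = BBox(*parts)
    if box.minx > box.maxx or box.miny > box.maxy:
        raise ValueError("min corner must not exceed max corner")
    return box


def bbox_matches(query, record):
    """Return True if the two boxes share at least one point.

    Two boxes are disjoint only when one lies entirely to the left of,
    right of, below or above the other; anything else is a match.
    """
    return not (
        record.maxx < query.minx
        or record.minx > query.maxx
        or record.maxy < query.miny
        or record.miny > query.maxy
    )


if __name__ == "__main__":
    query = parse_bbox("-10,-10,10,10")
    print(bbox_matches(query, parse_bbox("5,5,20,20")))    # overlapping
    print(bbox_matches(query, parse_bbox("30,30,40,40")))  # far away
```

The other predicates (touches, crosses, within and so on) need real geometry handling rather than four comparisons, which is where Shapely comes in.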

Progress: I’m using the OGC CITE tests here as the benchmark.  So far pycsw passes 91 of 103 assertions.

Todo:

  • fully pass the CITE assertions
  • support of ISO Application Profile
  • firm up core information model to allow easier extensibility
  • fix spatial queries to fully use Shapely
  • harmonize GetRecords and GetRecordById response handlers (for writing out csw:Record)
  • documentation: install / setup / configuration / testing

pycsw is up on SourceForge and is open source.  It would be great to have more hands here.  If you are interested, and enjoy contributing to foss4g, don’t hesitate to get in touch!

Modified: 24 August 2023 10:38:55 EST