Thursday 8 October 2015

MapGuide tidbits: Large directories of SHP files

Did you know that the SHP FDO provider included with MapGuide Open Source 3.0 now supports the family of FDO APIs introduced with RFC 23?

What this means is that if you have a Feature Source that connects to a directory of SHP files, you should see dramatic performance improvements when walking the schemas and class definitions of that feature source, especially if the directory contains lots of SHP files. How does this work?

  • GetSchemaNames only has to return the schema name, which will always be "Default". No disk I/O is necessary.
  • GetClassNames only has to return a class name list based on the file listing of the connected directory, as each Feature Class name is basically the name of the SHP file (without the .shp extension). Previously, this would build the FDO logical schema from every SHP file in that directory and then return the class names from that schema. Building that logical schema means connecting to and inspecting every SHP file. For a directory with lots of SHP files, that takes a lot of time.
  • DescribeSchema with a class name hint means that instead of constructing the FDO logical schema from every SHP file in that directory, we only do it for the SHP files indicated by the class name list (once again, because SHP class name = SHP file name).
  • GetClassDefinition does not require a full DescribeSchema first, as the parameters passed to it now contain enough information to know which physical SHP file to inspect and create the necessary class definition from it.
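
To make the idea concrete, here is a minimal Python sketch of the "class name = file name" shortcut described above. It is not the SHP provider's actual code (the provider is written in C++), and the function names list_class_names and shp_files_for are hypothetical, made up purely for illustration. It only shows that a class listing can come from a directory listing alone, and that a class name hint narrows down which files would need to be opened.

```python
from pathlib import Path
import tempfile


def list_class_names(directory):
    """Hypothetical stand-in for a GetClassNames-style listing.

    Only the directory listing is read; no SHP file is opened.
    """
    return sorted(
        p.stem
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".shp"
    )


def shp_files_for(directory, class_names):
    """Hypothetical stand-in for a DescribeSchema-style class name hint.

    Returns only the SHP files whose base name matches a requested class,
    i.e. the only files that would need to be inspected.
    """
    wanted = set(class_names)
    return sorted(
        p
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".shp" and p.stem in wanted
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Empty placeholder files are enough, since nothing is ever opened.
        for name in ["Roads.shp", "Roads.dbf", "Parcels.shp", "notes.txt"]:
            (Path(d) / name).touch()
        print(list_class_names(d))
        print([p.name for p in shp_files_for(d, ["Roads"])])
```

Note that the placeholder files in the demo are empty and the listing still works, which is the whole point: nothing has to be connected to or parsed just to answer "what classes are in here?".
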

In practical terms, the performance gains will be most apparent on:
  • Editing SHP-based Layers in Maestro because the schema/class listing operations are now super cheap and fast.
  • Any code in your application or MapGuide Server that needs to access a single class definition from a feature source. In the past, without support for these APIs, a full schema walk of the SHP feature source was required first in order to pluck out the necessary class definition. This was most noticeable performance-wise the first time a given class was requested. Subsequent requests would be near instant due to caching, but for directories of many SHP files that first request could take a while.
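
The difference in first-request cost can be illustrated with a toy Python model. This is an assumption-laden sketch, not how MapGuide or the SHP provider is implemented: the class ShpDirectoryModel, its methods, and the simple dictionary cache are all hypothetical. It just counts how many SHP files would have to be inspected to serve a single class definition under each approach.

```python
class ShpDirectoryModel:
    """Toy model that counts simulated SHP file inspections."""

    def __init__(self, class_names):
        self.class_names = list(class_names)
        self.inspections = 0  # number of simulated file inspections so far
        self._cache = {}

    def _inspect(self, name):
        # Stand-in for opening one SHP file and building its class definition.
        self.inspections += 1
        return {"class": name}

    def get_class_full_walk(self, name):
        """Old behaviour (modelled): build everything, then pluck one class."""
        if not self._cache:
            self._cache = {n: self._inspect(n) for n in self.class_names}
        return self._cache[name]

    def get_class_targeted(self, name):
        """New behaviour (modelled): inspect only the requested class."""
        if name not in self._cache:
            self._cache[name] = self._inspect(name)
        return self._cache[name]


if __name__ == "__main__":
    names = [f"Layer{i}" for i in range(500)]

    old = ShpDirectoryModel(names)
    old.get_class_full_walk("Layer42")
    print("full walk, first request:", old.inspections)   # 500
    old.get_class_full_walk("Layer42")
    print("full walk, second request:", old.inspections)  # still 500 (cached)

    new = ShpDirectoryModel(names)
    new.get_class_targeted("Layer42")
    print("targeted, first request:", new.inspections)    # 1
```

In both cases the second request is served from the cache, but only the full walk pays for every file in the directory up front.
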