RFC5 - Enhanced data limit handling

  • date: 2024-11-19
  • author: Tom Kralidis
  • contact: tomkralidis@gmail.com
  • status: draft
  • modified: 2025-01-14

Overview

This RFC describes enhanced limit handling implementation in support of better definition and control on data query limiting.

Currently, pygeoapi defines a server.limit configuration directive which corresponds to the limit query parameter as part of the OGC API - Features and OGC API - Records .../items endpoint. The following scenarios are missing from this setting:

  • the ability for the server to provide a default limit when the user does not explicitly specify a limit
  • the ability for the server to provide a maximum limit when the user specifies a limit that is deemed too large by the server to process
  • given this setting is defined at the server level, it may not be suitable for all datasets in a given configuration
  • limits currently only apply to vector data

Proposed solution

pygeoapi will define a limits configuration parameter that will allow a user to define default and maximum limits for multiple data types. This parameter will be defined at the server level (server.limits) with the ability to override at resource level (resources[*].limits). An example of this setting is shown below:

limits:
    default_items: 20  # applies to vector data
    max_items: 500  # applies to vector data
    max_distance_x: 123  # applies to all datasets
    max_distance_y: 456 # applies to all datasets
    max_distance_units: m  # as per UCUM https://ucum.org/ucum#section-Tables-of-Terminal-Symbols
    on_exceed: error  # one of error, throttle

The limits setting will be applied as follows:

  • pygeoapi administrator is able to use at both the server and resources levels, with resources limits overriding server wide limits settings
  • no limit specified by client: use limits.default_items to set the result set size
  • limit specified by client: calculate the minimum of the query parameter and limits.max_items to set the result set size
  • bbox or spatial dimensions: compare distance of request to maximum definition allowed in limits.max_distance_x and limits.max_distance_y

Implementation

  • an evaluate_limit function will be added to the pygeoapi module for use by pygeoapi.api accordingly
  • if limits.on_exceed is error, then pygeoapi will throw HTTP 413

Backwards Compatibility Issues

The limits setting fully replaces server.limit and represents a compatibility break. The new functionality will not be backported to other branches.

Testing

Tests will be added to ensure the expected functionality.

Documentation

Documentation will be added to the configuration description page.

Issue and Pull Request tracking

Issue: https://github.com/geopython/pygeoapi/issues/1856

Pull Request: https://github.com/geopython/pygeoapi/pull/1892

Voting History

Adopted on 2025-01-14 with +1 from jorgejesus, tomkralidis, kalxas