How to Validate GeoPackage OGC Compliance

To validate GeoPackage OGC compliance, run schema-level checks against the mandatory gpkg_* metadata tables using Python’s sqlite3 module, then verify geometry encoding and spatial reference integrity via GDAL’s ogrinfo or the official OGC reference validator. Compliance requires adherence to the OGC GeoPackage Encoding Standard (OGC 12-128), including correct extension registration, valid gpkg_contents entries, and geometry blobs stored in the GeoPackageBinary format (a short header followed by WKB). For automated pipelines, wrap these checks in a Python validation script that returns structured pass/fail reports before files reach field devices or sync endpoints.

Why Compliance Validation Matters for Offline-First Stacks

Field GIS technicians and mobile developers often treat GeoPackage as a drop-in replacement for shapefiles, but the specification layered on its SQLite foundation imposes strict structural rules. When a file violates the Core Architecture & Format Standards for Spatial SQLite, mobile SQLite drivers often fail silently. This can lead to corrupted spatial indexes, dropped geometry columns, or sync failures in QField, ArcGIS Field Maps, and custom React Native/Flutter clients. Validating at ingestion prevents cascading data loss in offline-first workflows and helps ensure cross-platform interoperability.

Primary Validation Method: Python + GDAL/OGC Toolchain

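The most reliable approach combines native SQLite schema inspection with GDAL’s own reading of the file. The script below checks for the mandatory OGC tables, validates gpkg_contents data types, verifies spatial reference bindings, and uses ogrinfo to confirm that GDAL can open and summarize every layer. Because ogrinfo runs in summary mode (-so), it reports layer-level errors but does not decode each geometry blob individually.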

python
import sqlite3
import subprocess
import sys
import json
from pathlib import Path

def validate_geopackage(gpkg_path: str) -> dict:
    """Validate OGC GeoPackage compliance using SQLite schema checks + GDAL."""
    path = Path(gpkg_path)
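    if not path.exists():
        # Return the same report shape as every other code path so callers can always read "compliant".
        return {"file": str(path), "compliant": False,
                "checks": [{"check": "file_exists", "status": "FAIL", "error": "File not found"}]}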

    report = {"file": str(path), "compliant": True, "checks": []}

    try:
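        # Open read-only via a file: URI so validation can never modify the file under test.
        with sqlite3.connect(path.resolve().as_uri() + "?mode=ro", uri=True) as conn: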
            conn.row_factory = sqlite3.Row
            cursor = conn.cursor()

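            # 1. Mandatory OGC Table Presence
            # gpkg_spatial_ref_sys and gpkg_contents are required in every GeoPackage.
            # gpkg_geometry_columns becomes mandatory once a 'features' row is registered;
            # gpkg_extensions is optional and only needed when extensions are in use.
            mandatory_tables = {"gpkg_spatial_ref_sys", "gpkg_contents"}
            cursor.execute("SELECT name FROM sqlite_master WHERE type IN ('table', 'view');")
            existing = {row["name"] for row in cursor.fetchall()}
            missing = mandatory_tables - existing
            if "gpkg_contents" in existing and "gpkg_geometry_columns" not in existing:
                cursor.execute("SELECT COUNT(*) FROM gpkg_contents WHERE data_type = 'features';")
                if cursor.fetchone()[0] > 0:
                    missing.add("gpkg_geometry_columns")
            if missing:
                report["compliant"] = False
                report["checks"].append({"check": "mandatory_tables", "status": "FAIL", "missing": sorted(missing)})
            else:
                report["checks"].append({"check": "mandatory_tables", "status": "PASS"})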

            # 2. gpkg_contents Validation
            cursor.execute("SELECT table_name, data_type, srs_id FROM gpkg_contents;")
            contents = cursor.fetchall()
            if not contents:
                report["compliant"] = False
                report["checks"].append({"check": "gpkg_contents", "status": "FAIL", "message": "Empty contents table"})
            else:
                valid_types = {"features", "attributes", "tiles"}
                invalid = [r["data_type"] for r in contents if r["data_type"] not in valid_types]
                if invalid:
                    report["compliant"] = False
                    report["checks"].append({"check": "gpkg_contents", "status": "FAIL", "invalid_types": invalid})
                else:
                    report["checks"].append({"check": "gpkg_contents", "status": "PASS"})

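            # 3. Spatial Reference Integrity
            cursor.execute("SELECT srs_id FROM gpkg_spatial_ref_sys;")
            srs_ids = {row["srs_id"] for row in cursor.fetchall()}
            # The spec requires rows for WGS 84 (4326) and both undefined systems (-1 and 0).
            missing_srs = sorted({-1, 0, 4326} - srs_ids)
            # Every srs_id referenced from gpkg_contents must be defined in gpkg_spatial_ref_sys.
            unbound = sorted({r["srs_id"] for r in contents if r["srs_id"] is not None} - srs_ids)
            if missing_srs or unbound:
                report["compliant"] = False
                report["checks"].append({"check": "spatial_ref_sys", "status": "FAIL",
                                         "missing_required": missing_srs, "undefined_in_contents": unbound})
            else:
                report["checks"].append({"check": "spatial_ref_sys", "status": "PASS", "count": len(srs_ids)})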

    except sqlite3.Error as e:
        report["compliant"] = False
        report["checks"].append({"check": "sqlite_connection", "status": "FAIL", "error": str(e)})
        return report

    # 4. GDAL Geometry & WKB Validation (ogrinfo)
    try:
        result = subprocess.run(
            ["ogrinfo", "-so", "-al", str(path)],
            capture_output=True, text=True, check=True, timeout=30
        )
        if "ERROR" in result.stderr.upper():
            report["compliant"] = False
            report["checks"].append({"check": "gdal_geometry", "status": "FAIL", "details": result.stderr.strip()})
        else:
            report["checks"].append({"check": "gdal_geometry", "status": "PASS"})
    except FileNotFoundError:
        report["checks"].append({"check": "gdal_geometry", "status": "SKIP", "message": "GDAL/ogrinfo not installed"})
    except subprocess.TimeoutExpired:
        # A FAIL entry must also flip the top-level flag that CI gates read.
        report["compliant"] = False
        report["checks"].append({"check": "gdal_geometry", "status": "FAIL", "message": "Validation timed out"})
    except subprocess.CalledProcessError as e:
        report["compliant"] = False
        report["checks"].append({"check": "gdal_geometry", "status": "FAIL", "details": (e.stderr or "").strip()})

    return report

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python validate_gpkg.py <path_to_gpkg>")
        sys.exit(1)
    print(json.dumps(validate_geopackage(sys.argv[1]), indent=2))
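
The script above does not inspect the SQLite file header, which the specification also constrains: GeoPackage 1.2 and later set the application_id pragma to 0x47504B47 (ASCII "GPKG") and record the version in user_version, while 1.0 and 1.1 files use "GP10" and "GP11". The following is a minimal sketch of that extra check. The helper name check_application_id and its return shape are assumptions chosen to mirror the report entries above; they are not part of the original script.

```python
import sqlite3
import struct

# ASCII tags the GeoPackage spec stores in the SQLite application_id header field.
KNOWN_APPLICATION_IDS = {b"GPKG", b"GP10", b"GP11"}

def check_application_id(gpkg_path: str) -> dict:
    """Hypothetical helper: report whether the SQLite header marks a GeoPackage."""
    conn = sqlite3.connect(gpkg_path)
    try:
        app_id = conn.execute("PRAGMA application_id;").fetchone()[0]
        user_version = conn.execute("PRAGMA user_version;").fetchone()[0]
    finally:
        conn.close()
    # application_id is a signed 32-bit integer; repack it to recover the 4-byte tag.
    tag = struct.pack(">i", app_id)
    status = "PASS" if tag in KNOWN_APPLICATION_IDS else "FAIL"
    return {"check": "application_id", "status": status,
            "tag": tag.decode("ascii", errors="replace"), "user_version": user_version}
```

The returned dict could be appended to report["checks"] alongside the other entries, with a FAIL also setting report["compliant"] to False.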

Step-by-Step Execution & Pipeline Integration

  1. Install Dependencies: Ensure Python 3.8+ and GDAL are available. The sqlite3 module ships with the Python standard library, so only GDAL needs a separate install. On Ubuntu: sudo apt install gdal-bin. On macOS: brew install gdal.
  2. Run Locally: Execute python validate_gpkg.py data/offline_survey.gpkg. The script outputs a structured JSON report.
  3. CI/CD Gate: Integrate the script into GitHub Actions, GitLab CI, or Airflow DAGs. Parse the compliant boolean to block merges or quarantine non-compliant files before they sync to edge devices.
  4. Batch Processing: Wrap the function in a concurrent.futures.ThreadPoolExecutor for directory-level validation. GeoPackage files are single-file SQLite databases, making them safe for concurrent read-only checks.
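
Steps 3 and 4 can be combined into a single entry point. The sketch below is illustrative rather than part of the original script: validate_directory and exit_code are hypothetical names, and the validator argument is assumed to be a callable such as validate_geopackage that takes a path and returns a dict with a compliant key.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List

def validate_directory(directory: str, validator: Callable[[str], dict],
                       max_workers: int = 4) -> List[dict]:
    """Run `validator` over every *.gpkg file in `directory` using a thread pool."""
    paths = sorted(str(p) for p in Path(directory).glob("*.gpkg"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so reports line up with the sorted paths.
        return list(pool.map(validator, paths))

def exit_code(reports: List[dict]) -> int:
    """Return 0 only if every report is compliant, so a CI job can fail on 1."""
    return 0 if all(r.get("compliant") for r in reports) else 1
```

In a CI job, something like sys.exit(exit_code(validate_directory("data", validate_geopackage))) would fail the step whenever any file is non-compliant. An empty directory yields exit code 0 here, which may or may not be the behavior a given pipeline wants.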

Common Compliance Failures & Fixes

| Failure Symptom | Root Cause | Resolution |
| --- | --- | --- |
| missing: ['gpkg_geometry_columns'] | File created via a non-OGC SQLite dump or legacy tooling | Re-export with ogr2ogr -f GPKG so that GDAL regenerates the metadata tables |
| invalid_types: ['unknown'] | Custom table registered without a proper data_type value | Update gpkg_contents.data_type to features, attributes, or tiles |
| gdal_geometry: FAIL | Corrupted geometry blobs or mismatched SRS IDs | Run ogrinfo data/offline_survey.gpkg -sql "SELECT * FROM gpkg_geometry_columns" to isolate the layer, then rebuild it with ogr2ogr (for example, -dim XY to force 2D or -t_srs to reproject) |
| Silent mobile crashes | Missing gpkg_spatial_ref_sys entries for custom EPSG codes | Insert the missing SRS definitions before deployment. Mobile SQLite drivers require an explicit gpkg_spatial_ref_sys row for every srs_id that a layer references. |
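
To narrow down the corrupted-blob and mismatched-SRS cases listed above, it can help to decode the GeoPackageBinary header that precedes the WKB in each geometry value. The following is a minimal sketch under stated assumptions: parse_gpkg_header is a hypothetical helper, it reads only the header (magic bytes, version, flags, srs_id, envelope size), and it does not validate the WKB payload itself.

```python
import struct

# Envelope size in bytes for each envelope indicator code (codes 5-7 are invalid).
ENVELOPE_BYTES = {0: 0, 1: 32, 2: 48, 3: 48, 4: 64}

def parse_gpkg_header(blob: bytes) -> dict:
    """Hypothetical helper: decode the GeoPackageBinary header of one geometry blob."""
    if len(blob) < 8 or blob[:2] != b"GP":
        raise ValueError("missing 'GP' magic bytes")
    version, flags = blob[2], blob[3]
    little_endian = bool(flags & 0x01)       # bit 0: byte order of header values
    envelope_code = (flags >> 1) & 0x07      # bits 1-3: envelope contents indicator
    is_empty = bool(flags & 0x10)            # bit 4: empty-geometry flag
    if envelope_code not in ENVELOPE_BYTES:
        raise ValueError(f"invalid envelope indicator {envelope_code}")
    (srs_id,) = struct.unpack("<i" if little_endian else ">i", blob[4:8])
    wkb_offset = 8 + ENVELOPE_BYTES[envelope_code]
    if len(blob) < wkb_offset:
        raise ValueError("blob shorter than its declared envelope")
    return {"version": version, "srs_id": srs_id, "empty": is_empty, "wkb_offset": wkb_offset}
```

The decoded srs_id can then be compared with the srs_id recorded for that layer in gpkg_geometry_columns, and the bytes from wkb_offset onward handed to a WKB parser of your choice.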

Advanced Validation: OGC Reference Validator

For regulatory or enterprise deployments, supplement the Python/GDAL checks with the official OGC GeoPackage Validator. This Java-based tool executes the full conformance test suite, including edge-case extension compliance, tile matrix validation, and strict metadata schema enforcement. It can be run from the command line, for example java -jar geopackage-validator.jar -f data/offline_survey.gpkg; treat that JAR name and flag as illustrative, and confirm the exact invocation against the documentation for the release you download.

Pairing automated Python checks with the OGC validator covers most production failure modes: the script catches structural problems cheaply at ingestion, while the validator exercises the full conformance suite. For deeper architectural context, review the GeoPackage Specification Deep Dive to understand how spatial indexes, extension chaining, and transaction isolation interact at the SQLite level. Always cross-reference your validation logic against the official OGC GeoPackage Standard and GDAL GeoPackage Driver Documentation to stay aligned with spec revisions and driver updates.