Agent skill

design-postgis-tables

Comprehensive PostGIS spatial table design reference covering geometry types, coordinate systems, spatial indexing, and performance patterns for location-based applications

Stars 1,661
Forks 82

Install this agent skill to your Project

npx add-skill https://github.com/timescale/pg-aiguide/tree/main/skills/design-postgis-tables

SKILL.md

PostGIS Spatial Table Design

Before You Start (5 Questions)

  1. What is the geographic scope (single city/region vs global)?
  2. What are your primary query patterns (within-radius, bbox, intersects, nearest-neighbor)?
  3. What units do you need for distance/area (meters vs CRS units), and how accurate must they be?
  4. What is the expected scale (rows, write rate), and is the data mostly append-only?
  5. Do you need 3D (Z) or measures (M), or is 2D enough?

SQL injection note: When turning these patterns into application code, use parameterized queries for user-provided values (WKT/WKB, coordinates, IDs, radii). Avoid string-concatenating untrusted input into SQL; for dynamic identifiers, use safe identifier quoting/whitelisting.

Core Rules

  • Always use PostGIS geometry/geography types instead of PostgreSQL's built-in geometric types (POINT, LINE, POLYGON, CIRCLE). PostGIS types provide true spatial capabilities.
  • Choose between GEOMETRY and GEOGRAPHY based on your use case: GEOMETRY for projected/local data with Cartesian math; GEOGRAPHY for global data requiring accurate spherical calculations.
  • Always specify SRID (Spatial Reference Identifier) when creating geometry columns. Use 4326 (WGS84) for GPS/global data, appropriate local projections for regional data.
  • Create spatial indexes on all geometry/geography columns using GiST (default). Consider BRIN only for very large GEOMETRY tables where rows are naturally ordered on disk and you can tolerate coarser filtering.
  • Use constraint-based type enforcement with GEOMETRY(type, SRID) syntax to ensure data integrity.

Geometry vs Geography

When to Use GEOMETRY

  • Local/regional data within a single coordinate system
  • Projected coordinates (meters, feet) for accurate area/distance calculations
  • Complex spatial operations (buffering, unions, intersections)
  • Performance-critical queries (Cartesian math is faster)
  • Data already in a projected CRS (UTM, State Plane, etc.)
sql
-- Regional data with projected coordinates (UTM Zone 10N for California)
CREATE TABLE local_parcels (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    parcel_number TEXT NOT NULL,
    boundary GEOMETRY(POLYGON, 26910),  -- UTM Zone 10N (meters)
    area_sqm DOUBLE PRECISION GENERATED ALWAYS AS (ST_Area(boundary)) STORED
);

When to Use GEOGRAPHY

  • Global data spanning multiple continents/hemispheres
  • GPS coordinates (latitude/longitude in decimal degrees)
  • Accurate distance calculations on Earth's surface (great circle)
  • Simple spatial operations (distance, containment)
  • Data from GPS devices, geocoding services, or web maps
sql
-- Global data with geodetic calculations
CREATE TABLE global_offices (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    city TEXT NOT NULL,
    location GEOGRAPHY(POINT, 4326)  -- WGS84 (lat/lon)
);

-- Distance in meters (accurate spherical calculation)
SELECT
    a.name AS office_a,
    b.name AS office_b,
    ST_Distance(a.location, b.location) / 1000 AS distance_km
FROM global_offices a
CROSS JOIN global_offices b
WHERE a.id < b.id;

Comparison Table

Aspect GEOMETRY GEOGRAPHY
Coordinate system Any SRID (projected or geodetic) WGS84 (SRID 4326) only
Distance units CRS units (degrees, meters, feet) Meters (always)
Distance accuracy Depends on projection True spheroidal distance
Area accuracy Accurate in projected CRS Accurate on sphere
Function support Full (300+ functions) Limited (~40 functions)
Performance Faster (Cartesian math) Slower (spherical math)
Index type GiST, BRIN, SP-GiST GiST only
Best for Regional/local data, complex analysis Global data, GPS tracking

Geometry Types

Point Types

sql
-- Single location (stores, sensors, events)
location GEOMETRY(POINT, 4326)

-- Multiple discrete locations (multi-branch business)
locations GEOMETRY(MULTIPOINT, 4326)

-- 3D point with elevation
location_3d GEOMETRY(POINTZ, 4326)

-- Point with measure value (linear referencing)
location_m GEOMETRY(POINTM, 4326)

Use POINT for: Store locations, sensor positions, event coordinates, addresses, POIs Use MULTIPOINT for: Multiple related locations stored as single feature

Line Types

sql
-- Single path (road segment, river, route)
path GEOMETRY(LINESTRING, 4326)

-- Multiple paths (road network, transit lines)
network GEOMETRY(MULTILINESTRING, 4326)

-- 3D line with elevation profile
trail_3d GEOMETRY(LINESTRINGZ, 4326)

Use LINESTRING for: Roads, rivers, pipelines, GPS tracks, routes Use MULTILINESTRING for: Disconnected road segments, river systems

Polygon Types

sql
-- Single area (parcel, building footprint, zone)
boundary GEOMETRY(POLYGON, 4326)

-- Multiple areas (archipelago, fragmented habitat)
territories GEOMETRY(MULTIPOLYGON, 4326)

-- 3D polygon (building with height)
footprint_3d GEOMETRY(POLYGONZ, 4326)

Use POLYGON for: Property boundaries, administrative areas, service zones Use MULTIPOLYGON for: Countries with islands, fragmented regions

Generic Types

sql
-- Any geometry type (flexible schema)
geom GEOMETRY(GEOMETRY, 4326)

-- Collection of mixed types
features GEOMETRY(GEOMETRYCOLLECTION, 4326)

Use GEOMETRY for: Flexible schemas accepting multiple types Avoid GEOMETRYCOLLECTION: Prefer homogeneous types for better indexing

Coordinate Systems (SRID)

Common SRIDs

SRID Name Use Case Units
4326 WGS84 GPS, global data, web maps Degrees
3857 Web Mercator Web map tiles (display only) Meters
26910-26919 UTM Zones (US) Regional analysis Meters
32601-32660 UTM Zones (North) Regional analysis Meters
32701-32760 UTM Zones (South) Regional analysis Meters

SRID Best Practices

  • Store in WGS84 (4326) for interoperability and GPS data
  • Transform to projected CRS for accurate measurements
  • Never mix SRIDs in spatial operations without explicit transformation
  • Use appropriate local CRS for area/distance calculations requiring high precision
sql
-- Store in WGS84, calculate in UTM
CREATE TABLE survey_points (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    location GEOMETRY(POINT, 4326),  -- Storage: WGS84
    CONSTRAINT valid_location CHECK (ST_IsValid(location))
);

-- Calculate distance in meters using UTM projection
SELECT
    a.id AS point_a,
    b.id AS point_b,
    ST_Distance(
        ST_Transform(a.location, 26910),  -- Transform to UTM
        ST_Transform(b.location, 26910)
    ) AS distance_meters
FROM survey_points a
CROSS JOIN survey_points b
WHERE a.id < b.id;

Spatial Indexing

GiST Index (Default)

Most versatile spatial index. Use for all geometry/geography columns.

sql
-- Geometry (most common)
CREATE INDEX idx_your_table_geom_gist ON your_table_name USING GIST (geom);

-- Geography (GiST is the supported option)
CREATE INDEX idx_your_table_geog_gist ON your_table_name USING GIST (geog);

-- Analyze after index creation
VACUUM ANALYZE your_table_name;

Supports: All spatial operators (&&, @>, <@, ~=, <->) Best for: General-purpose spatial queries, mixed query patterns

BRIN Index

Block Range Index for very large, naturally ordered datasets.

sql
-- BRIN for very large, append-only GEOMETRY tables (geography uses GiST)
CREATE INDEX idx_your_table_geom_brin
    ON your_table_name
    USING BRIN (geom)
    WITH (pages_per_range = 128);

Supports: Bounding box operators (&&, @>, <@) Best for: Append-only tables, time-series spatial data, very large datasets (>100M rows) Trade-off: Much smaller than GiST, but less precise filtering

SP-GiST Index

Space-partitioned GiST for point data with specific distributions.

sql
-- SP-GiST for GEOMETRY(POINT, ...) only
CREATE INDEX idx_sensors_location_spgist
    ON sensors
    USING SPGIST (location);

Best for: Point-only data, quadtree-friendly distributions Not for: Complex geometries, mixed types

Index Selection Guide

Scenario Index Type Reasoning
General spatial queries GiST Most versatile, supports all operators
Very large, append-only BRIN Tiny footprint, good for time-ordered data
Point-only, uniform distribution SP-GiST Efficient for point lookups
Geography columns GiST Only supported option
Composite spatial + attribute GiST + B-tree Separate indexes or expression index

Table Design Examples

Points of Interest (POI)

sql
CREATE TABLE pois (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    category TEXT NOT NULL,
    location GEOGRAPHY(POINT, 4326) NOT NULL,
    address TEXT,
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    CONSTRAINT valid_category CHECK (category IN (
        'restaurant', 'hotel', 'gas_station', 'hospital', 'school'
    ))
);

-- Spatial index
CREATE INDEX idx_pois_location ON pois USING GIST (location);

-- Category + location for filtered spatial queries
CREATE INDEX idx_pois_category ON pois (category);

-- Find restaurants within 1km
SELECT name, address,
       ST_Distance(
         location,
         ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::GEOGRAPHY
       ) AS distance_m
FROM pois
WHERE category = 'restaurant'
  AND ST_DWithin(
    location,
    ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326)::GEOGRAPHY,
    1000
  )
ORDER BY distance_m;

Property Parcels

sql
CREATE TABLE parcels (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    parcel_id TEXT NOT NULL UNIQUE,
    owner_name TEXT,
    boundary GEOMETRY(MULTIPOLYGON, 4326) NOT NULL,
    centroid GEOMETRY(POINT, 4326) GENERATED ALWAYS AS (ST_Centroid(boundary)) STORED,
    area_sqm DOUBLE PRECISION GENERATED ALWAYS AS (
        ST_Area(boundary::GEOGRAPHY)
    ) STORED,
    perimeter_m DOUBLE PRECISION GENERATED ALWAYS AS (
        ST_Perimeter(boundary::GEOGRAPHY)
    ) STORED,
    CONSTRAINT valid_boundary CHECK (ST_IsValid(boundary)),
    CONSTRAINT closed_boundary CHECK (ST_IsClosed(ST_ExteriorRing(ST_GeometryN(boundary, 1))))
);

CREATE INDEX idx_parcels_boundary ON parcels USING GIST (boundary);
CREATE INDEX idx_parcels_centroid ON parcels USING GIST (centroid);

-- Find parcels intersecting a search area
SELECT parcel_id, owner_name, area_sqm
FROM parcels
WHERE ST_Intersects(boundary, ST_MakeEnvelope(-122.5, 37.7, -122.4, 37.8, 4326));

GPS Tracking

sql
CREATE TABLE gps_tracks (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    device_id TEXT NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL,
    location GEOGRAPHY(POINT, 4326) NOT NULL,
    speed_kmh DOUBLE PRECISION,
    heading DOUBLE PRECISION,
    accuracy_m DOUBLE PRECISION
);

-- Composite index for device + time queries
CREATE INDEX idx_gps_device_time ON gps_tracks (device_id, recorded_at DESC);

-- Spatial index for location queries
CREATE INDEX idx_gps_location ON gps_tracks USING GIST (location);

-- Note: GEOGRAPHY supports GiST; BRIN is for GEOMETRY (when appropriate).

-- Create linestring from track points
SELECT
    device_id,
    ST_MakeLine(location::GEOMETRY ORDER BY recorded_at) AS track_line,
    MIN(recorded_at) AS start_time,
    MAX(recorded_at) AS end_time
FROM gps_tracks
WHERE device_id = 'device_001'
  AND recorded_at >= '2024-01-01'
GROUP BY device_id;

Service Areas / Coverage Zones

sql
CREATE TABLE service_zones (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    zone_name TEXT NOT NULL,
    zone_type TEXT NOT NULL,
    boundary GEOMETRY(POLYGON, 4326) NOT NULL,
    population INTEGER,
    active BOOLEAN NOT NULL DEFAULT true,
    CONSTRAINT valid_zone_type CHECK (zone_type IN ('delivery', 'service', 'coverage')),
    CONSTRAINT valid_boundary CHECK (ST_IsValid(boundary))
);

CREATE INDEX idx_zones_boundary ON service_zones USING GIST (boundary);
CREATE INDEX idx_zones_active ON service_zones (active) WHERE active = true;

-- Check if location is within any active service zone
SELECT zone_name, zone_type
FROM service_zones
WHERE active = true
  AND ST_Contains(boundary, ST_SetSRID(ST_MakePoint(-122.4194, 37.7749), 4326));

Performance Patterns

Use ST_DWithin Instead of ST_Distance

sql
-- SLOW: calculates distance for all rows
SELECT * FROM pois
WHERE ST_Distance(location, ref_point) < 1000;

-- FAST: uses spatial index
SELECT * FROM pois
WHERE ST_DWithin(location, ref_point, 1000);

Use && for Bounding Box Pre-filtering

sql
-- Bounding box operator leverages spatial index
SELECT * FROM parcels
WHERE boundary && ST_MakeEnvelope(-122.5, 37.7, -122.4, 37.8, 4326)
  AND ST_Intersects(boundary, search_polygon);

Avoid Functions on Indexed Columns

sql
-- SLOW: function prevents index usage
SELECT * FROM parcels WHERE ST_Area(boundary) > 10000;

-- FAST: use generated column with regular index
ALTER TABLE parcels ADD COLUMN area_sqm DOUBLE PRECISION
    GENERATED ALWAYS AS (ST_Area(boundary::GEOGRAPHY)) STORED;
CREATE INDEX idx_parcels_area ON parcels (area_sqm);
SELECT * FROM parcels WHERE area_sqm > 10000;

Simplify Geometries for Display

sql
-- Reduce complexity for web display (tolerance in CRS units)
SELECT
    id,
    name,
    ST_AsGeoJSON(ST_Simplify(boundary, 0.0001)) AS geojson
FROM parcels;

Use Appropriate Precision

sql
-- Reduce coordinate precision for storage efficiency
UPDATE locations SET geom = ST_ReducePrecision(geom, 0.000001);

-- GeoJSON with limited decimal places
SELECT ST_AsGeoJSON(location, 6) AS geojson FROM pois;

Data Validation

Geometry Validity Checks

sql
-- Add validity constraint
ALTER TABLE parcels ADD CONSTRAINT valid_geom CHECK (ST_IsValid(boundary));

-- Find and fix invalid geometries
SELECT id, ST_IsValidReason(boundary) AS reason
FROM parcels
WHERE NOT ST_IsValid(boundary);

-- Attempt to fix invalid geometries
UPDATE parcels
SET boundary = ST_MakeValid(boundary)
WHERE NOT ST_IsValid(boundary);

SRID Consistency

sql
-- Verify SRID consistency
SELECT DISTINCT ST_SRID(geom) FROM spatial_table;

-- Enforce SRID with constraint
ALTER TABLE locations ADD CONSTRAINT enforce_srid
    CHECK (ST_SRID(location) = 4326);

Coordinate Range Validation

sql
-- Ensure coordinates are within valid WGS84 bounds
ALTER TABLE global_locations ADD CONSTRAINT valid_coords CHECK (
    ST_X(location::GEOMETRY) BETWEEN -180 AND 180 AND
    ST_Y(location::GEOMETRY) BETWEEN -90 AND 90
);

Do Not Use

  • PostgreSQL built-in types (POINT, LINE, POLYGON, CIRCLE) - use PostGIS types instead
  • SRID 0 (undefined) - always specify the correct SRID
  • ST_Distance for filtering - use ST_DWithin for index-supported distance queries
  • Mixed SRIDs in operations - always transform to common SRID first
  • GEOGRAPHY for complex analysis - use GEOMETRY with appropriate projection
  • Over-precise coordinates - GPS accuracy is ~3-5m, 6 decimal places (0.1m) is sufficient

Common Pitfalls

  1. Longitude/Latitude order: PostGIS uses (longitude, latitude) = (X, Y), not (lat, lon)
  2. GEOGRAPHY distance units: Always in meters, regardless of display
  3. Index not used: Run EXPLAIN ANALYZE to verify spatial index usage
  4. Transform performance: Cache transformed geometries for repeated queries
  5. Large geometries: Consider ST_Subdivide for very complex polygons
  6. SQL injection / unsafe dynamic SQL: Don't concatenate untrusted input into SQL. Parameterize values; for dynamic identifiers use safe quoting (quote_ident, format('%I', ...)) or strict allowlists.

Expand your agent's capabilities with these related and highly-rated skills.

timescale/pg-aiguide

setup-timescaledb-hypertables

Use this skill when creating database schemas or tables for Timescale, TimescaleDB, TigerData, or Tiger Cloud, especially for time-series, IoT, metrics, events, or log data. Use this to improve the performance of any insert-heavy table. **Trigger when user asks to:** - Create or design SQL schemas/tables AND Timescale/TimescaleDB/TigerData/Tiger Cloud is available - Set up hypertables, compression, retention policies, or continuous aggregates - Configure partition columns, segment_by, order_by, or chunk intervals - Optimize time-series database performance or storage - Create tables for sensors, metrics, telemetry, events, or transaction logs **Keywords:** CREATE TABLE, hypertable, Timescale, TimescaleDB, time-series, IoT, metrics, sensor data, compression policy, continuous aggregates, columnstore, retention policy, chunk interval, segment_by, order_by Step-by-step instructions for hypertable creation, column selection, compression policies, retention, continuous aggregates, and indexes.

1,661 82
Explore
timescale/pg-aiguide

postgres-hybrid-text-search

Use this skill to implement hybrid search combining BM25 keyword search with semantic vector search using Reciprocal Rank Fusion (RRF). **Trigger when user asks to:** - Combine keyword and semantic search - Implement hybrid search or multi-modal retrieval - Use BM25/pg_textsearch with pgvector together - Implement RRF (Reciprocal Rank Fusion) for search - Build search that handles both exact terms and meaning **Keywords:** hybrid search, BM25, pg_textsearch, RRF, reciprocal rank fusion, keyword search, full-text search, reranking, cross-encoder Covers: pg_textsearch BM25 index setup, parallel query patterns, client-side RRF fusion (Python/TypeScript), weighting strategies, and optional ML reranking.

1,661 82
Explore
timescale/pg-aiguide

design-postgres-tables

Use this skill for general PostgreSQL table design. **Trigger when user asks to:** - Design PostgreSQL tables, schemas, or data models when creating new tables and when modifying existing ones. - Choose data types, constraints, or indexes for PostgreSQL - Create user tables, order tables, reference tables, or JSONB schemas - Understand PostgreSQL best practices for normalization, constraints, or indexing - Design update-heavy, upsert-heavy, or OLTP-style tables **Keywords:** PostgreSQL schema, table design, data types, PRIMARY KEY, FOREIGN KEY, indexes, B-tree, GIN, JSONB, constraints, normalization, identity columns, partitioning, row-level security Comprehensive reference covering data types, indexing strategies, constraints, JSONB patterns, partitioning, and PostgreSQL-specific best practices.

1,661 82
Explore
timescale/pg-aiguide

find-hypertable-candidates

Use this skill to analyze an existing PostgreSQL database and identify which tables should be converted to Timescale/TimescaleDB hypertables. **Trigger when user asks to:** - Analyze database tables for hypertable conversion potential - Identify time-series or event tables in an existing schema - Evaluate if a table would benefit from Timescale/TimescaleDB - Audit PostgreSQL tables for migration to Timescale/TimescaleDB/TigerData - Score or rank tables for hypertable candidacy **Keywords:** hypertable candidate, table analysis, migration assessment, Timescale, TimescaleDB, time-series detection, insert-heavy tables, event logs, audit tables Provides SQL queries to analyze table statistics, index patterns, and query patterns. Includes scoring criteria (8+ points = good candidate) and pattern recognition for IoT, events, transactions, and sequential data.

1,661 82
Explore
timescale/pg-aiguide

migrate-postgres-tables-to-hypertables

Use this skill to migrate identified PostgreSQL tables to Timescale/TimescaleDB hypertables with optimal configuration and validation. **Trigger when user asks to:** - Migrate or convert PostgreSQL tables to hypertables - Execute hypertable migration with minimal downtime - Plan blue-green migration for large tables - Validate hypertable migration success - Configure compression after migration **Prerequisites:** Tables already identified as candidates (use find-hypertable-candidates first if needed) **Keywords:** migrate to hypertable, convert table, Timescale, TimescaleDB, blue-green migration, in-place conversion, create_hypertable, migration validation, compression setup Step-by-step migration planning including: partition column selection, chunk interval calculation, PK/constraint handling, migration execution (in-place vs blue-green), and performance validation queries.

1,661 82
Explore
timescale/pg-aiguide

pgvector-semantic-search

Use this skill for setting up vector similarity search with pgvector for AI/ML embeddings, RAG applications, or semantic search. **Trigger when user asks to:** - Store or search vector embeddings in PostgreSQL - Set up semantic search, similarity search, or nearest neighbor search - Create HNSW or IVFFlat indexes for vectors - Implement RAG (Retrieval Augmented Generation) with PostgreSQL - Optimize pgvector performance, recall, or memory usage - Use binary quantization for large vector datasets **Keywords:** pgvector, embeddings, semantic search, vector similarity, HNSW, IVFFlat, halfvec, cosine distance, nearest neighbor, RAG, LLM, AI search Covers: halfvec storage, HNSW index configuration (m, ef_construction, ef_search), quantization strategies, filtered search, bulk loading, and performance tuning.

1,661 82
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results