PostGIS

I plan on using PostGIS to manage geography objects in PostgreSQL, so I want to review the docs here to learn best practices and more about the extension.

Date Created:

Last Edited:

2 526

References

PostGIS Documentation
Geospatial Data and GeoJSON
QGIS
- A desktop GIS viewer/editor for quickly looking at geographic data.
Simple Features for SQL
- A model that is a non-topological way to store geospatial data in a database
The GeoJSON Specification
Link to Census Database Files

Review of Geospatial Data

Things to know about geospatial data:

Geospatial data consists of two components, location and attribute.

- When you use Google Maps to search for a restaurant, you see a red marker on the screen that points to its location in latitude and longitude coordinates, such as 41.7620891, -72.6856295. Attributes include additional information such as the restaurant name, its human-friendly street address, and guest review comments. All of these attributes add value to your location data.

Geospatial data can be raster or vector.

- In digital maps, raster data often appears as satellite and aerial images, and the quality depends on the resolution of the camera that captured them. Raster data is like a grid of cells.
- Vector data appears on digital maps as pictorial images of buildings, rivers, and regions. Vector maps can be created by humans or algorithms when they draw points, polylines, and polygons from raster satellite or aerial images, or from devices such as GPS trackers that record runs or hikes, or from other sources.
- Unlike raster maps, vector maps remain sharply focused at any zoom level, because every point and line is represented by latitude and longitude coordinates, which can be expressed with precise decimals.
- While raster data is generally limited to one value per cell (such as color for traditional satellite images, or height above sea level for digital elevation models), vector data can contain multiple attributes about each object (such as its name, street address, and comments).
- Moreover, vector map files tend to be smaller in size than raster ones.

Vector vs Raster Data

Common Vector Data File Formats

GeoJSON

GeoJSON is a popular map fata format, based on open-standard created in 2016, with file extensions that end with .geojson or .json
It is strongly recommended that you create and edit map data in GeoJSON format. Storing and sharing your geospatial data in GeoJSON ensures that you and others will be able to use the file without installing bulky or expensive GIS desktop applications.
In GeoJSON format, coordinates are ordered in longitude-latitude format, the same as X-Y coordinates in mathematics. Bu this is the opposite of Google Maps and some other web map tools, which place coordinate values in latitude-longitude format.

Shapefiles

The shapefile format was created in the 1990s by Ersi, the company that develops ArcGIS software. Shapefiles typically appear in a folder of files with extensions such as .shp, .shx, and .dbf, and the folder may be compressed into a .zip file.
Government agencies commonly distribute map data in shapefile format.

GPS Exchange Format (GPX)

GPX (.gpx files) is an open standard and is based on the XML markup langauge. Like GeoJSON, you can inspect the contents of a GPX file in any simple text editor. A GPX file most likely contains a collection of timestamps and latitude/longitude coordinates.

Keyhole Markup Language (or KML)

The KML format rose in popularity during the late 2000s, when it was developed for Google Earth. It was drooped by Google in late 2019. Sometimes kml formats are compressed to kmz format.

MapInfo TAB

The proprietary TAB format is created and supported by MapInfo and is designed to work well with MapInfo Pro GIS software.

Notes

PostGIS extends the capabilities of the PostgreSQL relational database by adding support for storing, indexing, and querying geospatial data.

PostGIS features include:

Spatial Data Storage: Store different types of spatial data such as points, lines, polygons, and multi-geometries, in both 2D and 3D data
Spatial Indexing: Quickly search and retrieve spatial data based on its location.
Spatial Functions: A wise range of spatial functions that allow you to filter and analyze spatial data, measuring distances and area, intersecting geometries, buffering, and more.
Geometry Processing: Tools for processing and manipulating geometry data, such as simplification, conversion, and generalization
Raster Data Support: Storage and processing of raster data, such as elevation data and weather data
Geocoding and Reverse Geocoding: Functions for geocoding and reverse geocoding
Integration: Access and work with PostGIS using third party tools such as QGIS, GeoServer, MapServer, ArcGIS, Tableau

Installation and Upgrade

AWS RDS databases already have PostGIS installed, you just need to enable it in your database:

CREATE EXTENSION postgis;

You can check what version of PostGIS you have installed and what versions are available with the following commands:

SELECT PostGIS_Full_Version();
SELECT * FROM pg_available_extensions WHERE name = 'postgis';

Update PostGIS:

SELECT postgis_extensions_upgrade();

PostGIS Workshop

PostGIS Workshop

Introduction

What is a Spatial Database

PostGIS is a spatial database. Spatial databases store and manipulate spatial objects like any other object in the database.
Three aspects that associate spatial data with a database:
- Spatial data types refer to shapes such as point, line, and polygon
- Multi-dimensional spatial indexing is used for efficient processing of spatial operations
- Spatial functions, posed in SQL, are for querying of spatial properties and relationships

Evolution of GIS Architectures

Spatial Data Types

An ordinary database has strings, numbers, and dates. A spatial database adds additional (spatial) types for representing geographic features. These spatial data types abstract and encapsulate spatial structures such as boundary and dimensions. In many respects, spatial data types can be understood simply as shapes.

Spatial Data Types

In the diagram above, each sub-type inherits the structure (attributes) and the behavior (methods of functions) of its super-type.

Spatial Indexes and Bounding Boxes

Real spatial databases provide a "spatial index" that answers the question which objects are within this particular bounding box?
A bounding box is the smallest rectangle - parallel to the coordinate axes - capable of containing a given feature.

Bounding Box

Indexes have to perform quickly in order to be useful. Instead of providing exact results, spatial indexes provide approximate results.
The actual spatial indexes implemented by various databases vary widely. The most common implementations are the R-Tree and Quadtree (used in PostGIS), but there are also grid-based indexes and GeoHash Indexes implemented in other spatial indexes.

Spatial Functions

For manipulating data during a query, an ordinary database provides functions such as concatenating strings, performing hash operations on strings, doing mathematics on numbers, and extracting information from dates.
A spatial database provides a complete set of functions for analyzing geometric components, determining spatial relationships, and manipulating geometries. These spatial functions serve as the building box for any spatial project.
The majority of spatial functions can be grouped into the following categories:

Conversion: Functions that convert between geometries and external data formats.
Management: Functions that manage information about spatial tables and PostGIS administration.
Retrieval: Functions that retrieve properties and measurements of a Geometry.
Comparison: Functions that compare two geometries with respect to their spatial relation.
Generation: Functions that generate new geometries from others.

PostGIS turns the PostgreSQL Database Management System into a spatial database by adding support for the three features: spatial types, spatial indexes, and spatial functions.

Loading Data

A shapefile commonly refers to a collection of files with .shp, .shx, .dbf, and other extensions on a common prefix name (e.g., nyc_census_blocks). The actual shapefile relates specifically to files with the .shp extension. However, the .shp file alone is incomplete for distribution without the required supporting files.
Mandatory files:
- .shp - shape format; the feature geometry itself
- .shx - shape index format; a positional index of the feature geometry
- .dbf - attribute format; columnar attributes for each shape, in dBase III
Optional files:
- .prj - projection format; the coordinate system and projection information, a plain text file describing the projection using well-known text format
The shp2pgsql converts Shape files into SQL.
An SRID stands for Spatial Reference Identifier. It defines all the parameters of our data's geographic coordinate system and projection. An SRID is convenient because it packs all the information about a map projection (which can be quite complex) into a single number.

Geometries

Metadata Tables

In conformance with the Simple Features for SQL (SFSQL) specification, PostGIS provides two tables to track and report on the geometry types available in a given database.
- The first table, spatial_ref_sys, defines all the spatial reference systems known to the database and will be described in greater detail later.
- The second table (actually, a view), geometry_columns, provides a listing of all "features" (defined as an object with geometric attributes), and the basic details of those features.

Metadata Tables

-- By querying this table, GIS clients and libraries can determine what to expect when retrieving data and 
SELECT * FROM geometry_columns;


`f_table_catalog`	`f_table_schema`	`f_table_name`	`f_geometry_column`	`coord_dimension`	`srid`	`type`
postgres	public	community_location_filter	geojson	2	0	GEOMETRY
postgres	tiger	county	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	state	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	place	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	cousub	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	edges	the_geom	2	4269	MULTILINESTRING
postgres	tiger	addrfeat	the_geom	2	4269	LINESTRING
postgres	tiger	faces	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	zcta5	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	tabblock20	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	tract	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	tabblock	the_geom	2	4269	MULTIPOLYGON
postgres	tiger	bg	the_geom	2	4269	MULTIPOLYGON
postgres	public	census_location	feature_geometry	2	0	GEOMETRY

The f_table_catalog, f_table_schema, and f_table_name provide the fully qualified name of the feature table containing a given geometry. Because PostgreSQL doesn't make use of catalogs, f_table_catalog will tend to be empty.
f_geometry_column is the name of the column that geometry containing column - for feature tables with multiple geometry columns, there will be one record for each.
coord_dimension and srid define the dimension of the geometry (2-, 3-, or 4-dimensional) and the Spatial Reference system identifier that refers to the spatial_ref_sys table respectively.
The type column defined the type of geometry as described below.

Representing Real World Objects

The Simple Features for SQL (SFSQL) specification, the original guiding standard for PostGIS development, defines how a real-world object is represented. By taking a continuous shape and digitizing it at a fixed resolution, we achieve a passable representation of the object.
We can collection information about geometry objects by reading that geometry metadata:

ST_GeometryTyp(geometry) -- returns the type of the geometry
ST_NDims(geometry) -- returns the number of dimensions of the geometry
ST_SRID(geometry) -- returns the spatial reference identifer number of the geometry

Points

A spatial point represents a single location on the Earth. The point is represented by a single coordinate (either, 2-, 3-, or 4- dimensions). Points are used to represent objects when the exact details, such as shape and size, are not important as the target scale.
Specific Spatial Functions working with points:

ST_X(geometry) -- returns the X coordinate
ST_Y(geometry) -- returns the Y coordinate

Linestrings

A linestring is a path between locations. It takes the form of an ordered series of two or points. Roads and rivers are typically represented as linestrings. A linestring is said to be closed if it starts and ends on the same point. It is said to be simple if it does not cross or touch itself (except at its endpoints if it is closed). A linestring can be both closed and simple.
Specific Spatial Functions working with linestrings:

ST_Length(geometry) -- returns the length of the linestring
ST_StartPoint(geometry) -- returns the first coordinate as a point
ST_EndPoint(geometry) -- returns the last coordinate as a point
ST_NPoints(geometry) -- returns the number of coordinates in the linestring

Polygons

A polygon is a representation of an area. The outer boundary of the polygon is represented by a ring. This ring is a linestring that is both closed and simple as defined above. Holes within the polygon are also represented by rings.
Polygons are used to represent objects whose size and shape are important.
GIS systems are relatively unique in allowing polygons to explicitly have holes.
Specific Spatial Functions working with polygons:

ST_Area(geometry) -- returns the area of the polygons
ST_NRings(geometry) -- returns the number of rings (usually 1, more if there are holes)
ST_ExteriorRing(geometry) --returns the outer ring as a linestring
ST_InteriorRing(geometry,n) -- returns a specified interior ring as a linestring
ST_Perimeter(geometry) -- returns the length of all the rings

Collections

There are four collection types, which group multiple simple geometries into sets.
- MultiPoint - a collection of points
- MultiLineString - a collection of linestrings
- MultiPolygon - a collection of polygons
- GeometryCollection - a heterogeneous collection of any geometry (including other collections)
Collections are another concept that shows up in GIS software more than in generic software. They are useful for directly modeling real world objects as spatial objects.
Specific Spatial Functions working with collections:

ST_NumGeometries(geometry) -- returns the number of parts in a collection
ST_GeometryN(geometry,n) --returns the specified part 
ST_Area(geometry) -- returns the total area of all polygonal parts
ST_Length(geometry) -- returns the total length of ll linear parts

Geometry Input and Output

Within the database, geometries are stored on disk in a format used by the PostGIS program. In order for external programs to insert and retrieve useful geometries, they need to be converted into a format that other applications can understand. Fortunately, PostGIS supports emitting and consuming geometries in a large number of formats:
- Well-known Text
- - ST_GeomFromText(text,srid) returns geometry
  - ST_AsText(geometry) returns text
  - ST_AsEWKT(geometry) returns text
- Well-known binary
- - ST_GeomFromWKB(bytea) returns geometry
  - ST_AsBinary(geometry) returns bytea
  - ST_AsEWKB(geometry) returns bytea

- Geographic Mark-up Language
- - ST_GeomFromGML(text) returns geometry
  - ST_AsGML(geometry) returns text
- Keyhole Mark-up Language
- - ST_GeomFromKML(text) returns geometry
  - ST_AsKML(geometry) returns text
- GeoJSON
- - ST_AsGeoJSON(geometry) returns text
- Scalable Vector Graphics
- - ST_AsSVG(geometry) returns text
Most actual processes prefer WKB as the choice of data format.
Other Ways to Create Geometries from Well-known Text or Similar Formatted Inputs:

-- Using ST_GeomFromText with the SRID parameter
SELECT ST_GeomFromText('POINT(2 2)',4326);

-- Using ST_GeomFromText without the SRID parameter
SELECT ST_SetSRID(ST_GeomFromText('POINT(2 2)'),4326);

-- Using a ST_Make* function
SELECT ST_SetSRID(ST_MakePoint(2, 2), 4326);

-- Using PostgreSQL casting syntax and ISO WKT
SELECT ST_SetSRID('POINT(2 2)'::geometry, 4326);

-- Using PostgreSQL casting syntax and extended WKT
SELECT 'SRID=4326;POINT(2 2)'::geometry;

Spatial Relationships

Spatial databases are powerful because they not only store geometry, they also have the ability to compare relationships between geometries.

`ST_Equals`

ST_Equals

tests the spatial equality of two geometries
It returns TRUE if two geometries have the same type and have identical; coordinate values, i.e. fi the second shape is equal to the first shape

`ST_Intersects`, `ST_Disjoint`, `ST_Crosses`, and `ST_Overlaps`

These functions test whether interiors of the geometries intersect

ST_Intersects(geometry A, geometry B)
- returns TRUE if the two shapes have any space in common, i.e. if their boundaries or interiors intersect
ST_Disjoint(geometry A, geometry B)
- if two geometries are disjoint, they do not intersect, and vice-versa. In fact, it is often more efficient to test "not intersects" rather than to test "disjoint" because the intersects tests can be spatially indexed
ST_Crosses(geometry A, geometry B)
- For multipoint/polygon, multipoint/linestring, linestring/linestring, linestring/polygon, and linestring/MultiPolygon comparisons, returns TRUE if the intersection results in a geometry whose dimension is now less than the maximum dimension of the two source geometries and the intersection set is interior to both source geometries
ST_Overlaps(geometry A, geometry B)
- compares two geometries of the same dimension and returns TRUE if their intersection results in a geometry different from both but of the same dimension

`ST_Touches`

Tests whether two geometries touch at their boundaries, but do not intersect in their interiors

ST_Touches

ST_Touches(geometry A, geometry B) returns TRUE if ether of the geometries' boundaries intersect or if only one of the geometry's interiors intersects the other's boundary

`ST_Within` and `ST_Contains`

ST_Within and ST_Contains test whether one geometry is fully within the other.

ST_Within and ST_Contains

ST_Within(geometry A, geometry B) returns TRUE if the first geometry is completely within the second geometry. ST_Within tests for the exact opposite result of ST_Contains
ST_Contains(geometry A, geometry B) returns TRUE if the second geometry is completely contained by the first geometry

`ST_Distance` and `ST_DWithin`

An extremely common GIS question is find all the stuff within distance X of this other stuff

ST_Dwithin

The ST_Distance(geometry A, geometry B) calculates the shortest distance between two geometries and returns it as a float. This is useful for actually reporting back the distance between objects.
For testing whether two objects are within a distance of one another, the ST_DWithin function provides an index-accelerated true/false test.
Spatial joins are the bread and butter of spatial databases. They allow you to combine information from different tables by using spatial relationships such as the join key.

Spatial Indexing

Indexing speeds up searching by organizing the data into a search tree which can be quickly traversed to find a particular record. They are one of the greatest assets of PostGIS.
The larger the table, the larger the relative speed improvement of an indexed query will be.

How Spatial Indexes Work

Standard database indexes create a hierarchal tree based on the values of the column being indexed. Spatial indexes are a little different - they are unable to index the geometric features themselves and instead index the bounding boxes of the features.

Bounding Boxes Demonstration

The way the database efficiently answers the question “what lines intersect the yellow star” is to first answer the question “what boxes intersect the yellow box” using the index (which is very fast) and then do an exact calculation of “what lines intersect the yellow star” only for those features returned by the first test.

For a large table, this “two pass” system of evaluating the approximate index first, then carrying out an exact test can radically reduce the amount of calculations necessary to answer a query.
Both PostGIS and Oracle Spatial share the same "R-Tree" spatial index structure. R-Trees break up data into rectangles, and sub-rectangles, and sub-sub rectangles, etc. It is a self-tuning index structure that automatically handles variable data density, differing amounts of object overlap, and object size.

R-Tree Index

Index Only Queries

Most of the commonly used functions in PostGIS (ST_Contains, ST_Intersects, ST_DWithin, etc.) include an index filter automatically, but some do not.
To do a bounding-box search using the index (and no filtering), make use of the && operator. For geometries, the && operator means bounding box overlap or touch in the same way that for numbers the = operator means values are the same.
Running an indexed query is not always faster.
It is wise to use the ANALYZE command after bulk data loads and deletes in your table. This force the statistics system to gather data for all your indexed columns.

Projecting Data

The earth is not flat, and there is no simple way of putting it down on a flat paper map. Common to all projects is that they transform the (spherical) world onto a flat Cartesian coordinate system, and which projection to choose depends on how you will be using the data.
PostGIS includes built-in support for changing the projection of data, using the ST_Transform(geometry,srid) function. For managing the spatial reference identifiers on geometries, PostGIS provides the ST_SRID(geometry) and ST_SRID(geometry,sid) functions.

Geography

Unlike coordinates in Mercator, UTM, or Stateplane, geographic coordinates are not Cartesian coordinates. Geographic coordinates do not represent a linear distance from an origin as plotted on a plane. Rather, these spherical coordinates describe angular coordinates on a globe. In spherical coordinates a point is specified by the angle of rotation from a reference meridian (longitude), and the handle from the equator (latitude).

Cartesian vs Spherical Coordinates

All return values from geography calculations are in meters.
In order to load geometry data into a geography table, the geometry first needs to be projected into EPSG:4326 (longitude/latitude), then it needs to be changed into geography. The ST_Transform(geometry,srid) function converts coordinates to geographics and the Geography(geometry) function of the ::geography suffix casts to geography.

Native Functions for the Geography Data Type

Geographics are universally accepted coordinates - everyone understands what latitude/longitude mean, but very few people understand what UTM coordinates mean. Why not use geography all the time?
- First, as noted earlier, there are far fewer functions available (right now) that directly support the geography type. You may spend a lot of time working around geography type limitations.
- Second, the calculations on a sphere are computationally far more expensive than Cartesian calculations. For example, the Cartesian formula for distance (Pythagoras) involves one call to sqrt(). The spherical formula for distance (Haversine) involves two sqrt() calls, an arctan() call, four sin() calls and two cos() calls. Trigonometric functions are very costly, and spherical calculations involve a lot of them.
If your data is geographically compact, use the geometry type with a Cartesian projection that makes sense with your data.
If you need to measure distance with a dataset that is geographically dispersed, use the geography type.

Geometry Constructing Functions

Geometry constructing functions take geometries as inputs and output new shapes.
ST_Centroid(geometry) returns a point that is approximately on the center of mass of the input argument.
ST_PointOnSurface(geometry) returns a point that is guaranteed to be inside the input argument.
ST_Buffer(geometry,distance) takes in a buffer distance and geometry type and outputs a polygon with a boundary the buffer distance away from the input geometry.
The ST_Intersection(geometry A, geometry B) function returns the spatial area (or line, or point) that both arguments have in common.
The ST_Union([geometry]) function takes inputs and removes common lines.

Validity

In 90% of the cases the answer to the question, why is my query giving me a 'TypologyException' error is one or more of the inputs are invalid. Which begs the question: what does it mean to be invalid, and why should we care?

Validity is most important for polygons, which define bounded areas and require a good deal of structure. Lines are very simple and cannot be invalid, nor can points.
Rules for valid polygons"
- Polygon rings must close
- Rings that define holes should be inside rings that define exterior boundaries
- Rings may not self-intersect (they may nether touch nor cross themselves)
- Rings may not touch other rings, except at a point
- Elements of multi-polygons may bot touch each other.
The ST_IsValid(geometry) function tells us whether a polygon is valid.
The ST_IsValidReason(geometry) function can tell us the source of the validity.

Dimensionally Extended 9-Intersection Model

The Dimensionally Extended 9-Intersection model (DE9IM) is a framework for modelling how two spatial objects interact.
Every spatial object has:
- An interior
- - The interior is the part of the line bounded by the ends for a line. For a polygon this is obvious.
- A boundary
- An exterior
- - For a line, the exterior is everything in the plane not within the bounds of the line.
The relationships between any pair of special features can be characterized using the dimensionality of the nine possible intersections between the interiors/boundaries/exteriors of a pair of objects.

Relationships Between Polygons

Clustering on Indices

One way to speed up access to data is to ensure that records which are likely to be retrieved together in the same result are located in similar physical locations on the hard disk platters. This is called clustering.
The right clustering scheme to use can be tricky, but a general rule applies: indexes define a natural ordering scheme for data which is similar to the access pattern that will be used in retrieving the data.

Clustering Nearby Locations Together

Clustering based on spatial index makes sense for spatial data that is going to be accessed with spatial queries: similar things tend to have similar locations.

-- Cluster the blocks based on their spatial index
CLUSTER nyc_census_blocks USING nyc_census_blocks_geom_idx;

Clustering is still necessary in cache databases. Keeping records that are near each other in space near each other in memory increases the odds that related records will move up the servers memory cache hierarchy together, and thus make memory accesses faster.

3D

PostGIS supports additional dimensions on all geometry types, a Z dimension to add height information and a M dimension for additional dimensional information (commonly time, or road-mile, or upstream-distance information) for each coordinate.
Adding the extra dimension results in three extra possible geometry types for each geometry primitive:
- Point (a 2-D type) is joined by PointZ, PointM and PointZM types.
- Linestring (a 2-D type) is joined by LinestringZ, LinestringM and LinestringZM types.
- Polygon (a 2-D type) is joined by PolygonZ, PolygonM and PolygonZM types.
Functions built to calculate relationships between 3-D objects:
- ST_3DClosestPoint
- - Returns the 3-dimensional point on g1 that is closest to g2. This is the first point of the 3D shortest line.
- ST_3DDistance
- - For geometry type Returns the 3-dimensional cartesian minimum distance (based on spatial ref) between two geometries in projected units.
- ST_3DDWithin
- - For 3d (z) geometry type Returns true if two geometries 3d distance is within number of units.
- ST_3DDFullyWithin
- - Returns true if all of the 3D geometries are within the specified distance of one another.
- ST_3DIntersects
- - Returns TRUE if the Geometries “spatially intersect” in 3d - only for points and linestrings
- ST_3DLongestLine
- - Returns the 3-dimensional longest line between two geometries
- ST_3DMaxDistance
- - For geometry type Returns the 3-dimensional cartesian maximum distance (based on spatial ref) between two geometries in projected units.
- ST_3DShortestLine
- - Returns the 3-dimensional shortest line between two geometries

Nearest-Neighbor Searching

A frequently posed spatial query is: what is the nearest <candidate feature> to <query feature>?
Unlike a distance search, the nearest neighbor search doesn't include any measurement restricting how far away candidate geometries might be, features of any distance away will be accepted, as long as they are the nearest.
PostgreSQL solves the nearest neighbor problem by introducing an order by distance (<->) operator that induces the database to use an index to speed up a sorted return set. With an order by distance operator in place, a nearest neighbor query can return the N nearest features just by adding an ordering and limiting the result set to N entries.
The order by distance operator works on geography and geometry types.
- Returns the same as ST_Distance for geometry
- Returns the spherical distance for geography

-- Plug the geometry into a nearest-neighbor query
SELECT streets.gid, streets.name,
  ST_Transform(streets.geom, 4326),
  streets.geom <-> 'SRID=26918;POINT(583571.9 4506714.3)'::geometry AS dist
FROM
  nyc_streets streets
ORDER BY
  dist
LIMIT 3;

Rasters

PostGIS supports another kind of spatial data type called a raster. Raster data, like geometry data, uses Cartesian coordinates and a spatial reference system. Raster data is represented as an n-dimensional matrix consisting of pixels and bands. The bands defines the number of matrices you have. Each pixel stores a value corresponding to each band. So a 3-banded raster such as an RGB image, would have 3 values for each pixel corresponding to the Red-Green-Blue bands.
A raster is a matrix, pinned on a coordinate system, that has values that can represent anything you want them to represent.
Since rasters live in cartesian space, rasters can interact with geometries. PostGIS offers many functions that take as input both raters and geometries.

Working with Rasters PostGIS

Topology

PostGIS supports SQL/MM SQL-MM 3 Top-Geo and Top-Net 3 specifications via an extension called postgis_topology. You can learn about the functions and types provided by the extension in Manual: PostGIS Topology . The postgis_topology extension includes another kind or core spatial type, called a topogeometry. In addition to the topogeometry spatial type, you will find functions for building topologies and populating topologies.

Tracking Edit History Using Triggers

A common requirement for production databases is the ability to track history: how has the data changed between two dates, who made the changes, and where did they occur? Using the database and the trigger system, it's possible to add history tracking to any table, while maintaining simple direct edit access to the primary table.

Basic PostgreSQL Tuning

In order to ensure the database will run properly for many different environments, the default configuration is very conservative and not terribly appropriate for high-performance production databases. Add the fact that geospatial databases have different usage patterns , and the data tend to consist of fewer, much larger records than non-geospatial databases, and you can see that the default configuration will not be totally appropriate for our purposes.
All of these configuration parameters can be edited in the postgresql.conf database configuration file. This is a regular text file and the changes will not take effect until the server is restarted.
- Once changes are made, reload configuration

`shared_buffers`

Sets the amount of memory the database server uses for shared memory buffers. They are shared amongst back-end processes.
- Default value: typically 32MB
- Recommended Value: about 75% of database memory up to a max of about 2GB

`effective_cache_size`

The effective_cache_size is approximately the amount of memory on the machine, less the shared_buffers, less the work_mem times the expected number of connections, less any memory required for any other processes running on the machine, less about 1GB for other random operating system needs.
- Default Value: typically 4GB
- Recommended Value: any amount of free memory expected to be around ordinary operating conditions

`work_mem`

Defines the amount of memory that internal sorting operations, indexing operations, and hash tables can consume before the database switches to on-disk files. This value defines the available memory for each operation; complex queries may have several sort or hash operations running in parallel, and each connected session may be executing a query.

As such you must consider how many connections and the complexity of expected queries before increasing this value. The benefit to increasing is that the processing of more of these operations, including ORDER BY, and DISTINCT clauses, merge and hash joins, hash-based aggregation and hash-based processing of subqueries, can be accomplished without incurring disk writes. The cost of increasing is memory that will be used per connection, which can be quite high with production levels of connections.

- Default Value: 1MB
- Recommended Value: 32MB

`maintenance_work_mem`

Defines the amount of memory used for maintenance operations, including vacuuming, index, and foreign key creation.
- Default Value: 16MB
- Recommended Value: 128MB

`wal_buffers`

Sets the amount of memory used for write-ahead-log data.
- Default Value: 64kB
- Recommended Value: 1MB

`checkpoint_segments`

This value sets the maximum number of log file segments (typically 16MB_ that can be filled between automatic WAL checkpoints.
- Default Value: 3
- Recommended Value: 6

`random_page_cost`

This is a unit-less value that represents the cost of a random page access from disk.
- Default Value: 4.0
- Recommended Value: 2.0

`seq_page_cost`

This is the parameter that controls the cost of a sequential page access.
- Default Value: 1.0
- Recommended Value: 1.0

PostgreSQL Schemas

Schemas are like folders, and they can hold tables, views, functions, and other relations. Each database starts out with one schema, the public schema.
- Inside that schema, the default install of PostGIS creates the geometry_columns, geography_columns, and spatial_ref_sys metadata relations, as well as the types and functions used by PostGIS.
Reasons for using schemas:

Data that is managed in a schema is easier to apply bulk actions to
Users can be restricted in their work to single schemas to allow isolation of analytical and test tables from production tables.

PostGIS

Things to know about geospatial data:

Common Vector Data File Formats

GeoJSON

Shapefiles

GPS Exchange Format (GPX)

Keyhole Markup Language (or KML)

MapInfo TAB

What is a Spatial Database

Spatial Data Types

Spatial Indexes and Bounding Boxes

Spatial Functions

Metadata Tables

Representing Real World Objects

Points

Linestrings

Polygons

Collections

Geometry Input and Output

ST_Equals

ST_Intersects, ST_Disjoint, ST_Crosses, and ST_Overlaps

ST_Touches

ST_Within and ST_Contains

ST_Distance and ST_DWithin

How Spatial Indexes Work

Index Only Queries

Dimensionally Extended 9-Intersection Model

shared_buffers

effective_cache_size

work_mem

maintenance_work_mem

wal_buffers

checkpoint_segments

random_page_cost

seq_page_cost

User Comments

`ST_Equals`

`ST_Intersects`, `ST_Disjoint`, `ST_Crosses`, and `ST_Overlaps`

`ST_Touches`

`ST_Within` and `ST_Contains`

`ST_Distance` and `ST_DWithin`

`shared_buffers`

`effective_cache_size`

`work_mem`

`maintenance_work_mem`

`wal_buffers`

`checkpoint_segments`

`random_page_cost`

`seq_page_cost`