Principles of Geographic Information Systems

I am planning to create an application that involves geospatial data, so I am reading this textbook, Principles of Geographic Information Systems, to learn more about the subject.

References

Principles of Geographic Information Systems, by Otto Huisman and Rolf A. de By

A Gentle Introduction to GIS

The Nature of GIS

The acronym GIS stands for geographic information system. As the name suggests, a GIS is a tool for working with geographic information.
Spatial data refers to where things are, or perhaps, where they were or will be.
When professionals deal with questions related to geographic space, they are dealing with positional data relative to the Earth's surface.
Geographic information can change over time.

Defining GIS

Definition: A GIS is a computer-based system that provides the following four sets of capabilities to handle georeferenced data:
1. Data capture and preparation
  1. Refers to the process of capturing data, usually with sensors.
2. Data management, including storage and maintenance
  1. Data management refers to the storage and maintenance of the captured data (in the textbook's example, measurements transmitted by buoys via satellite communication).
  2. This phase requires a decision on how best to represent the data, both in terms of their spatial properties and the various attribute values that we need to store.
3. Data manipulation and analysis
  1. Once the data has been collected and organized in a computer system, we can start analyzing it.
  2. A typical GIS function: deriving an estimated value of a property for a location where we have not measured it.
4. Data presentation
  1. The data presentation phase deals with putting it all together into a format that communicates the result of the data analysis in the best possible way.

This implies that a GIS user can expect support from the system to enter (geo-referenced) data, to analyze it in various ways, and to produce presentations (including maps and other types) from the data. This would include support for various kinds of coordinate systems and transformations between them, options for analysis of the georeferenced data, and a large degree of freedom in the way this information is presented.
Data is georeferenced if it is associated with some position on Earth's surface, by using a spatial reference system. This can be achieved using coordinates, or by other means.

GISystems, GIScience, and GIS applications

The discipline that deals with all aspects of handling spatial data and geoinformation is called geographic information science (often abbreviated to geo-information science or just GIScience).

Geo-Information Science is the scientific field that attempts to integrate different disciplines studying the methods and techniques of handling spatial information.

By data, we mean representations that can be operated upon by a computer. More specifically, by spatial data, we mean data that contains positional values, such as (x, y) coordinates. Sometimes the more precise phrase geospatial data is used as a further refinement, which refers to spatial data that is georeferenced.
By information, we mean data that has been interpreted by a human being. Geoinformation is a specific type of information resulting from the interpretation of spatial data.
The National Mapping Agencies (NMAs) are responsible for collecting topographical data for the entire country following pre-set standards.
Key components of spatial data quality include:
- positional accuracy - both horizontal and vertical
- temporal accuracy - that the data is up to date
- attribute accuracy - in labelling the features or their classifications
- lineage - history of the data including sources
- completeness - the data set represents all related features of reality
- logical consistency - data is logically structured

Real World and Representations

A representation of some part of the real world can be considered a model because the representation will have some characteristics in common with the real world.
In the GIS environment, the most familiar model is that of a map. A map is a miniature representation of some part of the real world.
A GIS must store its data in some way. Spatial databases (also known as geodatabases) can store representations of real-world geographic phenomena for use in a GIS. These databases are special because they use additional techniques, different from tables, to store these spatial representations.
A GIS is tailored to operate on spatial data. It knows about spatial reference systems, and supports all kinds of analyses that are inherently geographic in nature, such as distance and area computations and spatial interpolation.

Geographic Information and Spatial Data Types

Models and Representations

Modelling is the process of producing an abstraction of the 'real world' so that some part of it can be more easily handled.

Representing Real World Data in GIS

A GIS operates under the assumption that the relevant spatial phenomena occur in a two- or three-dimensional Euclidean space, unless otherwise specified.

Geographic Phenomena

Euclidean space can be informally defined as a model of space in which locations are represented by coordinates - (x, y) in 2D; (x, y, z) in 3D - and distance and direction can be defined with geometric formulas. In the 2D case, this is known as the Euclidean plane, which is the most common Euclidean space in GIS use.
We might define a geographic phenomenon as a manifestation of an entity or process of interest that:
- Can be named or described
- Can be georeferenced
- Can be assigned a time

A (geographic) field is a geographic phenomenon for which, for every point in the study area, a value can be determined.

- Examples:
  - Air temperature
  - Barometric pressure
  - Elevation
- Fields can be discrete or continuous.
  - In a continuous field, the underlying function is assumed to be mathematically smooth, meaning that the field values along any path through the study area do not change abruptly, but only gradually. A continuous field can even be differentiable, meaning we can determine a measure of change per unit of distance anywhere in any direction.
    - Examples: air temperature, barometric pressure, elevation
    - These fields are usually stored as floating-point values.
  - Discrete fields divide the study space into mutually exclusive, bounded parts, with all locations in one part having the same field value.
    - Examples: land classifications, geological classes, soil type, land use type
    - These fields usually store cell values of type 'integer'. They can be easily converted to polygons.

(Geographic) objects populate the study area, and are usually well distinguished, discrete, and bounded entities. The space between them is potentially empty or undetermined.

Data Values which we can represent with our phenomena:
- Nominal data values - values that provide a name or identifier so that we can discriminate between different values, but that is about all we can do.
- Ordinal data values - data values that can be put in some natural sequence but that do not allow any other type of computation.
- Interval data values - quantitative, in that they allow simple forms of computation like addition and subtraction
- Ratio data values - allow most, if not all, of arithmetic computations.
When a geographic phenomenon is not present everywhere in the study area, but somehow sparsely populates it, we look at it as a collection of geographic objects. Such objects are easily distinguished and named, and their position in space is determined by a combination of one or more of the following parameters:
- Location
- Shape
- Size
- Orientation
Shape is usually important because one of its factors is dimension. This relates to whether an object is perceived as a point feature, or a linear, area, or volume feature.
We often look at collections of geographic objects viewed as a unit.
A crisp boundary is one that can be determined with almost arbitrary precision. Fuzzy boundaries are boundaries that are not a precise line, but rather areas of transition.
- Crisp boundaries are more common in man-made phenomena, whereas fuzzy boundaries are more common with natural phenomena.

Computer Representations of Geographic Information

When storing geoinformation, we store a finite, but intelligently chosen set of (sample) locations with their elevation. We can use interpolation functions that allow us to infer a reasonable elevation value for locations that are not stored.
Spatial autocorrelation - fundamental principle which refers to the fact that locations that are closer together are more likely to have similar values than locations that are far apart (commonly referred to as 'Tobler's first law of Geography').
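
The interpolation idea above can be sketched with inverse distance weighting (IDW), which leans directly on spatial autocorrelation by giving nearby samples more weight. This is only one possible interpolation function, chosen here as an assumption; the notes do not say which function the textbook has in mind, and the sample values and the name idw_estimate are made up for illustration.

```python
import math


def idw_estimate(samples, x, y, power=2.0):
    """Estimate a field value at (x, y) from (sx, sy, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample point: use the measurement
        weight = 1.0 / distance ** power  # closer samples weigh more
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical elevation samples: (x, y, elevation in metres).
samples = [(0.0, 0.0, 100.0), (10.0, 0.0, 120.0), (0.0, 10.0, 110.0)]
```

Calling idw_estimate(samples, 5.0, 0.0) gives an elevation between the stored values, pulled toward the two nearest samples.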
Fields are usually implemented with a tessellation approach, and objects with a (topological) vector approach.
A tessellation is a partitioning of space into mutually exclusive cells that together make up the complete study space. With each cell, some (thematic) value is associated to characterize that part of space.
In a regular tessellation, the cells are the same shape and size, and the field attribute value assigned to a cell is associated with the entire area occupied by the cell. The square tessellation is by far the most commonly used, mainly because georeferencing a cell is so straightforward. These tessellations are known under various names in different GIS packages, but most frequently as rasters.

Raster Example

A raster is a set of regularly spaced (and contiguous) cells with associated (field) values. The associated values represent cell values, not point values. This means that the value for a cell is assumed to be valid for all locations within the cell.

The size of the area that a single raster cell represents is called the raster's resolution. Sometimes, the word grid is also used, but strictly speaking, a grid refers to values at the intersections of a network of regularly spaced horizontal and perpendicular lines.

Grid vs Cells Raster

The field value of a cell can be interpreted as one value for the complete tessellation cell, in which case the field is discrete, not continuous or even differentiable. Some convention is needed to state which value prevails on cell boundaries; with square cells, this convention often says that lower and left boundaries belong to the cell. To improve on this continuity issue, we can do two things:

1. Make the cell size smaller, so as to make the continuity gaps between the cells smaller, and/or
2. Assume that a cell value only represents the field value (for example, elevation) for one specific location in the cell, and provide a good interpolation function with the desired continuity characteristic for all other locations.
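
The boundary convention mentioned above (lower and left boundaries belong to the cell) can be sketched as a point-to-cell lookup. This assumes the raster's georeference is its lower-left corner and that rows are counted upward from it; many real raster formats use an upper-left origin instead, so treat the layout and the name cell_for_point as illustrative only.

```python
import math


def cell_for_point(x, y, origin_x, origin_y, cell_size, n_rows, n_cols):
    """Return (row, col) of the cell containing (x, y), or None if outside.

    Assumes the origin is the lower-left corner and rows count upward.
    """
    # floor() puts a point on a shared edge into the cell whose lower or
    # left boundary it lies on, which matches the convention above.
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return row, col
    return None
```

With this convention the upper and right edges of the whole raster fall outside it, which is why a point exactly on those edges returns None.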

An important advantage of regular tessellations is that we know how they partition space, and we can make our computations specific to this partitioning. This leads to fast algorithms.
An obvious disadvantage is that they are not adaptive to the spatial phenomenon we want to represent.
Regular tessellations might not represent data in the most efficient way. For this reason, substantial research effort has also been put into irregular tessellations.
- These are partitions of space into mutually disjoint cells, but now the cells may vary in size and shape, allowing them to adapt to the spatial phenomena that they represent. We discuss here only the region quadtree but there are many more structures that have been proposed.
- Irregular tessellations are more adaptive, which leads to a reduction in the amount of memory used to store the data.
- The region quadtree is based on the regular tessellation of square cells, but takes advantage of cases where neighboring cells have the same field value, so that they can together be represented as one bigger cell.

Region Quadtree Raster

Quadtrees are adaptive because they apply the spatial autocorrelation principle - the locations that are near in space are likely to have similar field values.
Quadtree provides a nested tessellation: quadrants are only split if they have two or more values.
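
A minimal sketch of the splitting rule above, assuming a square raster whose side length is a power of two and whose cell values are plain numbers. The nested-list representation and the function names are my own choices for illustration, not something the textbook prescribes.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value, or a list of four subtrees [NW, NE, SW, SE]."""
    if size is None:
        size = len(grid)  # assumes a square grid, side a power of two
    first = grid[row][col]
    uniform = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first  # one value in the quadrant: no need to split
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),                # NW
        build_quadtree(grid, row, col + half, half),         # NE
        build_quadtree(grid, row + half, col, half),         # SW
        build_quadtree(grid, row + half, col + half, half),  # SE
    ]


def count_leaves(node):
    """Number of cells actually stored in the quadtree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1


# Hypothetical 4 x 4 raster with large uniform areas.
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 3, 3],
    [1, 1, 3, 4],
]
```

For this grid only the lower-right quadrant needs splitting, so the tree stores 7 leaves instead of 16 raster cells.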

Tessellations do not explicitly store georeferences of the phenomena they represent. Instead, they provide a georeference of some point in the raster plus an indication of the raster's resolution. In vector representations, an attempt is made to explicitly associate georeferences with the geographic phenomena.

Vector Representation

A commonly used data structure in GIS software is the triangulated irregular network, or TIN. It is one of the structured implementation techniques for digital terrain models, but it can be used to represent any continuous field.
- A TIN is built from a set of 3D points. These 3D points can be used to construct an irregular tessellation made of triangles. In 3D space, three points uniquely determine a plane, as long as they are not collinear. A plane fitted through these points has a fixed aspect and gradient, and can be used to compute an approximation of the elevation of other locations.

Vector Triangulation

If we restrict the use of a plane to the area between its three anchor points, we obtain a triangular tessellation of the complete study space.
Two important properties of the (Delaunay) triangulation used for a TIN:
- The first is that triangles are as equilateral as they can be, given a set of anchor points.
- The second property is that for each triangle, the circumcircle through its three anchor points does not contain any other anchor point.
A TIN is a vector representation; the cells do not have an associated stored value, but rather a simple interpolation function that uses the elevation values of its three anchor points.
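
The "simple interpolation function" of a TIN triangle can be sketched with barycentric weights, which is equivalent to evaluating the plane through the three anchor points. This is a sketch under my own assumptions (function name, tuple layout); the textbook may formulate the plane differently.

```python
def interpolate_in_triangle(p1, p2, p3, x, y):
    """Elevation at (x, y) on the plane through three (x, y, z) anchors.

    Returns None when (x, y) lies outside the triangle.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("anchor points are collinear")
    # Barycentric weights of the query location w.r.t. the three anchors.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # the plane is only used inside its own triangle
    return w1 * z1 + w2 * z2 + w3 * z3
```

Returning None outside the triangle mirrors the restriction of each plane to the area between its three anchor points.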
Points are either 2D or 3D and are used to represent objects that are best described as shape- and size-less, zero-dimensional features. Extra data stored for each point object is usually called attribute or thematic data and can capture anything that is considered relevant about the object.
Two end nodes and zero or more internal nodes or vertices define a line. Other terms for line are arc and edge. A node or vertex is like a point.
The straight parts of a line between two consecutive vertices or end nodes are called line segments. Many GISs store a line as a simple sequence of coordinates of its end nodes and vertices, assuming that all its segments are straight.
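
Storing a line as a plain coordinate sequence makes simple measurements easy. A minimal sketch, assuming 2D planar coordinates and straight segments; the function name is hypothetical.

```python
import math


def line_length(vertices):
    """Length of a line stored as an ordered list of (x, y) coordinates.

    The first and last entries are the end nodes; the rest are vertices.
    """
    # Each consecutive pair of coordinates is one straight line segment.
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))
```

For example, line_length([(0, 0), (3, 4), (3, 10)]) adds a segment of length 5 and a segment of length 6.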

Lines

Collections of (connected) lines may represent phenomena that are best viewed as networks.
When area objects are stored using a vector approach, the usual technique is to apply a boundary model. This means that each area feature is represented by some arc / node structure that determines a polygon as the area's boundary.

Area Representation

The boundary model is an area representation that stores parts of a polygon's boundary as non-looping arcs and indicates which polygon is on the left and which is on the right of each arc.
- This model improves on the data-redundancy and time issues that are present in other models.
- The boundary model is also called the topological data model as it captures some topological information, such as polygon neighborhood.
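
A rough sketch of the boundary model as a data structure: each arc records its end nodes and the polygons on its left and right, and polygon neighborhood then falls out of the arc table without any coordinate computation. All identifiers and the tuple layout are hypothetical, and coordinates are left out on purpose.

```python
from collections import defaultdict

# Each arc: (arc_id, start_node, end_node, left_polygon, right_polygon).
# None stands for the area outside all polygons.
arcs = [
    ("a1", "n1", "n2", "A", None),
    ("a2", "n2", "n1", "A", "B"),  # shared boundary, stored only once
    ("a3", "n1", "n2", "B", None),
]


def polygon_neighbours(arcs):
    """Map each polygon to the set of polygons it shares an arc with."""
    neighbours = defaultdict(set)
    for _arc_id, _start, _end, left, right in arcs:
        if left is not None and right is not None and left != right:
            neighbours[left].add(right)
            neighbours[right].add(left)
    return dict(neighbours)


def boundary_arcs(arcs, polygon):
    """Arc ids that make up the boundary of one polygon."""
    return [arc_id for arc_id, _s, _e, left, right in arcs
            if polygon in (left, right)]
```

Because the shared arc a2 is stored once with both of its polygons, the boundary between A and B is not duplicated.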

Boundary Model

Topology and Spatial Relationships

General Spatial Topology

Topology deals with spatial properties that do not change under certain transformations.

Topology refers to the spatial relationships between geographical elements in a data set that do not change under a continuous transformation.

Topological relationships are built from simple elements into more complex elements: nodes define line segments, and line segments connect to define lines, which in turn define polygons. The fundamental issues relating to order, connectivity and adjacency of geographical elements form the basis of more sophisticated GIS analysis. These relationships (called topological properties) are invariant under a continuous transformation, referred to as a topological mapping.

Topological Relationships

The mathematical properties of the geometric space used for spatial data can be described as follows:
- The space is a three-dimensional Euclidean space where for every point we can determine its three-dimensional coordinates as a triple of real numbers.
- The space is a metric space - we can always compute the distance between two points according to a given distance function.
- The space is a topological space - for every point in the space we can find a neighborhood around it that fully belongs to that space as well.
- Interior and boundary are properties of spatial features that remain invariant under topological mapping.

Simplex

Within the topological space, we can define features that are easy to handle and that can be used as representatives of geographic objects. These features are called simplices as they are the simplest geometric shapes of some dimension. Point = 0-simplex, line segment = 1-simplex, triangle = 2-simplex, tetrahedron = 3-simplex
When we combine various simplices into a single feature, we obtain a simplicial complex.

The Topology of Two Dimensions

The interior of a region R is the largest set of points of R for which we can construct a disk-like environment around it that also falls completely inside R. The boundary of R is the set of those points belonging to R that do not belong to the interior of R.

Spatial relationships Between Two Regions

The relationships above can be used in queries against a spatial database, and represent the building blocks of more complex spatial queries.

The five rules of topological consistency in two-dimensional space:

1. Every 1-simplex ('arc') must be bounded by two 0-simplices ('nodes', namely its begin and end node).
2. Every 1-simplex borders two 2-simplices ('polygons', namely its 'left' and 'right' polygons).
3. Every 2-simplex has a closed boundary consisting of an alternating (and cyclic) sequence of 0- and 1-simplices.
4. Around every 0-simplex exists an alternating (and cyclic) sequence of 1- and 2-simplices.
5. 1-simplices only intersect at their (bounding) nodes.

Rules for Topological Consistency 2-D

The Three Dimensional Case

The history of spatial data handling is almost purely 2D, and this remains the case for the majority of GIS applications.

An important class of solids in 3D GIS is formed by the polyhedra, which are the solids limited by planar facets. A facet is a polygon-shaped, flat side that is part of the boundary of a polyhedron.

Scale and resolution

Map scale can be defined as the ratio between the distance on a paper map and the distance on the same stretch in the terrain.
When applied to spatial data, the term resolution is commonly associated with the cell width of the tessellation applied.
Digital spatial data, as stored in GIS, is essentially without scale.
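
The map scale ratio can be turned into a one-line conversion. A small sketch, assuming the map distance is measured in centimetres and the scale is written as 1 : scale_denominator; the function name is made up.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Ground distance in metres for a distance measured on a paper map."""
    # 1 cm on a 1:50,000 map is 50,000 cm (500 m) in the terrain.
    return map_distance_cm * scale_denominator / 100.0
```

For instance, 4 cm on a 1:50,000 map corresponds to ground_distance_m(4, 50000) metres in the terrain, i.e. 2 km.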

Representations of Geographic Fields

A geographic field can be represented through a tessellation, through a TIN or through a vector representation.
It is more common to use tessellations, notably rasters, for field representation, but vector representations are in use too.

Raster Representation of a Field

Raster Representation Field

A raster can be thought of as a long list of field values: for a raster of m rows and n columns, there should be m × n such values. The list is preceded by some extra information, like a single georeference as the origin of the whole raster, a cell size indicator, the integer values for m and n, and a data type indicator for interpreting cell values.
A TIN is a much sparser data structure: the amount of data stored is less if we try to obtain a structure with approximately equal interpolation error, as compared to a regular raster.
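
The raster layout described above (a small header followed by one long list of values) might look like this in code. The field names and the row-major ordering are assumptions for illustration; actual raster formats differ in both.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Raster:
    origin_x: float      # single georeference for the whole raster
    origin_y: float
    cell_size: float     # resolution
    n_rows: int
    n_cols: int
    dtype: str           # how to interpret the cell values
    values: List[float]  # flat list, assumed to be stored row by row

    def __post_init__(self):
        if len(self.values) != self.n_rows * self.n_cols:
            raise ValueError("expected n_rows * n_cols values")

    def value_at(self, row, col):
        """Look up a cell value in the flat list."""
        return self.values[row * self.n_cols + col]
```

No per-cell coordinates are stored; the position of any cell follows from the origin, the cell size, and its row and column.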

Vector Representation of a Field

An isoline is a linear feature that connects the points with equal field value. When the field is elevation, we also speak of contour lines.

Vector representation of a Field

Isolines as a representation mechanism are not very common, however. They are in use as a geoinformation visualization technique (in mapping, for instance), but commonly using a TIN for representing this type of field is the better choice.

Representation of Geographic Objects

The representation of geographic objects is most naturally supported with vectors. However, tessellations are still commonly used for representing geographic objects as well.

Tessellations to Represent Geographic Objects

Remotely sensed images are an important data source for GIS applications. Various techniques exist to process digital images into classified images that can be stored in a GIS as a raster. Image classification attempts to characterize each pixel into one of a finite list of classes thereby obtaining an interpretation of the contents of the image.

How the process of image classification takes place is not the subject of this book. It is dealt with in Principles of Remote Sensing.

Organizing and Managing Spatial Data

The main principle of data organization applied in GIS systems is that of a spatial data layer. A spatial data layer is either a representation of a continuous or discrete field, or a collection of objects of the same kind.

Spatial data layer

Usually, the data is organized so that similar elements are in a single data layer (all telephone booth objects in one layer, and all road line objects in another).
A data layer contains spatial data - of any of the types discussed above - as well as attribute (or: thematic) data, which further describes the field or objects in the layer. Attribute data is quite often arranged in tabular form, maintained by some kind of geodatabase.

The Temporal Dimension

Geographic phenomena are also dynamic; they change over time.
GISs still offer limited support for the representations of time.
Spatiotemporal data models are ways of organizing representations of space and time in a GIS. Brief examination of different concepts of time:
- Discrete and Continuous Time - discrete time is composed of discrete elements (seconds); in continuous time, no such discrete elements exist.
- Valid and Transaction Time - valid time is the time when an event really happened. Transaction time is the time when the event was stored in the database or GIS.
- Linear, Branching and Cyclic Time
- Time Granularity - the precision of time value in a GIS or database
- Absolute and Relative Time

Data Management and Processing Systems

The ability to manage and process spatial data is a critical component for any functioning GIS. Simply put, data processing systems refer to hardware and software components which are able to process, store, and transfer data. This chapter discusses the components of systems that facilitate the management and processing of geoinformation.

Section 3.1 in short: computers are becoming more powerful.

Geographic Information Systems

GIS Software

The main characteristics of a GIS software package are its analytical functions that provide means for deriving new geoinformation from existing spatial and attribute data.
The discipline of geographic information science is driven by the use of our GIS tools, and these are in turn improved by new insights and information gained through their application in various scientific fields. Spatial information theory is one such field, which focuses specifically on providing the background for the production of tools for the handling of spatial data.
Well-known, full-fledged GIS packages include ILWIS, Intergraph's GeoMedia, ESRI's ArcGIS, and MapInfo from MapInfo Corp.
A GIS consists of several functional components - components which support key GIS functions. These are data capture and preparation, data storage, data analysis, and presentation of spatial data.

Functional Components of GIS

A Spatial Data Infrastructure (SDI) is defined as the relevant base collection of technologies, policies, and institutional arrangements that facilitate the availability of and access to spatial data.

Stages of Spatial Data Handling

The functions for capturing data are closely related to the disciplines of surveying engineering, photogrammetry, remote sensing, and the process of digitizing - the conversion of analogue data into digital representations. Remote sensing, in particular, is the field that provides photographs and images as the raw base data from which spatial data sets are derived. Surveys of the study area often need to be conducted for data that cannot be obtained with remote sensing techniques, or to validate data thus obtained.

Comparison Between Raster and Vector Storage

GIS software packages provide support for both spatial and attribute data - they accommodate spatial data storage using a vector approach and attribute data using tables. Historically, database management systems (DBMSs) have been based on the notion of tables for data storage.

Spatial Query and Analysis

The most distinguishing parts of a GIS are its functions for spatial analysis, i.e. operators that use spatial data to derive new geoinformation. Spatial queries and process models play an important role in this functionality. Spatial decision support systems (SDSS) are a category of information systems composed of a database, GIS software, models, and a so-called knowledge engine, which allows users to deal specifically with location problems.
Analysis of spatial data can be defined as computing new information that provides new insight from the existing, stored spatial data.

Database Management Systems

A database is a large, computerized collection of structured data.

A database management system (DBMS) is a software package that allows the user to set up, use and maintain a database.

A query is a computer program that extracts data from the database that meet the conditions in the query.

A data model is a language that allows the definition of:
- The structures that will be used to store the base data.
- The integrity constraints that the stored data has to obey at all moments in time, and
- The computer programs used to manipulate the data.

A table or relation is itself a collection of tuples (or records). In fact, each table is a collection of tuples that are similarly shaped.

An attribute is a named field of a tuple, with which each tuple associates a value, the tuple's attribute value.

I know about DBMSs - I can skip over this section.

GIS and Spatial Databases

Spatial Database Functionality

There is additional functionality needed by DBMS in order to process and manage spatial data.
During the 1990s, object oriented and object-relational data models were developed to better represent and manage spatial data.
Currently, GIS software packages are able to store spatial data using a range of open source DBMSs, including PostgreSQL.
Spatial databases, also known as geodatabases, are implemented directly on DBMSs, using extension software to allow them to handle spatial objects.

A spatial database allows users to store, query, and manipulate collections of spatial data.

Spatial data can be stored in a special database column, known as the geometry column (or feature or shape, depending on the specific software package). This means GISs can rely fully on DBMS support for spatial data, making use of a DBMS for data query and storage, and a GIS for spatial functionality.
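
To get a feel for the geometry column idea, here is a toy sketch using Python's built-in sqlite3 module, with the geometry kept as Well-Known Text (a text encoding from the OGC Simple Features standards) in an ordinary text column. This is not how a real spatial DBMS works internally: it would use a dedicated geometry type with spatial functions and indexes. The table, column names, and data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attribute columns and the geometry column live side by side in one table.
conn.execute(
    "CREATE TABLE parcel (id INTEGER PRIMARY KEY, owner TEXT, geom TEXT)"
)
conn.executemany(
    "INSERT INTO parcel (id, owner, geom) VALUES (?, ?, ?)",
    [
        (1, "Smith", "POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))"),
        (2, "Jones", "POLYGON((10 0, 20 0, 20 10, 10 10, 10 0))"),
    ],
)


def geometry_of(conn, parcel_id):
    """Fetch the stored geometry (as WKT text) for one parcel."""
    row = conn.execute(
        "SELECT geom FROM parcel WHERE id = ?", (parcel_id,)
    ).fetchone()
    return row[0] if row else None
```

The DBMS handles storage and attribute queries here; anything spatial (areas, intersections) would still need GIS functionality on top.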

Geometry Data Stored in Spatial Database Table

The Open Geospatial Consortium (OGC) has released a series of standards relating to geodatabases that (amongst other things), define:
- Which tables must be present in a spatial database
- The data formats, called Simple Features (i.e. point, polygon, etc.)
- A set of SQL-like instructions for geographic analysis
The architecture of a spatial database differs from a standard RDBMS not only because it can handle geometry data and manage projections, but also because it offers a larger set of commands that extend the standard SQL language.
A spatial DBMS provides support for geographic coordinate systems and transformations. It also provides storage of the relationships between features, including the creation and storage of topological relationships.

Spatial Referencing and Positioning

Spatial Referencing

A frequently occurring issue is the need to combine spatial data from different sources that use different spatial reference systems.

Reference Surfaces for Mapping

The surface of the Earth is anything but uniform. The oceans can be treated as reasonably uniform, but the surface or topography of the land masses exhibits large vertical variations between mountains and valleys. These variations make it impossible to approximate the shape of the Earth with any reasonably simple mathematical model.
Two main reference surfaces have been established to approximate the shape of the Earth:
- One of the reference surfaces is called the Geoid
- The other reference surface is the ellipsoid

Earth Reference Surfaces

The Geoid

If we imagine that the Earth's surface is completely covered by water and ignore tidal and current effects, the resulting surface is affected only by gravity. This has an effect on the shape of this surface because the direction of gravity (more commonly known as the plumb line) is dependent on the mass distribution inside the Earth. Due to irregularities or mass anomalies in this distribution, the global ocean forms an undulating surface. This surface is called the Geoid.

Geoid Reference

The Geoid is used to describe heights. In order to establish the Geoid as a reference for heights, the ocean's water level is registered at coastal places over several years using tide gauges (mareographs). Averaging the registrations largely eliminates variations of the sea level with time. The resulting water level represents an approximation to the Geoid and is called the mean sea level.
The height determined with respect to a tide-gauge station is known as the orthometric height (height H above the Geoid).

Leveling Network

The Ellipsoid

The most convenient geometric reference for the description of horizontal coordinates of points of interest projected onto a mapping plane is the oblate ellipsoid.
An ellipsoid is formed when an ellipse is rotated about its minor axis. This ellipse which defines an ellipsoid or spheroid is called a meridian ellipse.
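
The shape of such an ellipsoid is fully described by its semi-major axis a and semi-minor axis b, from which the usual derived parameters follow. A short sketch; the WGS84 axis lengths are used only as familiar illustrative values and are not taken from these notes.

```python
import math


def ellipsoid_parameters(a, b):
    """Flattening f and first eccentricity e of an oblate ellipsoid.

    a is the semi-major (equatorial) axis, b the semi-minor (polar) axis.
    """
    f = (a - b) / a
    e = math.sqrt(a * a - b * b) / a
    return f, e


# Illustrative values: the WGS84 ellipsoid, in metres.
WGS84_A = 6378137.0
WGS84_B = 6356752.314245
```

For these values the flattening is roughly 1/298.257, which shows how close the ellipsoid is to a sphere.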

The Global Horizontal Datum

With increasing demands for global surveying, activities are underway to establish global reference surfaces. The motivation is to make geodetic results mutually comparable and provide coherent results to other disciplines like astronomy and geophysics.
The most important global (geocentric) spatial reference system for the GIS community is the International Terrestrial Reference System (ITRS). It is a three-dimensional coordinate system with a well-defined origin (the center of mass of the Earth) and three orthogonal coordinate axes (X, Y, Z).
- The Z-axis points towards a mean Earth north pole. The X-axis is oriented towards a mean Greenwich meridian and is orthogonal to the Z-axis. The Y-axis completes the right-handed reference coordinate system.

Coordinate Systems

Different kinds of coordinate systems are used to position data in space. Here we distinguish between spatial and planar coordinate systems. Spatial coordinate systems are used to locate data either on the Earth's surface in a 3D space, or on the Earth's reference surface in 2D space.

2D Geographic Coordinates

The most widely used global coordinate system consists of lines of geographic latitude (φ) and longitude (λ).
Lines of equal latitude are called parallels. They form circles on the surface of the ellipsoid. Lines of equal longitude are called meridians and they form ellipses on the ellipsoid.

2D Geographic Coordinates

Latitude is 0 on the equator. Longitude is 0 at the meridian of Greenwich. Latitude and longitude are always given in angular units.

3D Geographic Coordinates

3D geographic coordinates are obtained by introducing the ellipsoidal height to the system.
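
Once latitude, longitude, and ellipsoidal height are known, they can be converted to 3D geocentric (X, Y, Z) coordinates with the standard closed-form formulas. The sketch below assumes WGS84 ellipsoid parameters as defaults, which is my choice for illustration; any other ellipsoid can be passed in.

```python
import math


def geographic_to_geocentric(lat_deg, lon_deg, h,
                             a=6378137.0, f=1 / 298.257223563):
    """Convert latitude, longitude (degrees) and ellipsoidal height (m)
    to geocentric X, Y, Z in metres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = f * (2.0 - f)  # first eccentricity squared
    # Radius of curvature in the prime vertical at this latitude.
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return x, y, z
```

A point on the equator at the Greenwich meridian with zero height lands on the X-axis at a distance of one semi-major axis from the origin.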

Ellipsoidal heights

3D Geocentric Coordinates
2D Cartesian Coordinates
2D Polar Coordinates

Map Projection

We know that the Earth's surface is curved in a specific way, and we know that a map is in fact a flattened representation of some part of the planet. The field of map projections concerns itself with the ways of translating the curved surface of the Earth into a flat map.

A map projection is a mathematically described technique of how to represent the Earth's curved surface on a flat map.

To represent parts of the surface of the Earth on a flat paper map or on a computer screen, the curved horizontal reference surface must be mapped onto the 2D mapping plane. The reference surface for large-scale mapping is usually an oblate ellipsoid, and for small-scale mapping, a sphere.

Map Projection

A forward mapping equation transforms the geographic coordinates of a point on the curved reference surface to a set of planar Cartesian coordinates representing the position of the same point on a map plane. The corresponding inverse mapping equation transforms mathematically the planar Cartesian coordinates of a point on the map plane to a set of geographic coordinates on the curved reference surface.
An example is the mapping equations used for the Mercator projection (spherical assumption):
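
For reference, here is a minimal sketch of the standard spherical Mercator equations. The forward equations are x = R(λ - λ0) and y = R ln tan(π/4 + φ/2). The inverse equations are φ = 2 arctan(e^(y/R)) - π/2 and λ = x/R + λ0. The radius value, the function names, and the central-meridian parameter are my own assumptions for illustration, not taken from the textbook.

```python
import math

# Assumption: a mean Earth radius in metres for the spherical model.
EARTH_RADIUS_M = 6371000.0


def mercator_forward(lat_deg, lon_deg, lon0_deg=0.0, radius=EARTH_RADIUS_M):
    """Forward mapping: geographic (lat, lon) in degrees -> planar (x, y) in metres."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg - lon0_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4.0 + phi / 2.0))
    return x, y


def mercator_inverse(x, y, lon0_deg=0.0, radius=EARTH_RADIUS_M):
    """Inverse mapping: planar (x, y) in metres -> geographic (lat, lon) in degrees."""
    phi = 2.0 * math.atan(math.exp(y / radius)) - math.pi / 2.0
    lam = x / radius
    return math.degrees(phi), math.degrees(lam) + lon0_deg


# Round trip: project a point and map it back to geographic coordinates.
x, y = mercator_forward(52.0, 5.0)
lat, lon = mercator_inverse(x, y)
```

The forward function is undefined at the poles (latitude of +90 or -90 degrees), where y grows without bound. This is one concrete way to see the scale distortion of the Mercator projection.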

All map projections have scale distortions. There is no way to flatten out a piece of ellipsoidal or spherical surface without stretching some parts of the surface more than others.

Some map projections can be visualized as true geometric projections. The reference surface is projected either directly onto the mapping plane, in which case we call it an azimuthal projection, or onto an intermediate surface (a cone or a cylinder), which is then rolled out onto the mapping plane. The latter map projections are called conical and cylindrical, respectively.

Classes of Map Projections

The distortion properties of a map are typically classified according to what is not distorted on the map:
- In a conformal map projection, the angles between lines in the map are identical to the angles between the original lines on the curved reference surface.
- In an equal-area map projection, the areas in the map are identical to the areas on the curved reference surface.
- In an equidistant map projection, the lengths of particular lines in the map are the same as the lengths of the original lines on the curved reference surface.

Note:
I am going to skip to the sections that will help me understand PostGIS better now. I may come back to learn more about this in the future, but understanding PostGIS is why I am reading this textbook in the first place, and the information here is just informative generally - not for my immediate needs. I'm on page 238.

Summary

Each projection and datum has particular characteristics that make it useful for specific mapping purposes. A projection is chosen to minimize errors for the area being mapped, to suit the scale of the mapping project, and to provide the required distortion property, which in turn depends on the purpose for which the map will be used. We need to be aware of issues brought about by combining spatial data from different sources that use different reference systems. This issue is becoming increasingly important as more and more data is shared. Often, transformations are necessary to enable the combination of disparate data layers.

Data Entry and Preparation

Data which is captured directly from the environment is known as primary data.
An image refers to raw data produced by an electronic sensor. These data are not pictorial; they are arrays of digital numbers related to some property of an object or scene, such as the amount of reflected light.
Any data which is not captured directly from the environment is known as secondary data.

The process of distilling points, lines and polygons from a scanned image is known as vectorization.

Vectorization

Metadata is defined as background information that describes all necessary information about the data itself. More generally, it is known as 'data about data'. This includes:
- Identification Information: Data source(s), time of acquisition, etc.
- Data Quality Information: Positional, attribute and temporal accuracy, lineage, etc.
- Entity and Attribute Information: Related attributes, units of measure, etc.
Precision is the smallest unit of measurement to which data can be recorded.
Lineage describes the history of a dataset.

Cleaning Up Data:

Cleaning Up Data

Vectorization produces a vector data set from a raster.
Merging of data sets:

Merging data sets

Spatial Data Analysis

Classification of analytical functions of GIS:

Classification, retrieval, and measurement functions
1. All functions in this category are performed on a single (vector or raster) data layer, often using the associated attribute data.
Overlay functions
1. They allow the combination of two or more spatial data layers, comparing them position by position and treating areas of overlap, and of non-overlap, in distinct ways. Many GISs support overlays through an algebraic language, expressing an overlay function as a formula in which the data layers are the arguments.
Neighborhood functions
1. Where overlays combine features at the same location, neighborhood functions evaluate the characteristics of an area surrounding a feature's location. A neighborhood function 'scans' the neighborhood of the given features and performs computation on it.
Connectivity functions
1. These functions work on the basis of networks, including road networks, water courses in coastal zones, and communication lines in mobile telephony. These networks represent spatial linkages between features.
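
To make the difference between overlay and neighborhood functions concrete for myself, here is a small sketch on toy rasters stored as nested lists. The layer names, the cell values, and the helper functions `overlay` and `focal_mean` are all hypothetical and only illustrate the idea. A real GIS offers far richer operators.

```python
def overlay(layer_a, layer_b, op):
    """Combine two equally sized rasters cell by cell using the formula `op`."""
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


def focal_mean(raster, row, col):
    """Neighborhood function: mean of the 3x3 window around (row, col),
    clipped at the raster edges."""
    rows = range(max(0, row - 1), min(len(raster), row + 2))
    cols = range(max(0, col - 1), min(len(raster[0]), col + 2))
    values = [raster[r][c] for r in rows for c in cols]
    return sum(values) / len(values)


# Hypothetical binary layers: 1 means the property holds in that cell.
forest = [[1, 1, 0],
          [0, 1, 0],
          [0, 0, 0]]
steep = [[0, 1, 1],
         [0, 1, 0],
         [1, 0, 0]]

# Overlay expressed as a formula with the layers as arguments: forest * steep.
forest_and_steep = overlay(forest, steep, lambda f, s: f * s)
# forest_and_steep == [[0, 1, 0], [0, 1, 0], [0, 0, 0]]

# Neighborhood: share of forest cells around the center cell.
center_share = focal_mean(forest, 1, 1)
```

The overlay only ever looks at the same cell position in both layers, while the neighborhood function looks at the cells surrounding one position within a single layer.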

Minimal Bounding Box

Data Visualization

A map is a representation or abstraction of geographic reality: a tool for presenting geographic information in a way that is visual, digital, or tactile. Maps can also be used as an input for a GIS.
Maps are divided into topographical maps, which visualize Earth's surface as accurately as possible, and thematic maps, which represent the distribution of particular themes (e.g., socioeconomic).

Topographical Maps

Thematic Maps

Different Ways of Visualizing Stuff on Map
