PostgreSQL PL/pgSQL
I need to review the Postgres PL/pgSQL SQL procedural language because it is what I use to write Postgres functions.
References
Notes
Overview
PL/pgSQL is a loadable procedural language for the PostgreSQL database system.
Goals of the language:
- can be used to create functions, procedures, and triggers,
- adds control structures to the SQL language
- can perform complex computations
- inherits all user-defined types, functions, procedures, and operators
- can be defined to be trusted by the server
- easy to use
Advantages
With PL/pgSQL you can group a block of computation and a series of queries inside the database server, thus having the power of a procedural language and the ease of use of SQL, but with considerable savings of client/server communication overhead.
- Round trips between client and server are eliminated
- Intermediate results that the client does not need do not have to be marshaled or transferred between server and client
- Multiple rounds of query parsing can be avoided
This can result in a considerable performance increase as compared to an application that does not use stored functions.
Supported Argument and Result Data Types
Functions written in PL/pgSQL can accept as arguments any scalar or array data type supported by the server, and they can return the result of any of these types. They can also accept or return any composite type (row type) specified by name. It is also possible to declare a PL/pgSQL function as accepting record
, which means that any composite type will do as input, or as returning record
, which means that the result is a row type whose columns are determined by the specification in the calling query.
PL/pgSQL function can be declared to return a "set" (or table) of any data type that can be returned as a single instance. Such a function generates its output by executing RETURN NEXT
for each desired element of the result set, or by using RETURN QUERY
to output the result of evaluating a query.
Structure of PL/pgSQL
Functions written in PL/pgSQL are defined to the server by executing CREATE FUNCTION
commands. Such a command would normally look like:
CREATE FUNCTION somefunc(integer,text) RETURNS integer
AS 'function body trext'
LANGUAGE plpgsql;
The function body is simply a string literal s far as CREATE FUNCTION
is concerned. It is often helpful to use dollar quoting to write the function body, rather than the normal single quote syntax.
PL/pgSQL is a block-structured language. The complete text of a function body must be a block. A block is defined as:
[ <<label>> ]
[ DECLARE
declarations ]
BEGIN
statements
END [ label ];
Each declaration and each statement within a block is terminated by a semicolon. A block that appears within another block must have a semicolon after END
, as shown above, however the final END
that concludes a function body does not require a semicolon.
A label
is only needed if you want to identify the bock for use in an EXIT
statement, or to qualify the names of the variables declared in the block. If a label is given after END
, it must match the label at the block's beginning.
All keywords are case-sensitive. Identifiers are implicitly converted to lower case unless double-quoted, just as they are in ordinary SQL commands. Comments work the same way in PL/pgSQL code as in ordinary SQL.
Any statement in the statement section of a block can be a subblock. Subblocks can be used for logical grouping or to localize variables to a small group of statements. Variables declared in a subblock mask any similarly-named variables of outer blocks created for the duration of the subblock; but you can access the outer variables anyway if you qualify their names with their block's label.
CREATE FUNCTION somefunc() RETURNS integer AS $$
<< outerblock >>
DECLARE
quantity integer := 30;
BEGIN
RAISE NOTICE 'Quantity here is %', quantity; -- Prints 30
quantity := 50;
--
-- Create a subblock
--
DECLARE
quantity integer := 80;
BEGIN
RAISE NOTICE 'Quantity here is %', quantity; -- Prints 80
RAISE NOTICE 'Outer quantity here is %', outerblock.quantity; -- Prints 50
END;
RAISE NOTICE 'Quantity here is %', quantity; -- Prints 50
RETURN quantity;
END;
$$ LANGUAGE plpgsql;
It is important to not confuse the use of BEGIN
/END
for grouping statements in PL/pgSQL with the similarly named SQL commands for transaction control. PL/pgSQL's BEGIN
/END
are only for grouping; they no not start or end a transaction. A block containing an EXCEPTION
clause effectively forms a sub-transaction that can be rolled back without affecting the outer transaction.
Declarations
All variables used in block must be declared in the declarations section of the block. The only exceptions are that the loop variable for a FOR
loop iterating over a range of integer values is automatically declared as an integer variable.
PL/pgSQL variables can have any SQL data type, such as integer
, varchar
, and char
.
user_id integer;
quantity numeric(5);
url varchar;
myrow tablename%ROWTYPE;
myfield tablename.columnname%TYPE;
arow RECORD;
quantity integer DEFAULT 32;
url varchar := 'http://mysite.com';
transaction_time CONSTANT timestamp with time zone := now();
Once declared, a variable's value can be used in later initialized expressions in the same block:
DECLARE
x integer := 1;
y integer := x + 1;
The general syntax of a variable declaration is:
name [ CONSTANT ] type [ COLLATE collation_name ] [ NOT NULL ] [ { DEFAULT | := | = } expression ];
The DEFAULT
clause, if given, specifies the initial value assigned to the variable when the block is entered. If the DEFAULT
clause if not given then the variable is initialized to the SQL null value. The CONSTANT
option prevents the variable from being assigned to after initialized, so that its value will remain constant for the duration of the block. The COLLATE
option specifies a collation to use for the variable. If NOT NULL
is specified, an assignment of a null value results in a runtime error. A variable's default value is evaluated and assigned to the variable each time the block is entered (not just once per function call),
Declaring Function Parameters
Variables passed to functions are named with the identifiers $1
, $2
, etc. Optionally, aliases can be declared for $n
parameter names for increased readability. Either the parameter of the numeric identifier can then be used to refer to the parameter value.
Two ways to create an alias. The preferred way is to give a name to the parameter in the CREATE FUNCTION
comamnd:
CREATE FUNCTION sales_tax(subtotal real) RETURNS real AS $$
BEGIN
RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;
The other way is to explicitly declare an alias:
CREATE FUNCTION sales_tax(real) RETURNS real AS $$
DECLARE
subtotal ALIAS FOR $1;
BEGIN
RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;
When a PL/pgSQL function is declared with output parameters, the output parameters are given $n
names and optional aliases in just the same way as the normal input parameters. An output parameter is effectively a variable that starts out as NULL
; it should be assigned to during the execution of the function. The final value is what is returned.
CREATE FUNCTION sales_tax(subtotal real, OUT tax real) AS $$
BEGIN
tax := subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION sum_n_product(x int, y int, OUT sum int, OUT prod int) AS $$
BEGIN
sum := x + y;
prod := x * y;
END;
$$ LANGUAGE plpgsql;
In a call to procedure, all the parameters must be specified. For output parameters, NULL
may be specified when calling the procedure from plain SQL:
CALL sum_n_product(2, 4, NULL, NULL);
sum | prod
-----+------
6 | 8
Another way to declare a PL/pgSQL function is with RETURNS TABLE
:
CREATE FUNCTION extended_sales(p_itemno int)
RETURNS TABLE(quantity int, total numeric) AS $$
BEGIN
RETURN QUERY SELECT s.quantity, s.quantity * s.price FROM sales AS s
WHERE s.itemno = p_itemno;
END;
$$ LANGUAGE plpgsql;
Polymorphic Types:
CREATE FUNCTION add_three_values(v1 anyelement, v2 anyelement, v3 anyelement,
OUT sum anyelement)
AS $$
BEGIN
sum := v1 + v2 + v3;
END;
$$ LANGUAGE plpgsql;
Alias
newname ALIAS FOR oldname;
The ALIAS
syntax is more general than is suggested in the previous section: you can declare an alias for any variable, not just function parameters. The main practical use for this is to assign a different name for variables with predetermined names.
Copying Types
name table.column%TYPE
name variable%TYPE
-- Example, declaring variable with the same data type as users.user_id
user_id users.user_id%TYPE
-- Creating an array declaration with the referenced type:
user_ids users.user_id%TYPE[];
%TYPE
provides the data type of a table column or a previously-declared PL/pgSQL variable. You can use this to declare variables that will hold database values.
Just as when declaring table columns that are arrays, it doesn't matter whether you write multiple bracket pairs or specific array dimensions: PostgreSQL treats all arrays of a given element type as the same type, regardless of dimensionality.
Row Types
name table_name%ROWTYPE;
name composite_type_name;
A variable of a composite type is called a row variable (or row-type variable). Such a variable can hold a whole row of SELECT
or FOR
query result, as long as the query's column set matches the declared type of the variable. The individual fields of the row value are accessed using the usual dot notation, rowvar.field
.
Parameters to a function can be composite types (complete table rows).
Record Types
name RECORD;
Record variables are similar to row-type variables, but they have no predefined structure. They take on the actual row structure of the row they are assigned during a SELECT
or FOR
command.
Expressions
All expressions used in PL/pgSQL statements are processed using the server's main SQL executor. If you write a PL/pgSQL statement like:
IF expression THEN ...
PL/pgSQL will evaluate the expression by feeding a query like:
SELECT expression
Basic Statements
Assignment
An assignment of a value to a PL/pgSQL variable is written as:
variable { := | = } expression;
-- Examples
tax := subtotal * 0.06;
my_record.user_id := 20;
my_array[j] := 20;
my_array[1:3] := array[1,2,3];
complex_array[n].realpart = 12.3;
The expression in such a statement is evaluated by means of an SQL SELECT
command sent to the main database engine. The expression must yield a single value (possibly a row value, if the variable is a row or record variable). The target variable can be a single variable (optionally qualified with a block name), a field of a row or record target, or an element or slice of an array target.
If the expression's result data type doesn't match the variable's data type, the value will be coerced as through by an assignment cast.
Executing SQL Commands
Any SQL command that does not return rows can be executed within a PL/pgSQL function just by writing the command. If the command does return rows (for example SELECT
, or INSERT
/UPDATE
/DELETE
/MERGE
with RETURNING
), there are two ways to proceed.
- When the command will return at most one row, or you only care about the first row of the output, write the command as usual but add an
INTO
clause to capture the output. - To process all of the output rows, write the command as the data source for a
FOR
loop.
Typically you'll want to use varying data values or even different tables at different times. There are two ways to proceed depending on the situation:
- PL/pgSQL variable vales can be automatically inserted into optimizable SQL commands, which are
SELECT
,INSERT
,UPDATE
,DELETE
, `MERGE
, and certain utility commands that incorporate one of these, such asEXPLAIN
andCREATE TABLE ... AS SELECT
. In these commands, any PL/pgSQL variable name appearing in the command text is replaced by a query parameter, and then the current value of the variable is provided as the parameter at run time. - Non-optimizable SQL commands (also called utility commands) are not capable of accepting query parameters. So automatic substitution of PL/pgSQL variables does not work in such commands. To include non-constant text in a utility command executed from PL/pgSQL, you must build the utility command as a string and then
EXECUTE
it.
Sometimes it is useful to evaluate an expression or SELECT
query but discard the result, for example when calling a function that has side-effects but no useful result value. To do this in PL/pgSQL, use the PERFORM
statement.
PERFORM query;
This executes the query
and discards the result. Write the query
the same way you would write an SQL SELECT
command, but replace the initial keyword SELECT
with PERFORM
. For WITH
queries, use PERFORM
and then place the query in parentheses.
Executing a Command with a Single-Row Result
The result of an SQL command yielding a single row (possibly of multiple columns) can be assigned to a record variable, row-type variable, or list of scalar variables. This is done by writing the base SQL command and adding a INTO
clause.
SELECT select_expressions INTO [STRICT] target FROM ...;
INSERT ... RETURNING expressions INTO [STRICT] target;
UPDATE ... RETURNING expressions INTO [STRICT] target;
DELETE ... RETURNING expressions INTO [STRICT] target;
MERGE ... RETURNING expressions INTO [STRICT] target
where target can be a record variable, a row variable, or a comma-separated list of simple variables and record/row fields. PL/pgSQL variables will be substituted into the rest of the command (that is, everything but the INTO
clause) just as described above, and the plan is cached in the same way. Except for the INTO
clause, the SQL command is the same as it would be written outside of PG/pgSQL.
If STRICT
is not specified in the INTO
clause, then target
will be set to the first row returned by the command, or to nulls if the command returned no rows. If the STRICT
option is specified, the command must return exactly one row or a run-time error will be reported, either NO_DATA_FOUND
(no rows) or TOO_MANY_ROWS
(more than one row).
BEGIN
SELECT * INTO STRICT myrec FROM emp WHERE empname = myname;
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE EXCEPTION 'employee % not found', myname;
WHEN TOO_MANY_ROWS THEN
RAISE EXCEPTION 'employee % not unique', myname;
END;
Successful execution of a command with STRICT
always sets FOUND
to true
.
For INSERT/UPDATE/DELETE/MERGE
with RETURNING
, PL/pgSQL reports an error for more than one returned row, even when STRICT
is not specified. This is because there is no option such as ORDER BY
with which to determine which affected row should be returned.
Executing Dynamic Commands
To handle problems where you want to generate dynamic commands inside your PL/pgSQL functions, the EXECUTE
statement is provided:
EXECUTE command-string [ INTO [STRICT] target ] [ USING expression [, ... ] ];
where command-string is an expression yielding a string (of type text
) containing the command to be executed. The optional target is a record variable, a row variable, or a comma-separated list of simple variables and record/row fields, into which the results of the command will be stored.
Obtaining the Result Status
There are several ways to determine the effect of a command. The first method is to use the GET DIAGNOSTICS
command, which has the form:
GET [ CURRENT ] DIAGNOSTICS variable { = | := } item [ , ... ];
This command allows retrieval of system status indicators. CURRENT
is a noise word. Each item
is a key word identifying a status value to be assigned to the specific variable
. The current available status items are shown below. Colon-equal can be used instead of the SQL-standard = token.
GET DIAGNOSTICS integer_var = ROW_COUNT;
Name | Type | Description |
---|---|---|
|
| the number of rows processed by the most recent SQL command |
|
| line(s) of text describing the current call stack |
|
| OID of the current function |
The second method to determine the effects of a commands to check the special variable named FOUND
, which is the type of boolean
. FOUND
starts out false within each PL/pgSQL function call. It is set by each of the following types of statements:
SELECT INTO
statement setsFOUND
true if a row is assigned, false if no row is returned- A
PERFORM
statement setsFOUND
true if it produces (and discards) one or more rows, false if no row is produced UPDATE
,INSERT
,DELETE
, andMERGE
statements setFOUND
true if at least one row is affected, false if no row is affected- A
FETCH
statements setsFOUND
true if it returns a row, false if no row is returned - A
MOVE
statement setsFOUND
true if it successfully repositions the cursor, false otherwise - A
FOR
orFOREACH
statement setsFOUND
true if it iterates one ore more times, else false.FOUND
is set this way when the loop exits; inside the execution of the loop,FOUND
is not modified by the loop statement, although it might be changed by the execution of other statements within the loop body. RETURN QUERY
andRETURN QUERY EXECUTE
statements setFOUND
true if the query returns at least one row, false if no row is returned.
Doing Nothing at All
Sometimes a placeholder statement that does nothing is useful. For example, it can indicate that one arm of an if/then/else chain is deliberately empty.
BEGIN
y := x / 0;
EXCEPTION
WHEN division_by_zero THEN
NULL; -- ignore the error
END;
I am going to continue taking notes on PL/pgSQL in the Notes page. See this article for the rest of the notes (I should be finished soon).
Comments
You have to be logged in to add a comment
User Comments
There are currently no comments for this article.