Write Ahead Logs
Write-ahead logs are a component of database systems that came up a lot in "Designing Data-Intensive Applications". I want to return to the concept to remind myself how they operate.
Definitions
- Stable Storage
- Classification of computer data storage technology that guarantees atomicity for any given write operation and allows software to be written that is robust against some hardware and power failures.
- Atomicity
- Term used in database systems, one of the letters in the ACID acronym describing transaction properties.
- An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or none occur.
- Durability
- Term used in database systems, one of the letters in the ACID acronym describing transaction properties.
- Durability is the ACID property that guarantees that the effects of transactions that have been committed survive permanently, even in case of failures, including incidents and catastrophic events.
- In-Place Updates
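- In general, an in-place algorithm operates directly on the input data structure without requiring extra space proportional to the input size.
- It modifies the input in place, without creating a separate copy of the data structure. For a database, this means overwriting a page at its existing location on disk instead of writing a modified copy somewhere else.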
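As a concrete reminder of what atomicity means in practice, here is a small sketch using Python's built-in sqlite3 module. The accounts table, the balance constraint, and the transfer helper are all made up for illustration; the point is only that a failed transaction leaves no partial effects behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst. Returns False if the transaction was rolled back."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when src cannot cover it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

Calling transfer(conn, "alice", "bob", 150) credits bob first and then fails on alice's debit, so the whole transaction is rolled back and both balances stay as they were: either all of the operations occur, or none do.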
Notes
In computer science, write-ahead logging (WAL) is a family of techniques for providing atomicity and durability (two of the ACID properties) in database systems.
- A write ahead log is an append-only auxiliary disk-resident structure used for crash and transaction recovery. The changes are first recorded in the log, which must be written to stable storage, before the changes are written to the database.
- Main functionality of a write-ahead log:
- Allow the page cache to buffer updates to disk-resident pages while ensuring durability semantics in the larger context of a database system.
- Persist all operations on disk until the cached copies of pages affected by these operations are synchronized on disk. Every operation that modifies the database state has to be logged on disk before the contents of the associated pages can be modified.
- Allow lost in-memory changes to be reconstructed from the operation log in case of a crash.
- In a system with WAL, all modifications are written to a log before they are applied. Usually, both undo and redo information is stored in the log.
- After a certain number of operations, the program should perform a checkpoint, writing all the changes specified in the WAL to the database and clearing the log.
- WAL allows updates of a database to be done in-place. The main advantage of doing updates in-place is that it reduces the need to modify indexes and block lists.
- Modern file systems typically use a variant of WAL for at least file system metadata; this is called journaling.
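To check my understanding of the points above (log first, apply second, replay after a crash, clear the log at a checkpoint), here is a minimal sketch in Python. TinyWAL and its file layout are hypothetical and heavily simplified: the log holds redo information only (no undo records), the "database" is a single JSON file, and the checkpoint rewrites that file wholesale instead of updating pages in place.

```python
import json
import os
import tempfile


class TinyWAL:
    """Redo-only write-ahead log in front of a JSON file (illustrative sketch)."""

    def __init__(self, db_path, log_path):
        self.db_path, self.log_path = db_path, log_path
        self.data = {}  # in-memory state; stands in for the page cache
        if os.path.exists(db_path):
            with open(db_path, "rb") as f:
                self.data = json.load(f)
        self._replay()  # redo logged changes that never reached the database file
        self.log = open(log_path, "ab")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        good = 0  # byte offset just past the last intact record
        with open(self.log_path, "rb") as f:
            for line in f:
                if not line.endswith(b"\n"):
                    break  # torn final record from a crash mid-append
                try:
                    record = json.loads(line)
                except ValueError:
                    break  # corrupt record: stop replaying here
                self.data[record["key"]] = record["value"]
                good += len(line)
        os.truncate(self.log_path, good)  # drop anything after the last intact record

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk ...
        self.log.write(json.dumps({"key": key, "value": value}).encode() + b"\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. ... and only then modify the cached state.
        self.data[key] = value

    def checkpoint(self):
        # Write the state to a temp file, fsync it, and swap it in atomically.
        tmp = self.db_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.db_path)
        # Every logged change is now in the database file, so the log can be cleared.
        self.log.truncate(0)
        os.fsync(self.log.fileno())

    def close(self):
        self.log.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        db, log = os.path.join(d, "db.json"), os.path.join(d, "wal.log")
        wal = TinyWAL(db, log)
        wal.put("a", 1)
        wal.put("b", 2)
        wal.close()  # simulated crash: no checkpoint was taken
        recovered = TinyWAL(db, log)
        print(recovered.data)  # {'a': 1, 'b': 2}
        recovered.close()
```

Replaying is safe to repeat here because each record just sets a key to a value, so a crash between the file swap and the log truncation in checkpoint only causes the same changes to be redone. A real database logs page-level or operation-level records and needs more care than this.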
WALs in PostgreSQL
- WALs are strictly sequential.
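- PostgreSQL WAL files are located in the pg_wal subdirectory of the data directory (the exact path depends on the installation), for example: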
/var/lib/pgsql/<version>/data/pg_wal
- How to control WAL?
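- wal_keep_segments
- Specifies the minimum number of past log file segments kept in the pg_wal directory (named pg_xlog before PostgreSQL 10). PostgreSQL 13 replaced this setting with wal_keep_size, which is given as a size instead of a segment count.
- max_wal_size
- Maximum size to let the WAL grow between automatic checkpoints. This is a soft limit.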
- Copying out generated WAL files is called archiving
- WAL Level
- minimal
- only the information needed to recover from a crash or immediate shutdown
- archive
- enough information to allow the archival of WAL files (since PostgreSQL 9.6 this value is still accepted but mapped to replica)
- replica
- information required to run read-only queries on a standby server
- logical
- adds the information needed to extract logical change sets from the WAL
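To tie the settings above together, here is a sketch of how they might look in postgresql.conf. The values are placeholders I picked for illustration, not recommendations, the archive directory is a made-up path, and wal_keep_size assumes PostgreSQL 13 or later.

```conf
# postgresql.conf (illustrative values only)
wal_level = replica        # minimal, replica, or logical
max_wal_size = 1GB         # soft limit on WAL growth between automatic checkpoints
wal_keep_size = 512MB      # PostgreSQL 13+; older releases use wal_keep_segments
archive_mode = on          # turn on WAL archiving
archive_command = 'cp %p /path/to/archive/%f'   # hypothetical destination; %p = file path, %f = file name
```

As far as I know, changing wal_level or archive_mode requires a server restart, while max_wal_size and archive_command can be changed with a configuration reload.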