Logging

Taking some notes on how to build logs for applications. I don't have a good logging system yet, and it can get hard when you have distributed microservices.

References

The Log
- Everything you ever wanted to know about structured logs, and how to build distributed systems on top of them. - Greg Brockman
Logging best practices
- A short guide for how to think about logging. I wish more software followed this article's advice. - Greg Brockman
- This link now returns 404
Logging Cheat Sheet / Logging Vocabulary
- OWASP Blogs on Logging

The Log

Sometimes called write-ahead logs, commit logs, or transaction logs, logs have been around almost as long as computers and are at the heart of many distributed data systems and real-time application architectures. You can't fully understand databases, NoSQL stores, key-value stores, replication, Paxos, Hadoop, version control, or almost any software system without understanding logs; and yet, most software engineers are not familiar with them.

What is a Log?

A log is an append-only, totally ordered sequence of records, ordered by time.

Records are appended to the end of the log, and reads proceed left-to-right. Each entry is assigned a unique sequential log entry number. The ordering of records denotes a notion of "time" since entries to the left are defined to be older than entries to the right. Logs have a specific purpose: they record what happened and when.
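
A minimal sketch of that structure in Python, assuming an in-memory list is good enough for illustration. The class name AppendOnlyLog and its methods are made up for these notes, not taken from the article.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Tuple


@dataclass
class AppendOnlyLog:
    """In-memory append-only log; the list index doubles as the entry number."""
    _records: List[Any] = field(default_factory=list)

    def append(self, record: Any) -> int:
        """Add a record at the end and return its sequential entry number."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, start: int = 0) -> Iterator[Tuple[int, Any]]:
        """Yield (entry_number, record) pairs from `start`, oldest first."""
        for offset in range(start, len(self._records)):
            yield offset, self._records[offset]


log = AppendOnlyLog()
first = log.append("user created")
second = log.append("user renamed")
entries = list(log.read())
```

There is deliberately no update or delete method: existing entries never change, so a lower entry number always means an older record.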

Logs in Databases

Databases use logs to keep their many data structures and indexes in sync in the presence of crashes. To make changes atomic and durable, a database writes information about the records it will modify to a log before applying the changes to the various data structures it maintains. The log is then the authoritative source for restoring all other persistent structures after a crash.
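
A toy sketch of the write-before-apply idea, assuming a key-value store whose log is a file of JSON lines. TinyStore and its file layout are hypothetical; real databases use far more elaborate log formats and recovery procedures.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs each change before applying it."""

    def __init__(self, wal_path: str) -> None:
        self.wal_path = wal_path
        self.data: dict = {}
        self._replay()

    def _replay(self) -> None:
        """Rebuild in-memory state from the log, e.g. after a crash."""
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "r", encoding="utf-8") as wal:
            for line in wal:
                entry = json.loads(line)
                self.data[entry["key"]] = entry["value"]

    def put(self, key: str, value) -> None:
        # 1. Describe the change in the log and force it to disk.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # 2. Only then touch the main data structure.
        self.data[key] = value


wal_file = os.path.join(tempfile.mkdtemp(), "store.wal")
store = TinyStore(wal_file)
store.put("balance", 100)
store.put("balance", 75)

# Simulate a crash: discard the object and recover from the log alone.
recovered = TinyStore(wal_file)
```

Because the log entry reaches disk before the dictionary is modified, a crash between the two steps loses nothing: replaying the log reproduces the change.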

Logs in Distributed Systems

State Machine Replication Principle:

If two identical, deterministic processes begin in the same state and get the same inputs in the same order, they will produce the same output and end in the same state.

Deterministic means that the processing isn't timing dependent and doesn't let any other out of band input influence its results. The state of the process is whatever data remains on the machine, either in memory or on disk, at the end of processing.
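
The principle can be sketched with two replicas that fold the same ordered command log into a state. The command format ("set" / "add" tuples) is an assumption made up for this example.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, str, int]  # (operation, key, amount)


def apply_log(commands: List[Command]) -> Dict[str, int]:
    """Deterministically fold an ordered command log into a final state."""
    state: Dict[str, int] = {}
    for op, key, amount in commands:
        if op == "set":
            state[key] = amount
        elif op == "add":
            state[key] = state.get(key, 0) + amount
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


shared_log: List[Command] = [("set", "x", 1), ("add", "x", 4), ("set", "y", 2)]
replica_a = apply_log(shared_log)
replica_b = apply_log(shared_log)

# Same inputs in a different order: the replicas would diverge.
reordered = apply_log([shared_log[1], shared_log[0], shared_log[2]])
```

Both replicas end with x = 5 and y = 2, while the reordered run ends with x = 1. That is why the order of the log matters as much as its contents.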

Different groups of people seem to describe the uses of logs differently. Database people generally differentiate between physical and logical logging. Physical logging means logging the contents of each row that is changed. Logical logging means logging not the changed rows but the SQL commands that led to the row changes (the insert, update, and delete statements).
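
To make the difference concrete, here is a small sketch that records the same change both ways. The accounts table, its rows, and the entry shapes are invented for illustration and do not match any particular database's log format.

```python
# Hypothetical table contents, keyed by row id.
rows = {
    1: {"region": "EU", "balance": 50},
    2: {"region": "US", "balance": 70},
    3: {"region": "EU", "balance": 20},
}

# Logical log: one entry holding the statement itself.
logical_log = ["UPDATE accounts SET balance = balance + 10 WHERE region = 'EU'"]

# Physical log: one entry per changed row, holding its new contents.
physical_log = []
for row_id, row in rows.items():
    if row["region"] == "EU":
        row["balance"] += 10
        physical_log.append({"row_id": row_id, "after": dict(row)})
```

One statement produces a single logical entry but two physical entries here, one for each row it touched.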

The distributed log can be seen as the data structure which models the problem of consensus.

What the Log is Good For

Data Integration - making all of an organization's data easily available in all its storage and processing systems
Real-time data processing - computing derived data streams
Distributed system design - how practical systems can be simplified with a log-centric design

Data Integration

Data integration is making all the data that an organization owns available in all its services and systems.

Event data records things that happen rather than things that are.

United Log

The data warehouse is meant to be a repository of the clean, integrated data structured to support analysis. The data warehousing methodology involves periodically extracting data from source databases, munging it into some kind of understandable form, and loading it into a central data warehouse. Having this central location that contains a clean copy of all your data is a hugely valuable asset for data-intensive analysis and processing.

In the article's example, LinkedIn built its event data handling in a log-centric fashion, using Kafka as the central, multi-subscriber event log. They defined several hundred event types, each capturing the unique attributes of a particular type of action. This covers everything from page views, ad impressions, and searches, to service invocations and application exceptions.

Stream processing is just processing that includes a notion of time in the underlying data and does not require a static snapshot of that data. As a result, it can produce output at a user-controlled frequency instead of waiting for the "end" of the data set to be reached.
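
A minimal sketch of that idea: a generator that counts events per fixed time window and emits each count as soon as the window closes. The function name, the (timestamp, payload) event shape, and the assumption that events arrive in timestamp order are mine, not the article's.

```python
from typing import Iterable, Iterator, Tuple


def tumbling_counts(events: Iterable[Tuple[int, str]],
                    window_seconds: int) -> Iterator[Tuple[int, int]]:
    """Yield (window_start, count) as soon as each window closes.

    Assumes events arrive ordered by timestamp (in seconds).
    Windows that contain no events are skipped.
    """
    current_window = None
    count = 0
    for timestamp, _payload in events:
        window = timestamp - (timestamp % window_seconds)
        if current_window is not None and window != current_window:
            yield current_window, count
            count = 0
        current_window = window
        count += 1
    if current_window is not None:
        yield current_window, count  # flush the final, still-open window


page_views = [(0, "/home"), (3, "/jobs"), (9, "/home"), (12, "/feed"), (27, "/home")]
results = list(tumbling_counts(page_views, window_seconds=10))
```

The window size plays the role of the user-controlled output frequency: results for the first ten seconds are available as soon as an event from a later window shows up, long before the input ends.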

OWASP Recommendations

This is a guide to application logging mechanisms, especially related to security logging. Application event logging often provides much greater insight than infrastructure (e.g., database) logging alone. Application logging should be consistent within the application, consistent across an organization's application portfolio and use industry standards where relevant, so the logged event data can be consumed, correlated, analyzed, and managed by a wide variety of systems.

Purpose

Application logs should be used for:

Identifying security incidents
Monitoring policy violations
Establishing baselines
Assisting non-repudiation controls
Providing information about problems and unusual conditions
Contributing additional application-specific data for incident investigation which is lacking in other log sources
Helping defend against vulnerability identification and exploitation through attack detection

Design, Implementation and Testing

Sources of data

Security events
Business process monitoring
Anti-automation monitoring
Audit trails
Performance monitoring
Data for subsequent requests for information

The degree of confidence in the event information has to be considered when including event data from systems in a different trust zone.

Where to Record Data

Applications commonly write event log data to the file system or a database (SQL or NoSQL). When using a database to keep logs, it is preferable to utilize a separate database account that is only used for writing log data and which has very restrictive database, table, function, and command permissions.

Which Events to Log

Always Log
- Input validation failures
- Output validation failures
- Authentication successes and failures
- Session management failures
- Application errors and system events
- Application and related system start-ups and shut-downs, and logging initialization
- Use of higher-risk functionality including:
  - User administration actions
  - Use of systems administration privileges
  - Use of default or shared accounts
  - Access to sensitive data
  - Encryption activities such as use or rotation of cryptographic keys
  - Creation or deletion of system-level objects
  - Data import and export
  - Submission and processing of user generated content - especially file uploads
Optionally Log
- Sequencing failure
- Excessive Use
- Data changes
- Fraud
- Suspicious, unacceptable, or unexpected behavior
- Modifications to configuration
- Application code file and/or memory changes

Event Attributes

The application logs must record when, where, who, and what for each event.

When
- Log date and time
- Event date and time
Where
- Application Identifier (name and version)
- Application Address
- Service
- Geolocation
- Window/form/page
- Code location
Who (human or machine user)
- Source address (user's IP address, user's device / machine identifier)
What
- Type of Event
- Severity of Event
- Security Flag
- Description
Other Info:
- Include other information that you think might be useful
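
One way to capture these attributes with Python's standard logging module is a custom formatter that emits one JSON object per event. This is only a sketch: the JsonFormatter class and the code_location field are my own choices, and the other field names are borrowed from the example event at the end of these notes.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with when/where/who/what fields."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            # When
            "datetime": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            # Where
            "appid": getattr(record, "appid", "unknown"),
            "code_location": f"{record.module}:{record.lineno}",
            # Who
            "source_ip": getattr(record, "source_ip", None),
            # What
            "event": getattr(record, "event", None),
            "level": record.levelname,
            "description": record.getMessage(),
        }
        return json.dumps(event)


# Write to an in-memory stream so the example is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("notes.example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info(
    "User joebob1 login successfully",
    extra={
        "appid": "foobar.netportal_auth",
        "event": "AUTHN_login_success:joebob1",
        "source_ip": "165.225.50.94",
    },
)
logged = json.loads(stream.getvalue())
```

Passing the per-event fields through `extra` keeps call sites short while the formatter guarantees every entry has the same shape.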

Data to Exclude

Application source code
Session ID
Access Tokens
Sensitive personal data and some forms of personally identifiable information (PII)
Authentication passwords
DB connection strings
Encryption keys
Bank account info and payment card holder data
Other sensitive information
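
A rough sketch of how an application might scrub an event before writing it. The key names in SENSITIVE_KEYS and the card-number pattern are hypothetical and would need to match the fields a real application uses; this is not a complete PII filter.

```python
import re
from typing import Any, Dict

# Hypothetical key names; adjust to the fields your application really uses.
SENSITIVE_KEYS = {
    "password", "session_id", "access_token",
    "db_connection_string", "encryption_key", "card_number",
}
# Rough pattern for 13-16 digit card numbers embedded in free text.
CARD_PATTERN = re.compile(r"\b\d{13,16}\b")


def redact(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `event` that is safer to write to a log."""
    clean: Dict[str, Any] = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, str):
            clean[key] = CARD_PATTERN.sub("[REDACTED]", value)
        else:
            clean[key] = value
    return clean


raw = {
    "event": "payment_attempt",
    "password": "hunter2",
    "description": "card 4111111111111111 declined",
    "amount": 12,
}
safe = redact(raw)
```

Redacting by key name catches structured fields, while the pattern pass is a second line of defence for values that leak into free-text descriptions.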

Example of Logged Event:

{
    "datetime": "2021-01-01T01:01:01-0700",
    "appid": "foobar.netportal_auth",
    "event": "AUTHN_login_success:joebob1",
    "level": "INFO",
    "description": "User joebob1 login successfully",
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36",
    "source_ip": "165.225.50.94",
    "host_ip": "10.12.7.9",
    "hostname": "portalauth.foobar.com",
    "protocol": "https",
    "port": "440",
    "request_uri": "/api/v2/auth/",
    "request_method": "POST",
    "region": "AWS-US-WEST-2",
    "geo": "USA"
}