Convolution

I was going over neural style transfer and I want to remind myself of how the convolution process works.

Date Created:

References



Definitions


  • Expected Value
    • In probability theory, the expected value (also called expectation, expectancy, expectation operator, mathematical expectation, mean, expectation value, or first moment) is a generalization of the weighted average. Informally, the expected value is the mean of the possible values a random variable can take, weighted by the probability of those outcomes. Since it is obtained through arithmetic, the expected value sometimes may not even be included in the sample data set; it is not the value you would expect to get in reality.
    • The expected value of a random variable with a finite number of outcomes is a weighted average of all possible outcomes. In the case of a continuum of possible outcomes, the expectation is defined by integration.
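    • The expected value of a random variable $X$ is often denoted $\operatorname{E}(X)$, $\operatorname{E}[X]$, or $\operatorname{E}X$, with $\operatorname{E}$ also often stylized as 𝔼 or $E$.
    • Consider a random variable $X$ with a finite list $x_1, \ldots, x_k$ of possible outcomes, each of which (respectively) has probability $p_1, \ldots, p_k$ of occurring. The expectation of $X$ is defined as $\operatorname{E}[X] = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$.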
  • Cross-Correlation
    • In signal processing, cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other. This is also known as a sliding dot product or sliding inner product. It is commonly used for searching a long signal for a shorter, known feature. It has applications in pattern recognition, single particle analysis, electron tomography, averaging, cryptanalysis, and neurophysiology. The cross-correlation is similar in nature to the convolution of two functions. In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of zero, and its size will be the signal energy.
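    • For random vectors $\mathbf{X} = (X_1, \ldots, X_m)$ and $\mathbf{Y} = (Y_1, \ldots, Y_n)$, each containing random elements whose expected value and variance exist, the cross-correlation matrix of $\mathbf{X}$ and $\mathbf{Y}$ is defined by $\operatorname{R}_{\mathbf{X}\mathbf{Y}} = \operatorname{E}[\mathbf{X}\mathbf{Y}^{\mathsf{T}}]$
    • and has dimensions $m \times n$. Written component-wise, its $(i, j)$ entry is $\operatorname{E}[X_i Y_j]$.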
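
The two definitions above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `expected_value` and `cross_correlate` and the example values are made up for illustration. It computes a finite expectation as a probability-weighted sum and a cross-correlation as a sliding dot product over every lag.

```python
import numpy as np

def expected_value(outcomes, probabilities):
    """Weighted average of a finite list of outcomes."""
    return float(np.dot(outcomes, probabilities))

def cross_correlate(x, y):
    """Sliding dot product of two real 1D sequences, one value per lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Every lag at which the two sequences overlap by at least one sample.
    lags = np.arange(-(len(y) - 1), len(x))
    out = np.zeros(len(lags))
    for i, lag in enumerate(lags):
        for n in range(len(y)):
            m = n + lag  # index into x that lines up with y[n] at this lag
            if 0 <= m < len(x):
                out[i] += x[m] * y[n]
    return lags, out

# Expected value of a fair six-sided die.
die_mean = expected_value([1, 2, 3, 4, 5, 6], [1 / 6] * 6)

# Autocorrelation: cross-correlation of a signal with itself.
signal = np.array([1.0, -2.0, 3.0, 0.5])
lags, r = cross_correlate(signal, signal)
peak_lag = lags[np.argmax(r)]
energy = float(np.sum(signal ** 2))
```

For this example signal the autocorrelation peaks at lag 0, and the peak value equals the signal energy (the sum of squared samples), matching the statement in the cross-correlation definition.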


Notes


In mathematics (in particular, functional analysis), a convolution is a mathematical operation on two functions ($f$ and $g$) that produces a third function ($f * g$). The term convolution refers both to the result function and to the process of computing it. It is defined as the integral of the product of the two functions after one is reflected about the y-axis and shifted. The integral is evaluated for all values of shift, producing the convolution function. The choice of which function is reflected and shifted before the integral does not change the integral result (see commutativity). Graphically, it expresses how the 'shape' of one function is modified by the other.

Visual Comparison of Convolution, Cross-Correlation, and Autocorrelation

Some features of convolution are similar to cross-correlation: for real-valued functions of a continuous or discrete variable, convolution ($f * g$) differs from cross-correlation ($f \star g$) only in that either $f(x)$ or $g(x)$ is reflected about the y-axis in convolution; thus it is a cross-correlation of $g(-x)$ and $f(x)$, or of $f(-x)$ and $g(x)$.
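
The relationship between the two operations can be checked on small arrays. This is a minimal sketch using NumPy's `np.convolve` and `np.correlate`; the arrays `f` and `g` are arbitrary example values, not taken from any source. Reversing `g` plays the role of reflecting it about the y-axis.

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -1.0, 2.0])  # deliberately not symmetric

# Convolution of f and g.
conv = np.convolve(f, g, mode="full")

# Cross-correlation of f with the reflected g gives the same numbers.
corr_reflected = np.correlate(f, g[::-1], mode="full")

# Cross-correlation with g as-is generally differs from the convolution.
corr_plain = np.correlate(f, g, mode="full")

same_when_reflected = np.allclose(conv, corr_reflected)
same_without_reflection = np.allclose(conv, corr_plain)
```

Here `same_when_reflected` is `True` and `same_without_reflection` is `False`; the two only coincide when the reflected function equals the original, that is, when `g` is symmetric.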

Convolution has applications that include probability, statistics, acoustics, spectroscopy, signal processing and image processing, geophysics, engineering, physics, computer vision and differential equations.

Computing the inverse of the convolution operation is known as deconvolution.


Definition


The convolution of $f$ and $g$ is written $f * g$, denoting the operator with the symbol $*$. It is defined as the integral of the product of the two functions after one is reflected about the y-axis and shifted. As such, it is a particular kind of integral transform:

$$(f * g)(t) := \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$$

At each $t$, the convolution formula can be described as the area under the function $f(\tau)$ weighted by the function $g(-\tau)$ shifted by the amount $t$. As $t$ changes, the weighting function $g(t - \tau)$ emphasizes different parts of the input function $f(\tau)$; if $t$ is a positive value, then $g(t - \tau)$ is equal to $g(-\tau)$ that slides or is shifted along the $\tau$-axis toward the right (toward $+\infty$) by the amount $t$, while if $t$ is a negative value, then $g(t - \tau)$ is equal to $g(-\tau)$ that slides or is shifted toward the left (toward $-\infty$) by the amount $|t|$.
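
To make the "area under a weighted, shifted function" picture concrete, the integral can be approximated by a Riemann sum. The sketch below is an illustration under stated assumptions: both functions are sampled on the same grid starting at 0, the helper name `approx_convolution` is hypothetical, and the test case (a unit box convolved with itself, whose exact result is a triangle) is chosen only because its closed form is easy to compare against.

```python
import numpy as np

def approx_convolution(f_samples, g_samples, dt):
    """Riemann-sum approximation of (f * g)(t) for samples spaced dt apart.

    Both sample arrays are assumed to start at t = 0.
    """
    values = np.convolve(f_samples, g_samples, mode="full") * dt
    t = np.arange(len(values)) * dt
    return t, values

n = 1000
dt = 1.0 / n
box = np.ones(n)  # unit box: 1 on [0, 1), 0 elsewhere

t, tri = approx_convolution(box, box, dt)

# Exact convolution of two unit boxes: a triangle rising to 1 at t = 1.
exact = np.where(t <= 1.0, t, 2.0 - t)
max_error = float(np.max(np.abs(tri - exact)))
```

The approximation error shrinks with the grid spacing: for this example `max_error` is on the order of `dt`.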


Discrete Convolution


For complex-valued functions $f$ and $g$ defined on the set $\mathbb{Z}$ of integers, the discrete convolution of $f$ and $g$ is given by:

$$(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n - m]$$
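
For finite sequences the discrete convolution is a double loop: output sample n sums f[m] times g[n - m] over every m where both indices are in range. The sketch below is a direct, unoptimized transcription of that sum, assuming NumPy; the function name `discrete_convolution` and the example arrays are made up for illustration.

```python
import numpy as np

def discrete_convolution(f, g):
    """Direct evaluation of (f * g)[n] = sum over m of f[m] * g[n - m]."""
    f = np.asarray(f)
    g = np.asarray(g)
    out = np.zeros(len(f) + len(g) - 1, dtype=np.result_type(f, g))
    for n in range(len(out)):
        for m in range(len(f)):
            # Samples outside the stored range are treated as zero.
            if 0 <= n - m < len(g):
                out[n] += f[m] * g[n - m]
    return out

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])
result = discrete_convolution(f, g)          # [0.0, 1.0, 2.5, 4.0, 1.5]
matches_numpy = np.allclose(result, np.convolve(f, g))
```

The loop version is only meant to mirror the formula; `np.convolve` computes the same values and is the practical choice.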

Discrete 2D Convolution Animation


Intuitively Understanding Convolutions for Deep Learning


The 2D convolution: you start with a kernel, which is a small matrix of weights. This kernel "slides" over the 2D input data, performing an elementwise multiplication with the part of the input it is currently on, and then summing up the results into a single output pixel.

Standard Convolution

The kernel repeats this process for every location it slides over, converting a 2D matrix of features into yet another 2D matrix of features. The output features are essentially the weighted sums (with the weights being the values of the kernel itself) of the input features located roughly in the same location of the output pixel on the input layer. The size of the kernel directly determines how many (or few) input features get combined in the production of a new output feature.
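
The sliding process described above can be sketched in a few lines. This is a minimal illustration assuming NumPy, with no padding and a stride of 1; the function name `conv2d_valid` and the example input are hypothetical. Note that, like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a kernel over a 2D input; each position yields one output pixel."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Elementwise multiply the kernel with the patch it covers, then sum.
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0  # averaging kernel: every weight is 1/9
features = conv2d_valid(image, kernel)
```

A 3x3 kernel on a 5x5 input produces a 3x3 output, because the kernel only fits at 3 positions along each axis. With the averaging kernel, each output pixel is the mean of the 9 input pixels under the kernel.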

Convolutions allow us to "look at" only some input features (in contrast to a fully connected layer, where you look at every input feature).


Commonly Used Techniques


  • Padding
    • In the example above, the outer pixels are never at the center of the kernel, because the kernel cannot extend beyond the edge, so the output matrix is smaller than the input matrix. To fix this, we can "pad" the edges with extra, "fake" pixels.

Padding

  • Striding
    • The idea of the stride is to skip some of the slide locations of the kernel. A stride of 1 means picking slides one pixel apart, so every single slide is used, as in a standard convolution. A stride of 2 means picking slides two pixels apart, skipping every other slide and shrinking the output by roughly a factor of 2. More modern networks, such as the ResNet architectures, entirely forgo pooling layers in their internal layers in favor of strided convolutions when they need to reduce the size of their feature maps.

Striding
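
Padding and striding both change the output size, which along each axis is floor((n + 2p - k) / s) + 1 for input size n, kernel size k, padding p, and stride s. The sketch below adds both options to a plain sliding-kernel loop. It assumes NumPy and zero-valued padding (one common choice for the "fake" pixels); the function name `conv2d` and its parameter names are made up for illustration.

```python
import numpy as np

def conv2d(image, kernel, padding=0, stride=1):
    """Sliding-kernel sum with zero padding and a configurable stride."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    # Surround the input with `padding` rows/columns of zeros on every side.
    padded = np.pad(image, padding, mode="constant", constant_values=0.0)
    kh, kw = kernel.shape
    out_h = (padded.shape[0] - kh) // stride + 1
    out_w = (padded.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # A stride greater than 1 skips slide locations.
            top, left = i * stride, j * stride
            out[i, j] = np.sum(padded[top:top + kh, left:left + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3))

unpadded = conv2d(image, kernel)                       # shape (3, 3)
same = conv2d(image, kernel, padding=1)                # shape (5, 5)
strided = conv2d(image, kernel, padding=1, stride=2)   # shape (3, 3)
```

With a 3x3 kernel, one pixel of padding keeps the output the same size as the input, and a stride of 2 on top of that keeps every other row and column of the stride-1 result.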

The Multi-Channel Version


Most images have 3 channels (RGB). It's pretty easy to think of each channel as a "view" of the image as a whole, emphasizing some aspects and de-emphasizing others. In the case of multiple channels, the terms filter and kernel are no longer interchangeable. Each filter is a collection of kernels, with one kernel for every input channel to the layer, and each kernel being unique.

Each filter in a convolution layer produces one and only one output channel, and they do it like so:

  1. Each kernel of the filter slides over its respective input channel, producing a processed version of that channel.
  2. The per-channel processed versions are then summed together to form one channel. The kernels of a filter each produce one processed version of their channel, and the filter as a whole produces one output channel.
  3. Finally, the bias term gets added to the output channel to produce the final output channel.
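
The three steps above can be sketched as follows. This is a minimal illustration assuming NumPy, channel-first arrays of shape (channels, height, width), no padding, and a stride of 1; the function names `apply_filter` and `conv_layer` and the example values are hypothetical.

```python
import numpy as np

def apply_filter(channels, kernels, bias=0.0):
    """One filter: channels (C, H, W) and kernels (C, kh, kw) -> one (H', W') channel."""
    channels = np.asarray(channels, dtype=float)
    kernels = np.asarray(kernels, dtype=float)
    n_channels, kh, kw = kernels.shape
    out_h = channels.shape[1] - kh + 1
    out_w = channels.shape[2] - kw + 1
    # Step 1: each kernel slides over its own input channel.
    per_channel = np.zeros((n_channels, out_h, out_w))
    for c in range(n_channels):
        for i in range(out_h):
            for j in range(out_w):
                patch = channels[c, i:i + kh, j:j + kw]
                per_channel[c, i, j] = np.sum(patch * kernels[c])
    # Step 2: sum the per-channel results. Step 3: add the bias.
    return per_channel.sum(axis=0) + bias

def conv_layer(channels, filters, biases):
    """A layer of several filters: one output channel per filter."""
    return np.stack([apply_filter(channels, k, b) for k, b in zip(filters, biases)])

rgb = np.ones((3, 4, 4))               # 3 input channels, 4x4 each
filters = np.ones((2, 3, 3, 3))        # 2 filters, each with 3 kernels of size 3x3
filters[1] *= 2.0
biases = [0.5, -1.0]
out = conv_layer(rgb, filters, biases)  # shape (2, 2, 2)
```

Two filters give two output channels. In this example every pixel of the first output channel is 3 * 9 + 0.5 = 27.5 and every pixel of the second is 3 * 18 - 1.0 = 53.0.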


