Optimization Theory
Optimization Theory is a topic that comes up a lot in Machine Learning, so I want to learn more about it.
Mathematical optimization or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. [...] In the more general approach, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function.
- An optimization problem with discrete variables is known as discrete optimization: an object such as an integer, permutation, or graph must be found from a countable set.
- In mathematics, a set is countable if it is either finite or can be put in one-to-one correspondence with the set of natural numbers.
- A problem with continuous variables is known as continuous optimization: optimal arguments must be found from a continuous set.
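To make the two subfields concrete, here is a small sketch of one problem of each kind. Everything in it is my own illustration rather than part of the definition: the job durations, the function `g`, the interval [0, 5], and the choice of ternary search (which assumes the function has a single minimum on the interval).

```python
from itertools import permutations

# Discrete case: the search space is the finite set of orderings of three
# hypothetical jobs; the objective is the sum of their completion times.
durations = {"a": 3, "b": 1, "c": 2}  # made-up job lengths

def total_completion_time(order):
    elapsed, total = 0, 0
    for job in order:
        elapsed += durations[job]  # time at which this job finishes
        total += elapsed
    return total

# Brute force: evaluate the objective on every permutation and keep the best.
best_order = min(permutations(durations), key=total_completion_time)

# Continuous case: the search space is the interval [0, 5]. Ternary search
# shrinks the interval, which only works because g has one minimum there.
def g(x):
    return (x - 2.0) ** 2

def ternary_search_min(func, lo, hi, tol=1e-8):
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if func(m1) < func(m2):
            hi = m2  # the minimum cannot lie in (m2, hi]
        else:
            lo = m1  # the minimum cannot lie in [lo, m1)
    return (lo + hi) / 2

x_star = ternary_search_min(g, 0.0, 5.0)

print(best_order, total_completion_time(best_order))
print(round(x_star, 6))
```

In the discrete case every candidate can be enumerated, so the best ordering (shortest job first, with a total of 10) is found by exhaustive comparison. In the continuous case there are uncountably many candidates, so the search can only narrow in on the minimizer near 2 up to a tolerance.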
An optimization problem can be represented as:
Given: a function $f : A \to \mathbb{R}$ from some set $A$ to the real numbers.
Sought: an element $x_0 \in A$ such that $f(x_0) \le f(x)$ for all $x \in A$ (minimization), or such that $f(x_0) \ge f(x)$ for all $x \in A$ (maximization).
The domain $A$ of $f$ is called the search space or the choice set, while the elements of $A$ are called candidate solutions or feasible solutions.
The function $f$ is variously called an objective function, criterion function, loss function, cost function (minimization), utility function or fitness function (maximization), or, in certain fields, energy function or energy functional. A feasible solution that minimizes (or maximizes) the objective function is called an optimal solution.
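For a finite search space, the definition can be checked directly. The sketch below writes `f` for the objective function, `A` for the search space, and `x0` for an optimal solution. The helper names `argmin` and `argmax` and the example objective are assumptions of mine, chosen only for illustration.

```python
def argmin(f, A):
    """Return an element x0 of the finite, non-empty set A that minimizes f."""
    return min(A, key=f)

def argmax(f, A):
    """Maximizing f is the same as minimizing -f."""
    return argmin(lambda x: -f(x), A)

# Hypothetical example: integer candidates from -5 to 5 and a simple objective.
A = range(-5, 6)

def f(x):
    return (x - 1) ** 2 + 3

x0 = argmin(f, A)
x1 = argmax(f, A)

# The defining property of an optimal solution, tested on every candidate.
assert all(f(x0) <= f(x) for x in A)  # minimization
assert all(f(x1) >= f(x) for x in A)  # maximization

print(x0, f(x0))
print(x1, f(x1))
```

Writing maximization as minimization of the negated objective is why most of the theory can be stated for minimization only.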
While a local minimum is at least as good as any nearby elements, a global minimum is at least as good as every feasible element. Generally, unless the objective function is convex in a minimization problem, there may be several local minima. In a convex problem, if there is a local minimum that is interior (not on the edge of the set of feasible elements), it is also the global minimum, but a nonconvex problem may have more than one local minimum not all of which need be global minima.
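The difference between local and global minima shows up as soon as a local search method is run from different starting points. The sketch below uses plain gradient descent, which the text above does not mention. I picked it only as a simple local method, and the two objective functions, the step size, and the starting points are all illustrative assumptions.

```python
def gradient_descent(grad, x, lr=0.01, steps=2000):
    """Follow the negative gradient from x for a fixed number of small steps."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Nonconvex objective with two local minima, near x = -1.30 and x = 1.13.
def f(x):
    return x**4 - 3 * x**2 + x

def df(x):
    return 4 * x**3 - 6 * x + 1

# Convex objective with a single minimum at x = 1.
def dg(x):
    return 2 * (x - 1)

# Same method, two starting points, on the nonconvex problem.
left = gradient_descent(df, -2.0)
right = gradient_descent(df, 2.0)

# Same two starting points on the convex problem.
convex_a = gradient_descent(dg, -2.0)
convex_b = gradient_descent(dg, 2.0)

print(left, f(left))
print(right, f(right))
print(convex_a, convex_b)
```

On the nonconvex function the two runs settle in different minima, and only the left one is the global minimum because its objective value is lower. On the convex function both runs arrive at the same point.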