Unsupervised Learning

I want to go through the Wikipedia series on Machine Learning and Data Mining. Data mining is the process of extracting and discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems.

References

Unsupervised Learning Wikipedia Article

Notes

Unsupervised learning is a framework in machine learning where, in contrast to supervised learning, algorithms learn patterns exclusively from unlabeled data. Other frameworks on the supervision spectrum include weak or semi-supervision, where a small portion of the data is tagged, and self-supervision. Some researchers consider self-supervised learning a form of unsupervised learning.

Tasks are often categorized as discriminative (recognition) or generative (imagination). Often, but not always, discriminative tasks use supervised methods and generative tasks use unsupervised methods.

Neural Network Architectures

Training

During the learning phase, an unsupervised network tries to mimic the data it's given and uses the error in its mimicked output to correct itself (correct its weights and biases). Sometimes the error is expressed as a low probability that the erroneous output occurs, or it might be expressed as an unstable high energy state in the network.
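As a rough illustration of this "mimic, measure the error, correct the weights" loop, here is a minimal sketch of a tiny linear autoencoder trained by gradient descent. Everything in it is an assumption made for the example: the toy data, the names `enc` and `dec`, the learning rate, and the squared-error loss. It also leaves out biases because the toy data is centered on the origin.

```python
import random

random.seed(0)

# Toy unlabeled data: 2-D points lying near the line y = 2x.
data = []
for _ in range(50):
    t = random.uniform(-1.0, 1.0)
    data.append((t, 2.0 * t + random.gauss(0.0, 0.01)))

# Hypothetical model: encode each point to one number, then decode it back.
enc = [0.5, 0.5]  # encoder weights (2 inputs -> 1 code)
dec = [0.5, 0.5]  # decoder weights (1 code -> 2 outputs)


def reconstruct(x):
    """Return the 1-D code and the network's attempt to mimic x."""
    code = enc[0] * x[0] + enc[1] * x[1]
    return code, (dec[0] * code, dec[1] * code)


def mean_loss():
    """Average squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)


initial_loss = mean_loss()
learning_rate = 0.02

for epoch in range(300):
    for x in data:
        code, x_hat = reconstruct(x)
        # Error between the mimicked output and the real input.
        err = (x_hat[0] - x[0], x_hat[1] - x[1])
        # Gradients of the squared error with respect to each weight.
        grad_code = 2.0 * (err[0] * dec[0] + err[1] * dec[1])
        grad_dec = (2.0 * err[0] * code, 2.0 * err[1] * code)
        grad_enc = (grad_code * x[0], grad_code * x[1])
        # Use the error to correct the weights.
        for i in range(2):
            dec[i] -= learning_rate * grad_dec[i]
            enc[i] -= learning_rate * grad_enc[i]

final_loss = mean_loss()
print(f"mean reconstruction error: {initial_loss:.4f} -> {final_loss:.6f}")
```

No labels appear anywhere: the only training signal is how badly the network reproduces its own input.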

Energy

An energy function is a macroscopic measure of a network's activation state. In Boltzmann machines, it plays the role of the cost function. This analogy with physics is inspired by Ludwig Boltzmann's analysis of a gas's macroscopic energy from the microscopic probabilities of particle motion, $p \propto e^{-E/kT}$, where p is the probability of a state with energy E, k is the Boltzmann constant, and T is temperature.
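To make the energy idea concrete, the sketch below enumerates every state of a three-unit network with binary (0/1) units, computes the standard Boltzmann machine energy (the negative sum of pairwise weight terms and bias terms), and converts energies to probabilities with the Boltzmann factor. The weights `W`, the biases `theta`, and the choice to fold k and T into a single `kT` parameter are all made up for illustration.

```python
import itertools
import math

# Hypothetical symmetric weights and biases for a 3-unit network.
W = [
    [0.0, 1.0, -0.5],
    [1.0, 0.0, 0.8],
    [-0.5, 0.8, 0.0],
]
theta = [0.1, -0.2, 0.0]


def energy(state):
    """E(s) = -(sum over pairs i<j of w_ij * s_i * s_j + sum_i theta_i * s_i)."""
    n = len(state)
    pair_term = sum(
        W[i][j] * state[i] * state[j] for i in range(n) for j in range(i + 1, n)
    )
    bias_term = sum(theta[i] * state[i] for i in range(n))
    return -(pair_term + bias_term)


def boltzmann_distribution(kT=1.0):
    """Probability of each state, proportional to exp(-E / kT)."""
    states = list(itertools.product([0, 1], repeat=3))
    factors = {s: math.exp(-energy(s) / kT) for s in states}
    partition = sum(factors.values())  # normalizing constant
    return {s: f / partition for s, f in factors.items()}


probs = boltzmann_distribution(kT=1.0)
most_likely = max(probs, key=probs.get)
cold_probs = boltzmann_distribution(kT=0.1)

for s in sorted(probs, key=energy):
    print(s, f"E={energy(s):+.2f}", f"p={probs[s]:.3f}")
```

Low-energy states come out as the most probable ones, and lowering `kT` concentrates even more probability on the lowest-energy state. This is why an erroneous output can be described either as a low-probability state or as a high-energy state.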

Networks

Various networks are used in unsupervised learning algorithms.

Probabilistic Methods

Two of the main methods used in unsupervised learning are principal component analysis and cluster analysis. Cluster analysis groups, or segments, datasets with shared attributes in order to extrapolate algorithmic relationships. It is a branch of machine learning that groups data that has not been labelled, classified, or categorized. Instead of responding to feedback, cluster analysis identifies commonalities in the data and reacts based on the presence or absence of such commonalities in each new piece of data.
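One common way to do cluster analysis is k-means, which is not named in these notes and is used here only as an example. The sketch below is a plain-Python version under simplifying assumptions: the points are invented, and the starting centroids are passed in by hand rather than chosen randomly, so the run is repeatable.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(
        range(len(centroids)),
        key=lambda i: squared_distance(point, centroids[i]),
    )


def k_means(points, centroids, max_iter=100):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Unlabeled toy data: two loose groups of 2-D points.
points = [
    (1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (1.1, 1.2),
    (5.0, 5.1), (4.8, 5.2), (5.2, 4.9), (5.1, 5.0),
]

# Assumption: seed the centroids with the first and last point.
centroids, labels = k_means(points, [points[0], points[-1]])

# A new piece of data is handled by checking which group it resembles.
new_label = nearest((4.9, 5.0), centroids)
print(labels, new_label)
```

The algorithm never sees a label. It only compares points to each other, and a new point is placed in whichever group it shares the most with.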

A central application of unsupervised learning is in the field of density estimation in statistics, though unsupervised learning encompasses many other domains involving summarizing and explaining data features. Density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function.
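Kernel density estimation is one standard way to build such an estimate. It is not mentioned in these notes and is chosen here only as an example. The sketch below places a small Gaussian bump on each observed sample and averages them. The sample values and the bandwidth of 0.3 are arbitrary assumptions.

```python
import math


def gaussian_kde(x, samples, bandwidth):
    """Estimate the density at x as an average of Gaussian bumps on the samples."""
    total = 0.0
    for xi in samples:
        u = (x - xi) / bandwidth
        total += math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return total / (len(samples) * bandwidth)


# Observed data drawn from some unknown distribution (made up here).
samples = [0.8, 1.0, 1.1, 1.2, 3.0]
bandwidth = 0.3

density_near = gaussian_kde(1.0, samples, bandwidth)  # where the data is crowded
density_far = gaussian_kde(2.0, samples, bandwidth)   # in a gap between samples

# A valid density integrates to 1; check with a simple Riemann sum on [-5, 9].
step = 0.01
area = sum(
    gaussian_kde(-5.0 + i * step, samples, bandwidth) for i in range(1401)
) * step

print(f"near={density_near:.3f} far={density_far:.3f} area={area:.4f}")
```

The bandwidth controls how smooth the estimate is: a smaller value follows the individual samples more closely, while a larger value blurs them together.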

Some of the most common algorithms used in unsupervised learning include:

Clustering
Anomaly Detection
Approaches for learning latent variable models
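For anomaly detection, the second item in the list above, one of the simplest unsupervised approaches is to flag values that sit far from the bulk of the data. The sketch below uses a z-score rule. The function name `find_anomalies`, the threshold of 2 standard deviations, and the readings are all assumptions for illustration, and real systems often use more robust methods.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


# Unlabeled readings with one obviously unusual value.
readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 50.0]
anomalies = find_anomalies(readings)
print(anomalies)
```

No example of an anomaly is ever provided. The rule relies only on how the data itself is distributed.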