Machine Learning

I want to go through the Wikipedia series on machine learning and data mining. Machine learning (ML) is a subfield of artificial intelligence that focuses on the development of algorithms capable of learning from data and generalizing to unseen data, allowing tasks to be performed without explicit instructions.
Date Created: 04/30/2024
References
Machine Learning Wikipedia Article

Definitions
generalization error
For supervised learning applications in machine learning and statistical learning theory, generalization error is a measure of how accurately an algorithm is able to predict the outcome values of previously unseen data. Generalization error can be minimized by avoiding overfitting in the learning algorithm.
probably approximately correct learning
In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. 
In this framework, the learner receives samples and must select a generalization function (called the hypothesis) from a certain class of possible functions. The goal is that with high probability, the selected function will have a low generalization error.
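An important innovation of the PAC framework is the introduction of computational complexity theory concepts to machine learning. In particular, the learner is expected to find efficient functions (time and space requirements bounded by a polynomial of the example size), and the learner itself must implement an efficient procedure (requiring a number of examples bounded by a polynomial of the concept size, modified by the approximation and likelihood bounds).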
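The "with high probability ... low generalization error" goal can be written as a single inequality. This is a sketch in notation chosen here, not taken from these notes: D is the unknown data distribution, S is a sample of m examples drawn from D, h_S is the hypothesis the learner returns, epsilon is the accuracy tolerance ("approximately"), and delta is the allowed failure probability ("probably").

```latex
\Pr_{S \sim D^m}\left[ \mathrm{err}_D(h_S) \le \varepsilon \right] \ge 1 - \delta
```

The efficiency requirement then asks that m and the learner's running time grow only polynomially in 1/epsilon and 1/delta.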
posterior probabilities
A posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule. The posterior probability contains everything there is to know about an uncertain proposition, given prior knowledge and a mathematical model describing the observations available at a particular time.
prior probability distribution
A prior probability distribution of an uncertain quantity, often simply called the prior, is its assumed probability distribution before some evidence is taken into account.
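The relationship between prior, likelihood, and posterior can be sketched with a small worked example of Bayes' rule over a discrete set of hypotheses. The function name, the hypothesis labels, and all the numbers below are made up for illustration.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a discrete set of hypotheses.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    # Unnormalized posterior: prior times likelihood for each hypothesis.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    # P(evidence) is the normalizing constant.
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical numbers: 1% of emails are spam, and a given word appears
# in 90% of spam emails but only 5% of non-spam emails.
prior = {"spam": 0.01, "not_spam": 0.99}
likelihood = {"spam": 0.90, "not_spam": 0.05}
post = posterior(prior, likelihood)
print(post)
```

With these numbers the posterior probability of spam is 0.009 / 0.0585, roughly 0.15: seeing the word raises the probability well above the 1% prior, but the low prior still dominates.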

Notes
Machine learning (ML) is a subfield of artificial intelligence that focuses on the development of algorithms capable of learning from data and generalizing to unseen data, allowing tasks to be performed without explicit instructions. Advances in deep learning, beginning in the 2010s, enabled neural networks to achieve superior performance compared to many previous approaches. Statistics and mathematical optimization methods comprise the foundations of machine learning. Data mining is a related field of study focusing on exploratory data analysis (EDA) via unsupervised learning. From a theoretical viewpoint, probably approximately correct (PAC) learning provides a framework for describing machine learning.
The term machine learning was coined in 1959 by Arthur Samuel, an IBM employee and pioneer in the field of computer gaming and artificial intelligence. Earlier, in 1949, Donald Hebb published the book The Organization of Behavior, in which he introduced a theoretical neural structure formed by certain interactions among nerve cells. Tom M. Mitchell provided a widely quoted, more formal definition of the algorithms studied in the machine learning field: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E." Modern-day machine learning has two objectives:
Classify data based on models which have been developed.
Make predictions for future outcomes based on these models.
As a scientific endeavor, machine learning grew out of the quest for artificial intelligence (AI). 
Machine learning drifted away from AI in the 1980s with the rise of expert systems. It became a separate field, more focused on the statistical line of research, and flourished in the 1990s, drawing on methods and models borrowed from statistics, fuzzy logic, and probability theory.
There is a close connection between machine learning and compression. A system that predicts the posterior probabilities of a sequence given its entire history can be used for optimal data compression. This equivalence has been used as a justification for using data compression as a benchmark for "general intelligence".
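The link between prediction and compression can be sketched numerically: a symbol that a model predicts with probability p costs about -log2(p) bits under an ideal entropy coder, so better predictions mean fewer bits. The predictor below (Laplace-smoothed symbol counts over the history) is a deliberately simple stand-in chosen for illustration, not the optimal predictor the paragraph refers to, and the function name and test strings are made up.

```python
import math
from collections import Counter


def ideal_code_length_bits(sequence, alphabet):
    """Total bits an ideal entropy coder would need when driven by an
    adaptive predictor (Laplace-smoothed symbol counts over the history)."""
    counts = Counter()
    total_bits = 0.0
    for seen, symbol in enumerate(sequence):
        # Predicted probability of the next symbol given the history so far.
        p = (counts[symbol] + 1) / (seen + len(alphabet))
        # A symbol predicted with probability p costs -log2(p) bits.
        total_bits += -math.log2(p)
        counts[symbol] += 1
    return total_bits


predictable = "a" * 30 + "b" * 2  # mostly one symbol, easy to predict
mixed = "ab" * 16                 # balanced counts, hard for this model
bits_predictable = ideal_code_length_bits(predictable, "ab")
bits_mixed = ideal_code_length_bits(mixed, "ab")
print(bits_predictable, bits_mixed)
```

Both strings have 32 symbols, so a fixed one-bit-per-symbol code needs 32 bits. The skewed string compresses to about 14 bits, while the alternating string does not compress at all under this model because symbol counts alone cannot capture the alternation; a predictor that conditions on the previous symbol would do much better.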
Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on known properties learned from the training data, data mining focuses on the discovery of previously unknown properties in the data. Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy.
Machine learning has intimate ties to optimization. Many learning problems are formulated as minimization of some loss function on a training set of examples. Machine learning and statistics are closely related fields with different goals: statistics draws population inferences from a sample, while machine learning finds generalizable predictive patterns.
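As a minimal sketch of learning as loss minimization, the code below fits a line by gradient descent on mean squared error. The choice of model, loss, learning rate, step count, and the toy data are all assumptions made for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on the training set.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the loss (1/n) * sum(error^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy training set generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)
```

On this noise-free data the fitted parameters converge to the generating values w = 2 and b = 1. With real data the minimum of the training loss is only an estimate, which is where generalization error comes in.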
The core objective of a learner is to generalize from its experience. Generalization in this context is the ability of a learning machine to perform accurately on new, unseen examples/tasks after having experienced a learning data set.  The bias-variance decomposition is one way to quantify generalization error. 
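For reference, the bias-variance decomposition for squared error can be written as follows. The notation is chosen here and is not from these notes: the data are assumed to follow y = f(x) + noise, where the noise has zero mean and variance sigma squared, f-hat is the learned model, and the expectation is over training sets and noise.

```latex
\mathbb{E}\left[\left(y - \hat{f}(x)\right)^2\right]
  = \left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2
  + \mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]
  + \sigma^2
```

The three terms are the squared bias, the variance of the learned model across training sets, and the irreducible noise.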
Approaches
Supervised learning: The computer is presented with example inputs and their desired outputs, given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs.
Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find the structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).
Reinforcement learning: A computer program interacts with a dynamic environment in which it must achieve a certain goal (such as driving a vehicle or playing a game against an opponent). As it navigates its problem space, the program is provided feedback that's analogous to rewards, which it tries to maximize.
A machine learning model is a type of mathematical model that, after being "trained" on a given dataset, can be used to make predictions or classifications on new data. During training, a learning algorithm iteratively adjusts the model's internal parameters to minimize errors in its predictions. Picking the best model for a task is called model selection. 
Although machine learning has been transformative in some fields, machine learning programs often fail to deliver expected results. Reasons for this are numerous: lack of (suitable) data, lack of access to data, data bias, privacy problems, badly chosen tasks and algorithms, wrong tools and people, lack of resources, and evaluation problems. Some other limitations:
bias
explainability
overfitting
Machine learning classification models can be validated by accuracy estimation techniques such as the holdout method, which splits the data into a training set and a test set and evaluates the trained model's performance on the test set. In comparison, K-fold cross-validation randomly partitions the data into K subsets and then runs K experiments, each using one subset for evaluation and the remaining K-1 subsets for training the model.
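The K-fold partitioning described above can be sketched in a few lines. This only produces the index splits; training and scoring a model on each split is left out. The function name, the seed parameter, and the example sizes are assumptions made for illustration.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    # Randomly shuffle the sample indices before partitioning.
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_splits(n_samples=10, k=5))
for train, test in splits:
    print("train:", sorted(train), "test:", sorted(test))
```

Every sample lands in the evaluation subset exactly once across the K experiments. The holdout method corresponds to keeping a single such train/test pair.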