Random Forest

I want to go through the Wikipedia series on machine learning and data mining. Data mining is the process of extracting and discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems.

Date Created:

References



Notes


Random forests, or random decision forests, are an ensemble learning method for classification, regression, and other tasks that works by building a multitude of decision trees during training. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the output is the average of the predictions of the individual trees. Random forests correct for decision trees' habit of overfitting to their training set.
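To make the two output rules concrete, here is a minimal sketch of the aggregation step only. It assumes the per-tree predictions are already available as a plain list; the function names are made up for these notes and do not come from any library.

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_votes):
    """Return the class predicted by the largest number of trees."""
    # most_common(1) gives [(label, count)] for the most frequent label.
    # Assumption: on a tie, the label that appears first in the list wins.
    return Counter(tree_votes).most_common(1)[0][0]


def aggregate_regression(tree_predictions):
    """Return the average of the numeric predictions of the trees."""
    return mean(tree_predictions)


if __name__ == "__main__":
    # Hypothetical predictions from three trees for a single input.
    print(aggregate_classification(["spam", "ham", "spam"]))
    print(aggregate_regression([2.0, 3.0, 4.0]))
```

How ties are broken is a design choice; this sketch simply takes whichever class was seen first.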

Trees that grow very deep tend to learn highly irregular patterns: they overfit their training sets, which means they have low bias but high variance. Random forests are a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the variance. The training algorithm for random forests applies the general technique of bootstrap aggregating, or bagging, to tree learners.
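The bagging idea can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the data is a list of (x, y) pairs with a single numeric feature, and each "tree" is a one-split regression stump standing in for a deep tree to keep the code short. All function names are hypothetical.

```python
import random
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows uniformly at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Fit a one-split regression tree on (x, y) pairs by squared error."""
    xs = sorted({x for x, _ in rows})
    best = None
    # Candidate thresholds are the midpoints between adjacent distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Every x in the sample is identical, so predict the overall mean.
        constant = mean(y for _, y in rows)
        return lambda x: constant
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_bagged_trees(rows, n_trees=25, seed=0):
    """Train each tree on its own bootstrap sample of the training set."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict(trees, x):
    """Average the predictions of all trees, which is the variance-reducing step."""
    return mean(tree(x) for tree in trees)


if __name__ == "__main__":
    # Toy data: y jumps from 0 to 10 once x reaches 5.
    data = [(x, 0.0) for x in range(5)] + [(x, 10.0) for x in range(5, 10)]
    forest = fit_bagged_trees(data)
    print(predict(forest, 1), predict(forest, 8))
```

Each tree sees a slightly different resampling of the same training set, so their individual errors partly cancel when averaged. This shows only the bagging step described above, not a full random forest implementation.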


