Machine Super Intelligence
I am reading this thesis because it is on the list of roughly 30 papers that Ilya Sutskever recommended to John Carmack to learn what really matters in machine learning / AI today. The goal of the thesis is to explore some of the open issues surrounding universal artificial intelligence.
Reference: Link to PDF
0.1 References
0.2 Related
- Inductive reasoning is any of various methods of reasoning in which broad generalizations or principles are derived from a body of observations. The truth of the conclusion of an inductive argument is at best probable, based upon the evidence given.
- Solomonoff’s theory of inductive inference proves that, under its common sense assumptions (axioms), the best possible scientific model is the shortest algorithm that generates the empirical data under consideration. In addition to the choice of data, other assumptions are that, to avoid the post-hoc fallacy, the programming language must be chosen prior to the data, and that the environment being observed is generated by an unknown algorithm. This is also called the theory of induction. (A toy sketch of this Occam-style weighting follows this list.)
- Incomputable - In computability theory and computational complexity theory, an undecidable problem is a decision problem for which it is proved to be impossible to construct an algorithm that always leads to a correct yes-or-no answer. The halting problem is an example: it can be proven that there is no algorithm that correctly determines whether an arbitrary program eventually halts when run.
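To make Solomonoff's idea a little more concrete, here is a minimal toy sketch (my own illustration, not from the thesis): each hypothesis is a tiny "program", every hypothesis consistent with the data so far is kept (Epicurus), each gets prior weight 2^-length (Occam), and the next bit is predicted by a weighted vote. Real Solomonoff induction mixes over all programs of a universal Turing machine and is incomputable; this finite stand-in only illustrates the weighting.

```python
# Toy sketch of Occam-weighted induction. Real Solomonoff induction mixes over
# ALL programs of a universal Turing machine and is incomputable; this finite
# stand-in only illustrates the 2^-length weighting and the
# "keep everything consistent" step.

# Hypothetical hypothesis class: key = the hypothesis's "program" (its length
# plays the role of description length), value = the sequence it generates.
hypotheses = {
    "0":    lambda n: "0" * n,          # "print zeros"        (length 1, favoured)
    "01":   lambda n: ("01" * n)[:n],   # "print 01 repeating" (length 2)
    "1111": lambda n: "1" * n,          # "print ones"         (length 4, verbose)
}

def prior(program: str) -> float:
    """Shorter programs get exponentially more prior weight."""
    return 2.0 ** -len(program)

def predict_one(observed: str) -> float:
    """Posterior-weighted probability that the next bit is '1'."""
    # Epicurus: keep every hypothesis consistent with the data so far.
    consistent = {p: g for p, g in hypotheses.items() if g(len(observed)) == observed}
    total = sum(prior(p) for p in consistent)
    if total == 0.0:
        return 0.5  # nothing fits: fall back to ignorance
    # Occam: the shortest consistent hypotheses dominate the vote.
    ones = sum(prior(p) for p, g in consistent.items()
               if g(len(observed) + 1)[-1] == "1")
    return ones / total

print(predict_one(""))     # ~0.08: with no data, the shortest program ("print zeros") dominates
print(predict_one("111"))  # 1.0: only the "print ones" hypothesis survives
```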
0.3 Preface
This thesis concerns the optimal behavior of agents in unknown computable environments, also known as universal artificial intelligence. These theoretical agents are able to learn to perform optimally in many types of environments. That universal artificial intelligence can be defined at all rests on the assumption of infinite computational resources, which is why it cannot be implemented in the real world. The main use of universal artificial intelligence theory thus far has been as a theoretical tool with which to mathematically study the properties of machine super intelligence.
The foundations of universal intelligence date back to the origins of philosophy and inductive inference. Universal artificial intelligence proper started with the work of Ray J. Solomonoff in the 1960s. Solomonoff was considering the problem of predicting binary sequences. What he discovered was a formulation for an inductive inference system that can be proven to very rapidly learn to optimally predict any sequence that has a computable probability distribution. Solomonoff’s model is a kind of grand unified theory of inductive inference. The main theoretical limitation of Solomonoff induction is that it only addresses the problem of passive inductive learning: whether the agent’s predictions are correct has no effect on the future observed sequence.
In the more general active case the agent is able to take actions which may affect the observed future. In the late 1990s, Marcus Hutter extended Solomonoff’s passive inductive model to the active case by combining it with sequential decision theory. This produced a theory of universal agents, and in particular a universal agent for a very general class of interactive environments, known as the AIXI agent. Hutter was able to prove that the behavior of universal agents converges to optimal in any setting where this is at all possible for a general agent, and that these agents are Pareto optimal in the sense that no other agent can perform at least as well in all environments and strictly better in at least one. These are the strongest known results for a completely general purpose agent. Given that AIXI has such generality and extreme performance characteristics, it can serve as a theoretical model of a super intelligent agent.
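For reference, AIXI's action selection (as I understand Hutter's formulation; the notation here is my paraphrase) is an expectimax over all programs $q$ of a universal Turing machine $U$, each weighted by $2^{-\ell(q)}$:

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[ r_k + \cdots + r_m \big] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

Here $a$, $o$ and $r$ are actions, observations and rewards, $m$ is the planning horizon, and $\ell(q)$ is the length of program $q$; the inner sum is the Solomonoff-style prior weight of every environment program consistent with the interaction history.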
The goal of this thesis is to explore some of the open issues surrounding universal artificial intelligence.
0.4 Nature and Measurement of Intelligence
Intelligence measures an agent’s ability to achieve goals in a wide range of environments.
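The thesis later formalizes this one-sentence definition as a universal intelligence measure; as I understand it, the measure has roughly the form

$$\Upsilon(\pi) := \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi,$$

where $E$ is a space of computable environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$ (so simpler environments carry more weight, echoing Occam's Razor), and $V_\mu^\pi$ is the expected total reward agent $\pi$ achieves in environment $\mu$.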
There are many properties that a good test of intelligence should have. One important property is that the test should be repeatable, in the sense that it consistently returns about the same score for a given individual. An intelligence test should be valid, in the sense that it appears to be testing what it claims to be testing for. A test should also have predictive power. Static tests measure an individual’s knowledge and ability to solve one-off problems; they do not directly measure the ability to learn and adapt over time. What is needed is a more direct test of an individual’s ability to learn and adapt: a so-called "dynamic test".
The Turing test: if human judges cannot effectively discriminate between a computer and a human through teletyped conversation, then we must conclude that the computer is intelligent.
0.5 Universal Artificial Intelligence
Inductive inference is the process by which one observes the world and then infers the causes behind what has been observed.
- Epicurus’ principle of multiple explanations: keep all hypotheses that are consistent with the data.
- Occam’s Razor: among all hypotheses consistent with the observations, the simplest is the most likely.
- Bayes’ Rule:

$$P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)}$$

This equation is known as Bayes’ Rule. It allows one to compute the probability of different hypotheses $h$ given the observed data $D$ and a prior distribution $P(h)$ over the hypotheses. The probability of the observed data, $P(D)$, is known as the evidence. $P(h)$ is known as the prior distribution, as it is the distribution over the space of hypotheses before taking the observed data into account. The distribution $P(h \mid D)$ is known as the posterior distribution, as it is the distribution after taking the data into account.
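As a quick sanity check of the rule, here is a small worked example (my own, not from the thesis): three made-up coin-bias hypotheses, a uniform prior, and a short sequence of observed flips.

```python
# Worked Bayes' rule example: infer a coin's bias from observed flips.
# The hypotheses and prior below are made up for illustration.

hypotheses = {          # h -> P(heads) under that hypothesis
    "fair":        0.5,
    "heads-heavy": 0.9,
    "tails-heavy": 0.1,
}
prior = {h: 1.0 / len(hypotheses) for h in hypotheses}  # uniform prior P(h)

def likelihood(data: str, p_heads: float) -> float:
    """P(D | h): probability of the flip sequence under a given bias."""
    prob = 1.0
    for flip in data:
        prob *= p_heads if flip == "H" else (1.0 - p_heads)
    return prob

def posterior(data: str) -> dict:
    """P(h | D) = P(D | h) P(h) / P(D) for every hypothesis h."""
    joint = {h: likelihood(data, p) * prior[h] for h, p in hypotheses.items()}
    evidence = sum(joint.values())                      # P(D)
    return {h: j / evidence for h, j in joint.items()}

print(posterior("HHHHTH"))  # most of the posterior mass shifts to "heads-heavy"
```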
This is not immediately beneficial to me and some parts of this are going over my head. I will come back to it when I need to.