Difference between revisions of "from Machine Learning to Machine Reasoning"
== Reasoning Systems ==
Revision as of 21:23, 5 November 2015
Learning and reasoning are both essential abilities associated with intelligence, and both machine learning and machine reasoning have received considerable attention over the short history of computer science. The statistical nature of machine learning is now understood, but the ideas behind machine reasoning are much more elusive. Converting ordinary data into a set of logical rules proves to be very challenging: searching the discrete space of symbolic formulas leads to combinatorial explosion (cite). Algorithms for probabilistic inference (cite) still suffer from unfavourable computational properties (cite). Tractable inference algorithms do exist, but they come at the price of reduced expressive capabilities, in both logical and probabilistic inference.
Humans display neither of these limitations.
The ability to reason is not the same as the ability to make logical inferences. The way humans reason suggests the existence of a middle layer that is already a form of reasoning but is not yet formal or logical. Such informal reasoning is attractive because we hope to avoid the computational complexity associated with combinatorial searches in the vast space of discrete logic propositions.
Deep learning and multi-task learning show that auxiliary tasks can be leveraged to help solve a task of interest. This idea can be interpreted as a rudimentary form of reasoning.
To see why an auxiliary task is relevant, consider the task of identifying a person from face images. It remains expensive to collect and label millions of images representing the face of each subject with a good variety of positions and contexts. However, it is easier to collect training data for a slightly different task: telling whether two faces in images represent the same person or not (cite). Two faces in the same picture are likely to belong to two different people; two faces in successive video frames are likely to belong to the same person. These two tasks have much in common: image analysis primitives, feature extractors and part recognizers trained on the auxiliary task can help solve the original task.
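The two labelling heuristics can be sketched in a few lines of Python. This is only an illustration: the function name `weak_pairs`, the face identifiers, and the restriction of the "successive frames" rule to frames containing exactly one face are assumptions made here, not details from the paper.

```python
from itertools import combinations

def weak_pairs(frames):
    """Build (face_a, face_b, same_person) triples from frames in temporal order.

    `frames` is a list of lists of face identifiers (hypothetical format).
    """
    pairs = []
    # Heuristic 1: two faces in the same picture are likely different people.
    for faces in frames:
        for a, b in combinations(faces, 2):
            pairs.append((a, b, False))
    # Heuristic 2: faces in successive frames are likely the same person.
    # Assumption: only applied when both frames contain exactly one face,
    # so that no face tracking is needed.
    for prev, cur in zip(frames, frames[1:]):
        if len(prev) == 1 and len(cur) == 1:
            pairs.append((prev[0], cur[0], True))
    return pairs

frames = [["f0"], ["f1"], ["f2", "f3"]]
pairs = weak_pairs(frames)
```

No identity labels are needed to produce these pairs, which is what makes the auxiliary task cheap to supervise.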
The figure below illustrates a transfer learning strategy involving three trainable models. The preprocessor P computes a compact face representation of the image, the comparator D compares the representations of two images, and the classifier C labels the face. We first assemble two instances of the preprocessor P and one comparator D and train this model with abundant labels for the auxiliary task. Then we assemble another instance of P with the classifier C and train the resulting model using a limited number of labelled examples from the original task.
Little attention has been paid to the rules that describe how to assemble trainable models that perform specific tasks. However, these composition rules play an extremely important role, as they describe algebraic manipulations that let us combine previously acquired knowledge in order to create a model that addresses a new task.
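A minimal sketch of how the same preprocessor P can be assembled into both models is given below. The concrete choices (a linear P, a squared-distance threshold for D, a nearest-prototype C, and all numeric values) are assumptions chosen for brevity, and training is omitted; only the composition and the sharing of P are shown.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

class Preprocessor:
    """P: maps an image (list of floats) to a compact representation."""
    def __init__(self, weights):
        self.weights = weights  # one row of weights per output feature

    def __call__(self, image):
        return [sum(w * x for w, x in zip(row, image)) for row in self.weights]

class Comparator:
    """D: decides whether two representations show the same person."""
    def __init__(self, threshold):
        self.threshold = threshold

    def __call__(self, rep_a, rep_b):
        return sq_dist(rep_a, rep_b) < self.threshold

class Classifier:
    """C: returns the label of the closest stored prototype."""
    def __init__(self, prototypes):
        self.prototypes = prototypes  # dict: label -> representation

    def __call__(self, rep):
        return min(self.prototypes, key=lambda k: sq_dist(rep, self.prototypes[k]))

def same_person_model(P, D):
    """Auxiliary-task model: two uses of the same P feeding one D."""
    return lambda img_a, img_b: D(P(img_a), P(img_b))

def identify_model(P, C):
    """Original-task model: the same P followed by C."""
    return lambda img: C(P(img))

P = Preprocessor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # toy 3-pixel images
aux = same_person_model(P, Comparator(threshold=0.5))
main = identify_model(P, Classifier({"alice": [1.0, 0.0], "bob": [0.0, 1.0]}))

same = aux([1.0, 0.0, 9.0], [1.0, 0.0, -4.0])
label = main([0.9, 0.1, 5.0])
```

Because `aux` and `main` hold a reference to the same `P` object, anything P learns while training on the auxiliary task is immediately available to the original task.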
We now draw a bold parallel: "algebraic manipulation of previously acquired knowledge in order to answer a new question" is a plausible definition of the word "reasoning".
Composition rules can be described with very different levels of sophistication. For instance, graph transformer networks (depicted in the figure below) (cite) construct specific recognition and training models for each input image using graph transduction algorithms. The specification of the graph transducers should then be viewed as a description of the composition rules.
Graphical models describe the factorization of joint probability distributions into elementary conditional distributions with specific independence assumptions. The probabilistic rules then induce an algebraic structure on the space of conditional probability distributions, describing relations in an arbitrary set of random variables.
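As an illustration of such a factorization, the following sketch builds a joint distribution over three binary variables from elementary conditional distributions and then answers two questions by summation. The variables (rain, sprinkler, wet grass), the independence of rain and sprinkler, and every probability value are made up for this example.

```python
from itertools import product

# Elementary distributions (all numbers are illustrative assumptions).
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.5, False: 0.5}
# P(wet = True | rain, sprinkler)
p_wet_given = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) = P(rain) * P(sprinkler) * P(wet | rain, sprinkler)."""
    p_w = p_wet_given[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_w if wet else 1.0 - p_w)

# Sanity check: the factorized joint sums to one.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))

# Marginal P(wet = True), obtained by summing out rain and sprinkler.
p_wet = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))

# Posterior P(rain = True | wet = True), obtained with the same joint.
p_rain_given_wet = sum(joint(True, s, True) for s in [True, False]) / p_wet
```

The same three tables answer different questions (a marginal and a posterior) purely through the rules of probability, which is the algebraic structure referred to above.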
We are no longer fitting a simple statistical model to data. Instead, we are dealing with a more complex object consisting of (1) an algebraic space of models, and (2) composition rules that establish a correspondence between the space of models and the space of questions of interest. We call such an object a "reasoning system".
Reasoning systems vary in expressive power, predictive abilities and computational requirements. A few examples include:
- First-order logic reasoning
- Probabilistic reasoning
- Causal reasoning