From Machine Learning to Machine Reasoning

From statwiki
Revision as of 20:21, 5 November 2015 by Trttse (talk | contribs)

Introduction

Learning and reasoning are both essential abilities associated with intelligence, and both machine learning and machine reasoning have received considerable attention over the short history of computer science. The statistical nature of machine learning is now well understood, but the ideas behind machine reasoning are much more elusive. Converting ordinary data into a set of logical rules proves very challenging: searching the discrete space of symbolic formulas leads to combinatorial explosion (cite). Algorithms for probabilistic inference (cite) still suffer from unfavourable computational properties (cite). Tractable inference algorithms do exist, but they come at the price of reduced expressive capabilities in both logical and probabilistic inference.

Humans display neither of these limitations.

The ability to reason is not the same as the ability to make logical inferences. The way humans reason suggests the existence of a middle layer that is already a form of reasoning but is not yet formal or logical. Informal logic is attractive because we hope to avoid the computational complexity associated with combinatorial searches in the vast space of discrete logic propositions.

Deep learning and multi-task learning show that auxiliary tasks can be leveraged to help solve a task of interest. This idea can be interpreted as a rudimentary form of reasoning.

Auxiliary tasks

To see why an auxiliary task can be relevant, consider the task of identifying a person from face images. It remains expensive to collect and label millions of images representing the face of each subject with a good variety of positions and contexts. However, it is easier to collect training data for the slightly different task of telling whether two faces in images represent the same person or not (cite): two faces in the same picture are likely to belong to two different people, whereas two faces in successive video frames are likely to belong to the same person. These two tasks have much in common: the image analysis primitives, feature extractors, and part recognizers trained on the auxiliary task can help solve the original task.
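The labelling heuristic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(frame, x, y)` detection format, the function name `weak_pair_labels`, and the `max_shift` threshold used to match faces across successive frames are all hypothetical and do not come from the paper.

```python
from itertools import combinations

def weak_pair_labels(detections, max_shift=20.0):
    """Build weakly labelled face pairs for the auxiliary task.

    detections: list of (frame, x, y) face centres (hypothetical format).
    Returns a list of (i, j, same_person) index pairs.
    """
    pairs = []
    for i, j in combinations(range(len(detections)), 2):
        fi, xi, yi = detections[i]
        fj, xj, yj = detections[j]
        if fi == fj:
            # Two faces in the same picture: likely two different people.
            pairs.append((i, j, False))
        elif abs(fi - fj) == 1:
            # Successive frames: likely the same person, here assumed to
            # hold when the face centre has barely moved.
            shift = ((xi - xj) ** 2 + (yi - yj) ** 2) ** 0.5
            if shift <= max_shift:
                pairs.append((i, j, True))
    return pairs

# Two faces tracked over two successive frames.
detections = [(0, 100.0, 80.0), (0, 300.0, 90.0),
              (1, 104.0, 82.0), (1, 298.0, 88.0)]
pairs = weak_pair_labels(detections)
```

Pairs that match neither rule are simply left unlabelled, which reflects the point of the paragraph: the labels are cheap and plentiful but only "likely" correct.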

The accompanying figure illustrates a transfer learning strategy involving three trainable models. The preprocessor P computes a compact face representation from the image, the comparator D compares the representations of two images, and the classifier C labels the face. We first assemble two instances of the preprocessor P with one comparator D and train this model with the abundant labels available for the auxiliary task. Then we assemble another instance of P with the classifier C and train the resulting model using a limited number of labelled examples from the original task.
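The two training stages can be sketched as follows. This is a toy illustration, not the setup from the paper: the synthetic "faces" from `sample_face`, the linear form of P, the fixed squared-distance comparator standing in for a trainable D, the margin-based pair loss, and the nearest-centroid classifier standing in for C are all assumptions chosen to keep the example short and dependency-free.

```python
import random

rng = random.Random(0)
DIM, REP = 4, 2  # input and representation sizes (illustrative values)

def sample_face(person):
    # Hypothetical stand-in for an image: identity lives in coordinate 0,
    # the remaining coordinates are nuisance variation (pose, lighting).
    centre = 1.0 if person == 1 else -1.0
    return [centre + rng.gauss(0, 0.2)] + [rng.gauss(0, 1.0) for _ in range(DIM - 1)]

def P(W, x):
    # Preprocessor: linear map from the image to a compact representation.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def D(W, x1, x2):
    # Comparator (simplified, not trainable here): squared distance between
    # the representations produced by two instances of the same P.
    return sum((a - b) ** 2 for a, b in zip(P(W, x1), P(W, x2)))

def train_on_pairs(W, pairs, lr=0.01, margin=1.0, epochs=20):
    # Stage 1, auxiliary task: pull "same" pairs together and push
    # "different" pairs apart until their distance reaches `margin`.
    for _ in range(epochs):
        for x1, x2, same in pairs:
            dx = [a - b for a, b in zip(x1, x2)]
            diff = P(W, dx)  # P is linear, so P(x1) - P(x2) = P(x1 - x2)
            if same:
                sign = 1.0
            elif sum(v * v for v in diff) < margin:
                sign = -1.0
            else:
                continue
            for r in range(REP):
                for c in range(DIM):
                    W[r][c] -= lr * sign * 2.0 * diff[r] * dx[c]
    return W

def fit_classifier(W, labelled):
    # Stage 2, original task: P stays frozen; C is fitted on a handful of
    # labelled examples by averaging the representations of each class.
    sums, counts = {}, {}
    for x, y in labelled:
        z = P(W, x)
        acc = sums.setdefault(y, [0.0] * REP)
        for r in range(REP):
            acc[r] += z[r]
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def C(W, centroids, x):
    # Classifier: label of the nearest class centroid in representation space.
    z = P(W, x)
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(z, centroids[y])))

# Abundant weakly labelled pairs for the auxiliary task.
pairs = []
for _ in range(200):
    a, b = rng.randint(0, 1), rng.randint(0, 1)
    pairs.append((sample_face(a), sample_face(b), a == b))

W = [[rng.gauss(0, 0.5) for _ in range(DIM)] for _ in range(REP)]
W = train_on_pairs(W, pairs)

# Only a few labelled examples for the original task.
labelled = [(sample_face(p), p) for p in (0, 1, 0, 1)]
centroids = fit_classifier(W, labelled)
predictions = [C(W, centroids, sample_face(p)) for p in (0, 1, 0, 1)]
```

The design point to notice is that `W` is the only state shared between the two stages: it is learned entirely from pair labels, and the second stage reuses it unchanged, so the scarce identity labels are spent only on the small classifier.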

Reasoning revisited