A short history of Artificial Intelligence and Neural Networks

From Ancient Greece to Deep Learning

January 23, 2018, by nschaetti

From Ancient Greece to Bletchley Park

Artificial Intelligence is a scientific domain at the crossroads of several fields: computer science, neuroscience, linguistics, cybernetics and mathematics. It raises many issues, as much technical as philosophical. Indeed, modern computers are able to crunch enormous amounts of data much faster than any human, but are unable to solve many problems that we, humans, face every day, such as real-time object and face recognition in a crowded scene, emotion recognition, and above all, learning and understanding complex languages. What can we say, then, of more advanced skills such as artistic or scientific creativity?

As shown in the infographic below, Aristotle (384-322 B.C.) was the first to lay the foundations of this field by defining a set of precise laws governing the rational part of the mind, and created a formal system which, in principle, allows the automatic generation of conclusions from initial premises.

Artificial Intelligence Timeline

The study of logic goes back to antiquity, but it is to George Boole (1847) that we owe its mathematical development, with his definition of propositional, or Boolean, logic. Then, in 1879, Gottlob Frege extended it to create first-order logic.

Two things are necessary in Artificial Intelligence: a deep understanding of the mechanisms of intelligence, and a machine powerful enough to implement them. Modern computers as we know them emerged in the 40s with the work of Alan Turing. Mathematically first, with his definition of the Turing Machine and of what is or is not computable with such machines. And practically then, with the creation in 1940 of one of the first operational computers, built to break Nazi codes. But one of his most famous works in the field of Artificial Intelligence appeared in his 1950 article “Computing Machinery and Intelligence”, where he introduced the Turing test, machine learning, genetic algorithms and reinforcement learning. At the same time, Warren McCulloch and Walter Pitts created the first artificial neuron model.

The ‘Bombe’ code-breaking machine, 1943.

Artificial Intelligence’s birthplace can be located at Dartmouth College in the middle of the 50s. John McCarthy and Marvin Minsky organised a workshop there in 1956, bringing together several researchers interested in the fields of automata, neural networks and the study of intelligence. The beginnings of AI saw the appearance of a current based on symbolic logic and the application of deductive systems, named the computational model, and several programs able to solve problems and find proofs of theorems were created in the following years.

In the 60s, another model made its appearance: the connectionist model, where intelligence emerges from networks of simple interconnected units, of which our brain’s neurons are a perfect example. In 1962, Frank Rosenblatt defined the perceptron, the simplest kind of artificial neuron and a linear classifier.

But the early enthusiasm, characterised by the huge confidence of AI researchers, was replaced by a sudden return to reality in the mid-60s. First, a report by a consultative committee concluded that no scientific progress had been made, and secondly, no concrete applications were on the horizon. The result was the total end of American funding of AI research. Then, it was the discovery that most problems AI researchers were working on were intractable that pushed the UK to stop all funding. Finally, the demonstration by Minsky and Papert that the perceptron was unable to learn simple functions, such as the exclusive OR, definitively put an end to the early enthusiasm. This result reduced funding in the field of artificial neural networks to nothing, despite the invention of the back-propagation algorithm in the late 60s.

The mid-80s were the theatre of the comeback of connectionist models, thanks to the rediscovery of the back-propagation algorithm and its successful application to many machine learning problems. These results revived the interest of scientists and public authorities.

In the mid-90s, intelligent agents appeared, helped by the growing use of the World Wide Web, which allowed the implementation of software bots and search engines. Research continued on the connectionist model and saw the appearance of new kinds of networks, such as Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN), which were the precursors of Deep Learning.

In the early 2000s, with the availability of enormous databases, Reservoir Computing and Deep Learning revived the interest of the scientific community in neural networks. Machine Learning shifted the goals of AI towards tractable problems which can quickly be put into practice.

Today, research in the field of AI and Machine Learning attracts the attention of many researchers, as well as that of industry leaders such as Google, Apple or Facebook, which are in fierce competition to create smarter products. But some research is oriented towards new areas. Neural networks are massively parallel, and the computation is distributed over an enormous number of basic units, close to a hundred billion in the case of the human brain. This architecture allows not only the processing of information but also its storage, and this distribution of work also provides a kind of fault tolerance.

We can then ask ourselves whether a new computing paradigm, a new kind of information processing, is necessary: massively parallel, faster and less energy-consuming, more similar to what we find in nature and to natural intelligence. Many avenues are being explored, such as physical, photonic, analog and quantum computers, some with neuromorphic capabilities.

Artificial Neural Networks

Artificial neural networks are statistical learning models, inspired by biological neural networks, having the characteristic of being universal approximators.

In 1943, Warren McCulloch and Walter Pitts were the first to define a mathematical model of computation similar to neural networks, putting the neuron at the centre of their model as the basic unit for processing information in the brain. Pitts introduced the brain-as-universal-computer hypothesis and developed this idea with McCulloch in the article “A Logical Calculus of the Ideas Immanent in Nervous Activity”. Their work laid the foundations of this field of study and showed that a network made of artificial neurons has the same computing power as the Turing Machine.
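To make the idea concrete, here is a minimal sketch of a McCulloch-Pitts style threshold unit in Python; the weights and thresholds are illustrative choices, not values from the original paper, but they show how such binary units can implement elementary logic gates.

```python
# Minimal sketch of a McCulloch-Pitts style threshold unit (illustrative values).
def mcp_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

# Logical AND: both inputs must be active to reach the threshold of 2.
print(mcp_neuron([1, 1], weights=[1, 1], threshold=2))  # 1
print(mcp_neuron([1, 0], weights=[1, 1], threshold=2))  # 0

# Logical OR: a single active input is enough to reach the threshold of 1.
print(mcp_neuron([0, 1], weights=[1, 1], threshold=1))  # 1
```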

In 1958, Frank Rosenblatt created a linear classifier model based on supervised learning, named the perceptron, which can determine whether a sample belongs to a class A or B. The perceptron bases its predictions on a linear function combining its inputs x_i with a set of weights w_i and returning 1 if this combination exceeds a threshold. More formally,

(1)   \begin{equation*} f(x) = \begin{cases} 1 & \text{if } \sum_{i}{w_i x_i} + b > 0\\ 0 & \text{otherwise} \end{cases} \end{equation*}
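As a rough illustration of how such a classifier can be trained, the snippet below implements the classical perceptron learning rule with NumPy on a toy, linearly separable problem (the logical AND); the learning rate, number of epochs and data are arbitrary choices made for the example.

```python
import numpy as np

def perceptron_train(X, y, lr=0.1, epochs=20):
    """Learn w and b such that f(x) = 1 if w.x + b > 0, else 0."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            pred = 1 if np.dot(w, x_i) + b > 0 else 0
            # Perceptron rule: adjust the hyperplane only when the prediction is wrong.
            w += lr * (y_i - pred) * x_i
            b += lr * (y_i - pred)
    return w, b

# Logical AND is linearly separable, so the perceptron converges on it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = perceptron_train(X, y)
print([1 if np.dot(w, x) + b > 0 else 0 for x in X])  # [0, 0, 0, 1]
```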

Perceptron model

Although promising, it was quickly demonstrated that this model is not powerful enough to recognise some classes. In 1969, Marvin Minsky and Seymour Papert raised two limits of the perceptron: first, its inability to compute the XOR function (see the figure above); secondly, the lack of computing power in the 60s required to train larger networks. This result, and the failures of other AI methods, marked the beginning of what researchers call the AI Winter, which lasted from the late 70s to the late 80s, when funding was scarce. Connectionist methods, also called Parallel Distributed Processing, were temporarily abandoned.
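To see the limitation concretely, one can run the same learning rule on the XOR truth table. The sketch below reuses the illustrative `perceptron_train` function and NumPy import from the previous example; since XOR is not linearly separable, no choice of weights and bias can ever classify all four points correctly.

```python
# XOR truth table: no straight line separates the 1s from the 0s.
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])

w, b = perceptron_train(X_xor, y_xor, epochs=100)
print([1 if np.dot(w, x) + b > 0 else 0 for x in X_xor])
# At least one of the four predictions is always wrong, however long we train.
```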

Advances in the field of neural networks slowed down until computers became powerful enough and the back-propagation algorithm, which became widely used in the mid-80s, was rediscovered. This algorithm computes the gradient of an objective function with respect to the network’s parameters. The gradient is then used by the optimisation method to update the weights in order to minimise the error of the objective function.
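As a rough sketch of the idea, the snippet below trains a tiny two-layer network on XOR with plain NumPy, computing the gradients by hand through the chain rule and applying gradient descent; the architecture, learning rate and number of iterations are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 4 sigmoid units and one sigmoid output unit (illustrative sizes).
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for step in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradient of the squared error w.r.t. each parameter (chain rule).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent update.
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))  # typically converges close to [0, 1, 1, 0]
```

Unlike the single perceptron above, this small multi-layer network can represent and learn the XOR function.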

Nils Schaetti is a doctoral researcher in Switzerland specialised in machine learning and artificial intelligence.
