When a child is asked to sort out and tidy up his toys, he finds it an easy task because he knows how to tidy up and also where everything belongs. After all, he has done the same thing hundreds of times. The child’s parents now ask him to help them with a new task in future – to tidy up the kitchen and sort out the rubbish. Despite this being a new task, the child learns quickly because he already understands the concept of sorting things out. He only has to work out where everything should go. A robot whose “brain” is a neural network can also learn to tidy up toys. If you then ask the robot to sort the rubbish, however, it has to start all over again from scratch.
Neural networks are designed to behave like the brain. Much as a human learns by strengthening synapses, a network learns by adjusting its parameters, which play the role of synapses and are reinforced every time a training session is completed successfully. The stronger these connections become, the more reliably the network functions.
Yet these learnt skills are forgotten again when faced with a new task. “Although deep learning networks are particularly good at learning to complete specific tasks, managing to extract the underlying structures used to find the solution and transfer them to other tasks has remained an open area of research until now”, explains Daniel Tanneberg, a doctoral candidate in the Intelligent Autonomous Systems (IAS) Group at TU Darmstadt. He and his colleagues have now developed a neural computer architecture that is designed to do exactly this. They describe their system – the Neural Harvard Computer (NHC) – in a paper published in the journal Nature Machine Intelligence.
Transferring strategies from one problem to another
“We know from learning research that the ability to transfer strategies used for one problem to another is a key feature of intelligent behaviour”, says Tanneberg. Therefore, a neural network must be designed to be as generalised as possible so that it can handle not only different data but also different tasks. It is only then that it actually behaves intelligently.
The NHC has a memory-augmented, network-based architecture. The researchers have augmented a traditional neural network with multiple modules, such as external memory, and this makes it possible to introduce another level of abstraction. The information flows are split and the algorithmic operations are decoupled from the data manipulations. The network separates what it has learnt about specific data from the general strategies it has learnt. For a robot, one level would be differentiating between the toys and another sorting them. Until now, both of these levels were encoded together in the synaptic weights of a neural network and were thus not separated.
Using a set of eleven tasks, the researchers have shown that the NHC can reliably learn algorithmic solutions with strong generalisation and transfer them to any task configuration. “This has the huge advantage that the network is able to master new tasks more quickly because only the data-specific operations need to be adapted”, says Tanneberg. This saves resources because neural networks often require a large amount of computing power and many days of training.
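The separation between a general strategy and data-specific operations can be illustrated with a loose analogy in ordinary Python. This is only a sketch of the idea, not the NHC itself: nothing here is learnt, and all names (`tidy_up`, `toy_goes_before`, `rubbish_goes_before`, `BIN_ORDER`) are hypothetical and chosen for this example. The sorting strategy never inspects the items directly; it only calls a data-specific comparison that is supplied from outside.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def tidy_up(items: List[T], goes_before: Callable[[T, T], bool]) -> List[T]:
    """General strategy (an insertion sort) that knows nothing about the items."""
    ordered: List[T] = []
    for item in items:
        pos = 0
        # Walk forward until the data-specific rule says the item belongs here.
        while pos < len(ordered) and not goes_before(item, ordered[pos]):
            pos += 1
        ordered.insert(pos, item)
    return ordered


# Data-specific operation for the first task: toys are ordered by name.
def toy_goes_before(a: str, b: str) -> bool:
    return a < b


# Data-specific operation for the new task: rubbish is ordered by bin.
# The bin order is an assumption made up for this illustration.
BIN_ORDER = {"paper": 0, "glass": 1, "plastic": 2}


def rubbish_goes_before(a: Tuple[str, str], b: Tuple[str, str]) -> bool:
    return BIN_ORDER[a[1]] < BIN_ORDER[b[1]]


# The same strategy is reused; only the comparison changes between tasks.
toys = tidy_up(["train", "ball", "doll"], toy_goes_before)
rubbish = tidy_up(
    [("bottle", "glass"), ("newspaper", "paper"), ("bag", "plastic")],
    rubbish_goes_before,
)
print(toys)
print(rubbish)
```

Moving from toys to rubbish leaves `tidy_up` unchanged, which mirrors the point made in the quote: once the strategy is separated out, only the data-specific part has to be adapted for a new task.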
Another advantage is that these types of networks are easier to understand because they offer greater insight into the learning process and into their behaviour after learning. The new architecture opens up the possibility of discovering new and unexpected strategies that the network adapts. Ultimately, it wouldn’t do us humans any harm to learn to behave even more intelligently.