The danger of deep networks

Posted by Giacomo Indiveri at February 27, 2015

I start from the following assumptions:

  1. our goal is to design a neuromorphic computing system that processes sensory signals (e.g. audio) for basic decision making (e.g., as the outcome of a recognition or classification task)
  2. we will attempt to minimize latency and power consumption
  3. we will not have access to additional external resources (e.g. "the cloud")

If those assumptions are true, then I believe it is very dangerous to compare our results to classical benchmarks, such as those used for machine vision or speech recognition. It might be useful to use them internally, to evaluate and quantify the performance of our system as we make progress. But it would make no sense to compare what a few milligrams of silicon burning a few microwatts of power can achieve with what is computed by GPU farms, or server farms that weigh hundreds of kilograms and burn tens of megawatts.

There is a need for new types of benchmarks. Think, for example, of a small drone with stringent payload constraints that has to navigate to execute a task while dodging tennis balls that teenagers throw at it. That is the type of benchmark that would make sense to use. There, I believe, our technology can compete with Intel processors or Texas Instruments DSPs.

Along the same lines, I don't think we can compete with standard approaches based on deep-belief networks. The types of neural networks we can implement are much more sophisticated than deep-belief networks in terms of the dynamics and non-linear operators they can use. Applying the exact same methodology used on standard von Neumann machines to get some computation out of our silicon neural networks is likely to produce sub-optimal results. We should exploit the power of our recurrent neural networks (and so what if they are shallow!) and the dynamics that our computing elements can express.

This is a research project and not a competition with Bing, Facebook or Google. We should feel free to explore new and dramatically different architectures. Even a simple bee, with fewer neurons than the number of transistors we will put on our chips, is an existence proof that spiking neural networks can excel at the benchmarks I mentioned above. The same holds for crickets doing auditory localization, and for many other examples of "simple" neural systems.

So, in summary, I think we should use classical benchmarks internally, to quantify progress, but we should avoid using them in the proposal deliverables. Note that the Microsoft paper on mobile computing applications in our paper repository does not use them either...

Re: The danger of deep networks

Posted by Herbert Jaeger at February 27, 2015

I totally agree with Giacomo (although this spoils his intention to spur a hot discussion!). The current standard benchmarks in machine learning are run under quite different premises from what our project is aiming at, and we can't hope to excel at them at face value. But I don't think we actually compete with the current Deep Learning community. Their application scope, while impressive, is actually quite limited: essentially it boils down to what machine learning people would call supervised regression tasks. But beyond those there are entire classes of tasks, and aspects of tasks, that are important but currently a bit out of fashion in machine learning:

- online signal processing

- multiple-timescale memory functionalities

- "lifelong learning" capabilities, that is, extending a learnt model during exploitation when good new training data come along

- reactive systems that are tied into a sense-act loop (all kinds of robot applications and human-computer interfacing)

Task aspects that are unconventional in current mainstream ML, but that we must address, are robustness w.r.t. noise, low parameter precision, parameter drift (I don't know whether that's an issue), local hardware failures, and variance across different copies of a chip. If we show that we can address such aspects, then we are really innovative, not riding the mainstream. The proposal should contain some text explaining that we don't do deep learning in the first place (this needs to be stated explicitly these days, when almost everybody equates ML with deep learning). And we have to think of appropriate benchmarks of our own making (but we should scan the literature for existing benchmarks targeting noise robustness etc.).
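
To make this concrete, here is a minimal Python sketch (my own construction, not something from the project) of the kind of robustness check meant above: a generic linear readout is degraded by weight quantization and input noise, and we measure how often its decisions still agree with the clean readout. The weights and data are random placeholders; only the mechanics of the test are the point.

import numpy as np

rng = np.random.default_rng(0)

# Placeholder "trained" readout and held-out features (random, for illustration only).
W = rng.normal(size=(10, 200))           # readout weights: 10 classes, 200 features
X = rng.normal(size=(500, 200))          # 500 held-out feature vectors
y_clean = np.argmax(X @ W.T, axis=1)     # decisions of the clean readout

def agreement(W_used, X_used):
    """Fraction of inputs on which the degraded readout agrees with the clean one."""
    return np.mean(np.argmax(X_used @ W_used.T, axis=1) == y_clean)

def quantize(w, bits):
    """Round weights to a symmetric fixed-point grid with the given number of bits."""
    scale = np.max(np.abs(w))
    levels = 2 ** (bits - 1) - 1
    return np.round(w / scale * levels) / levels * scale

for bits in (8, 4, 2):                   # low parameter precision
    print(bits, "bits:", agreement(quantize(W, bits), X))
for sigma in (0.1, 0.5, 1.0):            # robustness w.r.t. input noise
    noisy_X = X + sigma * rng.normal(size=X.shape)
    print("noise sigma", sigma, ":", agreement(W, noisy_X))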

Re: The danger of deep networks

Posted by Bernabe Linares-Barranco at March 1, 2015

There is one thing I would like to clarify/comment on regarding spiking nets, in relation to a sentence from Herbert in his email of Feb 21st: "spiking dynamics need to be integrated in time to yield such analog values". Yes, there is a very obvious way to map from analog-valued continuous neurons to spiking neurons, which is to integrate the spikes to obtain the analog value. But this is highly inefficient and yields an explosion of spikes: for example, with 8-bit precision for the analog values, each neuron needs to emit between 0 and 255 spikes in a given time interval.
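
As a rough illustration of that spike explosion (my own sketch, with assumed numbers): naive rate coding at 8-bit precision needs up to 255 spikes per neuron per time window, whereas the feature-based coding described in the next paragraph needs roughly one spike per neuron.

def rate_code_spike_cost(values, bits=8):
    """Total spikes needed to transmit `values` as per-neuron spike counts."""
    max_count = 2 ** bits - 1          # 255 distinct levels for 8-bit precision
    # Each analog value in [0, 1] maps to a spike count in [0, max_count].
    return sum(round(v * max_count) for v in values)

# Hypothetical layer of 10,000 neurons, each transmitting a mid-range value.
layer = [0.5] * 10_000
print(rate_code_spike_cost(layer))     # ~1.3 million spikes per time window
print(len(layer))                      # vs. ~10,000 spikes if each neuron signals
                                       # its feature with a single spike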

To exploit the advantages of a spiking signal representation efficiently, it is useful to think of each neuron as receiving spikes from a large receptive field of neurons (about 10,000 in biology), while each neuron usually needs to contribute just one spike to signal the presence of a given "feature". Each neuron therefore receives a collection of "features" from its receptive field within a given time window (usually a few milliseconds) and is tuned to detect a more compound "feature". One can think of this as a "coincidence detector". This way, spike encoding can be very efficient, and it can also result in what I like to call the "pseudo-simultaneity" property, which can be very interesting for recurrent topologies where feedback is used.
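
A toy sketch of such a coincidence detector (my own illustration; the threshold and time window are assumed values, not biological constants):

def coincidence_detector(input_spike_times, threshold=5, window_ms=3.0):
    """Return the output spike time, or None if the feature is not detected.

    The neuron fires a single spike as soon as `threshold` input spikes from
    its receptive field fall within a sliding window of `window_ms`.
    """
    times = sorted(input_spike_times)
    for i, t in enumerate(times):
        # Count the input spikes inside the window ending at time t.
        in_window = [s for s in times[: i + 1] if t - s <= window_ms]
        if len(in_window) >= threshold:
            return t               # one output spike signals the compound feature
    return None

# Example: a handful of spatially aligned edge events arriving within ~2 ms
# trigger detection immediately, without waiting for a full frame.
print(coincidence_detector([0.4, 0.9, 1.1, 1.5, 1.8, 2.0, 2.3]))   # -> 1.8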

Let me use the following figure to explain this.

[Figure: timing comparison of a frame-based vs. an event-based (spiking) 5-stage vision processing pipeline]

This represents a feedforward 5-stage vision processing system. The top part is a classical frame-based system, where each stage can be implemented using analog-graded neurons. Each frame is processed stage after stage, requiring one frame-processing time per stage (here assumed to be 1 ms). Therefore, if the sensor also requires 1 ms to acquire an image, one would obtain "recognition" at time 6 ms.

In a spiking system with a spiking retina (bottom part), the sensor already provides spikes while things are happening in reality (well, with a typical delay per spike in the range of microseconds). The first layer is a collection of spiking feature detectors, typically for oriented segments. So, as soon as a few retina pixels provide enough spatially aligned events (say 5 to 10 spikes representing an oriented edge), a neuron in the first layer that detects this oriented edge at that particular location will fire. The second layer groups oriented edges to form more complex shapes, and so on, layer after layer, until full object recognition. But no layer needs to wait for a full frame to be processed. Each neuron, independently, signals the presence of its feature. This way, recognition is possible while the sensor is still providing spikes, and it is (theoretically) possible to adjust parameters so that each neuron fires just one spike. If you look at the bottom part of the figure (event-based processing), all layers are operating almost simultaneously. This is very interesting for processing with feedback, as in recurrent systems.
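
The timing argument can be written out with the numbers assumed in the figure description (1 ms acquisition and five 1 ms stages for the frame-based pipeline; the per-spike stage delay of a few microseconds for the event-based one is my own assumed value):

ACQUISITION_MS = 1.0      # frame acquisition time assumed in the figure
N_STAGES = 5              # processing stages
STAGE_FRAME_MS = 1.0      # per-stage frame-processing time (frame-based)
STAGE_SPIKE_MS = 0.005    # per-spike stage delay, ~5 microseconds (assumed)

# Frame-based: every stage waits for the previous one to finish a whole frame.
frame_based_latency = ACQUISITION_MS + N_STAGES * STAGE_FRAME_MS
print(frame_based_latency)     # 6.0 ms until "recognition"

# Event-based: each stage only adds its per-spike delay, so all layers run
# almost simultaneously ("pseudo-simultaneity").
event_based_latency = N_STAGES * STAGE_SPIKE_MS
print(event_based_latency)     # ~0.025 ms after the first informative spikes arrive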

Feedback processing in a frame-based vision system would require sending an image back to a previous stage, adding/combining it with the feedforward-flowing one, and iterating until convergence.

However, in a spiking system, events flow naturally as they occur, and those flowing backwards are combined with those flowing forward naturally. One just needs to ensure stability.

I hope you find this interesting and constructive.
