Forward-Forward algorithm by Geoffrey Hinton

I would like to initiate a discussion on Geoffrey Hinton's recent publication proposing an alternative to traditional backpropagation, "The Forward-Forward Algorithm: Some Preliminary Investigations", and on the paper by Alexander Ororbia and Ankur Mali, "The Predictive Forward-Forward Algorithm", which suggests incorporating a generative circuit into the original FF network.

I am interested in hearing the community's thoughts and insights on these papers. I am particularly interested in discussing the potential benefits of layer-level weight updates in the Forward-Forward algorithm, since they could allow training a network layer by layer without needing a huge amount of VRAM.
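To make the discussion concrete, here is a minimal NumPy sketch of that layer-local idea: each layer is trained with its own goodness-based objective (sum of squared activations pushed above a threshold for positive data, below it for negative data), so no gradient ever flows between layers. The class name, hyperparameters, and the softplus-style loss gradient are my own illustrative choices, not code from the paper:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FFLayer:
    """One layer trained with a local Forward-Forward-style objective.
    Names and hyperparameters are illustrative, not from the paper."""
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 1.0 / np.sqrt(d_in), (d_in, d_out))
        self.b = np.zeros(d_out)
        self.threshold = threshold
        self.lr = lr

    def forward(self, x):
        # Length-normalize the input so only its direction carries
        # information to this layer (as the FF paper suggests).
        xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return relu(xn @ self.W + self.b), xn

    def train_step(self, x_pos, x_neg):
        grads_W = np.zeros_like(self.W)
        grads_b = np.zeros_like(self.b)
        outs = []
        for x, sign in ((x_pos, +1.0), (x_neg, -1.0)):
            h, xn = self.forward(x)
            g = (h ** 2).sum(axis=1)          # goodness per sample
            # Gradient of softplus loss w.r.t. goodness: pushes positive
            # goodness above the threshold and negative goodness below it.
            dg = -sign * sigmoid(sign * (self.threshold - g))
            dh = 2.0 * h * dg[:, None]        # chain rule through g = sum h^2
            dpre = dh * (h > 0)               # ReLU gate
            grads_W += xn.T @ dpre / len(x)
            grads_b += dpre.mean(axis=0)
            outs.append(h)                    # activations (pre-update)
        # The update uses only this layer's activations and gradients,
        # which is where the potential VRAM saving comes from.
        self.W -= self.lr * grads_W
        self.b -= self.lr * grads_b
        return outs
```

Layers would then be trained one at a time, each receiving the (detached) outputs of the previous layer as its data, so only a single layer needs to be resident on the GPU at any moment.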


Implementations found so far:
TensorFlow implementation
PyTorch implementation

Another one:
TensorFlow implementation

I am attempting to build a mini-GPT using the Forward-Forward idea.

I can't find much of anything using it in generative language models, nor any example of the NLP benchmark referenced in the Hinton paper.

If anyone has thoughts or repos applying the Forward-Forward algorithm to that kind of task, it would be very helpful.

The best I have found so far is a few non-working repos.

The implementation of the predictive forward-forward algorithm has been released publicly:
https://github.com/ago109/predictive-forward-forward

Hi,
has anyone tried training deep spiking neural networks using Forward-Forward?

Hello,
Yes, there was work that came out about a month or so ago that proposed a generalization of forward-forward (and predictive forward-forward) for (deep) spiking networks - this was called the event-driven forward-forward algorithm (as they had to craft a formulation that worked with spikes themselves):
https://arxiv.org/abs/2303.18187

An implementation that is more native to PyTorch

I find the idea of high layer activations only for the positive data interesting. The network essentially isn't producing an output as in backpropagation; instead, it is a property of the network to "light up" for correct labels, thereby indicating whether the input is positive data or not. I enjoyed this interview Hinton gave about his paper.
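That "lighting up" property is also how classification works at inference time: overlay each candidate label on the input, run a forward pass, and pick the label with the highest accumulated goodness. Below is a hedged sketch of that procedure; `DummyLayer` and the one-hot overlay in the first pixels are illustrative stand-ins (the overlay position follows the MNIST label-embedding idea in the paper, but the layer API is my own assumption):

```python
import numpy as np

class DummyLayer:
    """Illustrative stand-in for a trained FF layer."""
    def __init__(self, W):
        self.W = W

    def forward(self, x):
        return np.maximum(x @ self.W, 0.0)

def overlay_label(x, labels, num_classes=10):
    # Replace the first `num_classes` inputs with a one-hot label,
    # mimicking the pixel-overlay scheme used for MNIST.
    x = x.copy()
    x[:, :num_classes] = 0.0
    x[np.arange(len(x)), labels] = 1.0
    return x

def predict(layers, x, num_classes=10):
    goodness_per_label = []
    for label in range(num_classes):
        h = overlay_label(x, np.full(len(x), label), num_classes)
        total = np.zeros(len(x))
        for layer in layers:
            h = layer.forward(h)
            total += (h ** 2).sum(axis=1)  # accumulate goodness over layers
        goodness_per_label.append(total)
    # The predicted class is the label that makes the network "light up".
    return np.argmax(np.stack(goodness_per_label, axis=1), axis=1)
```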

Find my notebook implementation based on the work of Mohammad Pezeshki. It's modular, so you can experiment with different candidates for goodness functions, layer-wise loss functions, and negative-data generation.

I am finding it difficult to apply the FF algorithm to convnets. I suspect it might be because the label information overlaid on the input gets diffused too much. Could someone guide me on this? My attempt is uploaded to the repo in my previous response. Thanks!
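One workaround sometimes suggested for exactly this diffusion problem (an assumption on my part, not something from the FF paper) is to encode the label as extra constant feature maps concatenated to the image, so every spatial position sees the label and convolutions cannot wash it out the way a few overwritten pixels can. A minimal sketch:

```python
import numpy as np

def overlay_label_as_channels(images, labels, num_classes=10):
    """Concatenate one constant one-hot feature map per class to the image.

    images: (N, C, H, W) float array; labels: (N,) integer class ids.
    Returns an (N, C + num_classes, H, W) array.
    """
    n, _, h, w = images.shape
    label_maps = np.zeros((n, num_classes, h, w), dtype=images.dtype)
    # Broadcasting fills the entire (H, W) map of the labelled channel.
    label_maps[np.arange(n), labels] = 1.0
    return np.concatenate([images, label_maps], axis=1)
```

Since the label channel is constant across the whole spatial extent, every convolutional receptive field carries the label information at full strength, instead of it being blurred away after a couple of layers.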