November 2020, Vol. 32, No. 11, Pages 2085-2121
© 2020 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Reverse-Engineering Neural Networks to Characterize Their Cost Functions
Article PDF (683.78 KB)
This letter considers a class of biologically plausible cost functions for neural networks, where the same cost function is minimized by both neural activity and plasticity. We show that such cost functions can be cast as a variational bound on model evidence under an implicit generative model. Using generative models based on partially observed Markov decision processes (POMDP), we show that neural activity and plasticity perform Bayesian inference and learning, respectively, by maximizing model evidence. Using mathematical and numerical analyses, we establish the formal equivalence between neural network cost functions and variational free energy under some prior beliefs about latent states that generate inputs. These prior beliefs are determined by particular constants (e.g., thresholds) that define the cost function. This means that the Bayes optimal encoding of latent or hidden states is achieved when the network's implicit priors match the process that generates its inputs. This equivalence is potentially important because it suggests that any hyperparameter of a neural network can itself be optimized—by minimization with respect to variational free energy. Furthermore, it enables one to characterize a neural network formally, in terms of its prior beliefs.