We use a connection between compositional kernels and branching processes via Mehler's formula to study deep neural networks. This probabilistic insight provides a novel perspective on the mathematical role of activation functions in compositional neural networks. We study the unscaled and rescaled limits of compositional kernels and explore the different phases of the limiting behavior as the compositional depth increases. We investigate the memorization capacity of compositional kernels and neural networks by characterizing the interplay among compositional depth, sample size, dimensionality, and non-linearity of the activation. Explicit formulas for the eigenvalues of the compositional kernel are provided, which quantify the complexity of the corresponding reproducing kernel Hilbert space. On the methodological front, we propose a new random features algorithm that compresses the compositional layers by devising a new activation function.
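As a rough illustration of the object studied here (not the paper's construction), the sketch below builds a depth-L compositional kernel by iterating the "dual" of an activation function: for a normalized activation sigma, kappa(rho) = E[sigma(X)sigma(Y)] with (X, Y) standard bivariate normal of correlation rho, and the depth-L kernel composes kappa with itself L times starting from the inner product of unit-norm inputs. The function names, the Monte Carlo estimator of the dual, and the normalized-ReLU example are illustrative assumptions, not the authors' algorithm.

```python
# Minimal sketch of a compositional kernel via the dual activation
# (illustrative only; assumes unit-norm inputs and a normalized activation).
import numpy as np

def dual_activation(sigma, rho, n_samples=200_000, seed=0):
    """Monte Carlo estimate of kappa(rho) = E[sigma(X) sigma(Y)],
    where (X, Y) are standard bivariate normal with correlation rho."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)
    z = rng.standard_normal(n_samples)
    y = rho * x + np.sqrt(max(1.0 - rho**2, 0.0)) * z
    return np.mean(sigma(x) * sigma(y))

def compositional_kernel(sigma, rho, depth):
    """Iterate the dual activation `depth` times, starting from rho = <x, y>.
    With a normalized activation (E[sigma(X)^2] = 1), rho = 1 stays a fixed point."""
    for _ in range(depth):
        rho = dual_activation(sigma, rho)
    return rho

# Example: ReLU rescaled so that E[sigma(X)^2] = 1 for X ~ N(0, 1).
relu = lambda t: np.sqrt(2.0) * np.maximum(t, 0.0)
for L in (1, 2, 4, 8):
    print(f"depth {L}: kernel value {compositional_kernel(relu, rho=0.5, depth=L):.4f}")
```

Running the example shows how the kernel value evolves with depth for a fixed input correlation, which is the kind of depth-dependent limiting behavior the paper analyzes.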
