About me
I am currently an IFDS postdoctoral scholar at the University of Washington, working with Dmitriy Drusvyatskiy and Maryam Fazel. I received my PhD in Computer Science from UCSD, where I was fortunate to be advised by Mikhail Belkin. Prior to UCSD, I received my BS in Mathematics from Zhejiang University.
I am broadly interested in the optimization and mathematical foundations of deep learning. I have worked on understanding the dynamics of wide neural networks, particularly in the NTK [Jacot et al. 2018] regime. Recently, I have been interested in the behavior of neural networks trained with large learning rates, where phenomena such as the catapult phase [Lewkowycz et al. 2020] and the edge of stability [Cohen et al. 2021] occur. These phenomena do not arise in the NTK regime and appear to be related to feature learning in neural networks.
Email: libinzhu at uw dot edu
Selected Papers
- N Mallinar, D Beaglehole, L Zhu, A Radhakrishnan, P Pandit, M Belkin, Emergence in non-neural models: grokking modular arithmetic via average gradient outer product. Preprint.
- L Zhu, C Liu, A Radhakrishnan, M Belkin, Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning. ICML 2024.
- L Zhu, C Liu, A Radhakrishnan, M Belkin, Quadratic models for understanding catapult dynamics of neural networks. ICLR 2024.
- L Zhu, C Liu, M Belkin, Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture. NeurIPS 2022.
- C Liu, L Zhu, M Belkin, On the linearity of large non-linear models: when and why the tangent kernel is constant. NeurIPS 2020 (Spotlight).
- C Liu, L Zhu, M Belkin, Loss landscapes and optimization in over-parameterized non-linear systems and neural networks. Applied and Computational Harmonic Analysis (ACHA) 2022.