Session 1: Introduction to Deep Learning
Speakers:
Session 2: Language Modeling and Inverse Problems
Speakers:
How can we model language? From word vectors to RNNs to LSTMs. What is the expressive power of an RNN, and what are its limitations? What motivated the creation of LSTM and GRU? We will walk through PyTorch implementations of RNN, GRU, and LSTM, and ask what else can be improved.
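As a preview, here is a minimal sketch (not the session's own code) of how the three recurrent backbones are built in PyTorch; the vocabulary size, dimensions, and toy batch are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """A toy language-model backbone: embedding -> recurrent cell -> vocab logits."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128, cell="lstm"):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        rnn_cls = {"rnn": nn.RNN, "gru": nn.GRU, "lstm": nn.LSTM}[cell]
        self.rnn = rnn_cls(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):               # tokens: (batch, seq_len) int64
        h, _ = self.rnn(self.embed(tokens))  # h: (batch, seq_len, hidden_dim)
        return self.head(h)                  # logits over the vocabulary

tokens = torch.randint(0, 1000, (8, 32))     # toy batch of token ids
for cell in ["rnn", "gru", "lstm"]:
    print(cell, TinyLM(cell=cell)(tokens).shape)  # (8, 32, 1000) for each cell
```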
An introduction to Implicit Neural Representation (INR) methods and their applications, the basic concepts of physics-informed neural networks (PINNs), and low-rank tensor function decomposition. Finally, we will explore applications of INR to inverse problems.
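To fix ideas, here is a minimal sketch of an INR, assuming a toy 1-D signal and a small coordinate MLP (these choices are illustrative, not the session's code): a network maps coordinates to signal values, giving a continuous representation that inverse-problem methods can query anywhere.

```python
import torch
import torch.nn as nn

inr = nn.Sequential(                          # coordinate -> value network
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

x = torch.linspace(-1, 1, 256).unsqueeze(1)   # sampled coordinates
y = torch.sin(3.14159 * x)                    # toy signal to represent implicitly

opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
for step in range(2000):                      # fit the continuous representation
    loss = ((inr(x) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# After fitting, inr(x_query) evaluates the signal at arbitrary coordinates.
```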
Session 3: Computer Vision Fundamentals
Speakers:
An introduction to the basics of Convolutional Neural Networks (CNNs) for beginners, covering the hierarchical structure of CNNs, basic concepts, C++ and Python implementations of the convolution operation, and an analysis of the classic models LeNet and AlexNet. If time permits, we will also discuss how to design lightweight CNN architectures. The seminar will be intuitive and easy to follow, avoiding heavy theory in favor of animations and examples, and is suitable for beginners interested in CNNs as well as developers who want to quickly pick up the fundamentals.
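For a taste of the convolution operation itself, here is a minimal direct implementation in Python (a sketch with simplified "valid" padding and stride 1; the edge-detection kernel is an illustrative assumption). As in most deep-learning frameworks, it is technically a cross-correlation.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image and take a weighted sum at each position."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

img = np.random.rand(6, 6)
edge = np.array([[1., 0., -1.],
                 [1., 0., -1.],
                 [1., 0., -1.]])        # simple vertical-edge filter
print(conv2d(img, edge).shape)          # (4, 4): "valid" convolution output
```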
Starting from experimental observations, we will explore the problems that arise as networks deepen, and naturally introduce a new structure: the residual connection. We will then offer several perspectives on why residual connections succeed and why they are credited with enabling today's flourishing of deep learning, including forward information flow, backward information flow, dynamical systems, and others.
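The structure itself is small; here is a minimal PyTorch sketch of a residual block (the conv-BN-ReLU layout is a common simplification, not any specific paper's architecture). The identity path lets both activations and gradients bypass the block, which is the intuition behind the forward/backward information-flow views.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # Identity shortcut: output = x + F(x), so the block only learns a residual.
        return torch.relu(x + self.body(x))

x = torch.randn(2, 16, 8, 8)
print(ResidualBlock(16)(x).shape)   # (2, 16, 8, 8)
```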
Session 4: Transformers and Beyond
Speakers:
We will analyze the Transformer model in depth, from the basic attention mechanism to its application in computer vision as the Vision Transformer (ViT). We will discuss how linear attention accelerates computation and examine the role of the softmax. Finally, we will introduce the emerging Mamba model and how it performs on related tasks: is it "MambaOut", or the start of a new trend?
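As background, here is a minimal sketch of scaled dot-product attention, the building block the session starts from (shapes and dimensions are illustrative assumptions). Linear-attention variants replace softmax(QK^T)V with kernelized forms such as phi(Q)(phi(K)^T V), reducing the cost from quadratic to linear in sequence length.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # q, k, v: (batch, seq_len, d); softmax over keys yields a weighted sum of values
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 10, 64)
print(attention(q, k, v).shape)   # (2, 10, 64)
```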
Session 5: Neural Network Initialization
Speakers:
How does the scale of weight initialization affect the behavior of a neural network? In this session we will share research results from Professor Zhiqin Xu's group on this question. "Phase diagram analysis" characterizes the training dynamics of neural networks under different initialization regimes; the "condensation phenomenon" describes how neuron weights cluster during training, which helps improve generalization; and the "embedding principle of loss landscapes" explains how wide networks contain the critical points of narrower networks, guiding models to converge to low-complexity solutions. Together these results deepen our understanding of neural network training and generalization.
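To make the knob concrete, here is a minimal sketch (an assumption for illustration, not the team's code) of rescaling the standard deviation of the initialization by a factor gamma, the kind of parameter such phase-diagram analyses vary.

```python
import torch
import torch.nn as nn

def init_scaled(model, gamma=1.0):
    # gamma rescales the usual 1/sqrt(fan_in) initialization; sweeping it
    # moves the network between qualitatively different training regimes.
    for p in model.parameters():
        if p.dim() > 1:                       # weight matrices only, skip biases
            nn.init.normal_(p, std=gamma / p.shape[1] ** 0.5)
    return model

net = init_scaled(nn.Sequential(nn.Linear(1, 100), nn.Tanh(), nn.Linear(100, 1)),
                  gamma=0.1)
print(net[0].weight.std())   # roughly gamma / sqrt(fan_in)
```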
Session 6: Understanding Diffusion Models
Speakers:
This session will explore and derive the mathematical principles of diffusion models from multiple perspectives, focusing on how DDPM (Denoising Diffusion Probabilistic Models), score matching, stochastic differential equations (SDEs), ordinary differential equations (ODEs), and related methods help us understand the intrinsic mechanisms of the diffusion process. Toward the end, we will examine physical equations related to diffusion models, in particular the Fokker-Planck equation and the Langevin equation, and analyze how they describe the diffusion process and its relationship to thermodynamic systems. Through these equations we will further deepen our understanding of the theoretical foundations of diffusion. We will also discuss the latest research directions for diffusion models, as well as open questions about diffusion.
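As a concrete anchor, here is a minimal sketch of the DDPM forward (noising) process in its closed form, x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps; the linear beta schedule and tensor shapes are illustrative assumptions.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # noise schedule beta_1..beta_T
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative product of (1 - beta_t)

def q_sample(x0, t):
    """Sample x_t from q(x_t | x_0) in one step using the closed form."""
    eps = torch.randn_like(x0)
    a = alphas_bar[t]
    return a.sqrt() * x0 + (1 - a).sqrt() * eps, eps

x0 = torch.randn(4, 3, 32, 32)                  # stand-in for a batch of images
xt, eps = q_sample(x0, t=500)
print(xt.shape)   # (4, 3, 32, 32); the denoising network is trained to predict eps
```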
Session 7: PAC Learning Framework
Speakers:
A comprehensive introduction to the PAC learning framework and its extensions, starting from basic concepts such as generalization error, empirical error, and PAC learnability. Through concrete examples, we analyze sample complexity and algorithm performance, and discuss the Bayes error rate, noise, and the decomposition into estimation error and approximation error in agnostic PAC learning. We then introduce optimization strategies such as structural risk minimization and regularization, aiming to provide systematic guidance for understanding both the theoretical foundations and practical application of learning algorithms.
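As a small worked example, here is the standard sample-complexity bound for a finite hypothesis class in the realizable PAC setting, m >= (1/eps)(ln|H| + ln(1/delta)); the numbers plugged in below are illustrative assumptions.

```python
import math

def pac_sample_bound(hypothesis_count, eps, delta):
    # Number of samples sufficient so that, with probability at least 1 - delta,
    # empirical risk minimization returns a hypothesis with true error at most eps.
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / eps)

print(pac_sample_bound(hypothesis_count=10**6, eps=0.05, delta=0.01))  # 369
```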