Session 1: Introduction to Deep Learning
Speakers:
Session 2: Language Modeling and Inverse Problems
Speakers:
How can we model language? From word vectors to RNNs to LSTMs. What is the expressive power of an RNN, and what are its limitations? What motivated the creation of LSTM and GRU? We will walk through PyTorch implementations of RNN, GRU, and LSTM, and ask what else can be improved.
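As a preview, here is a minimal sketch (not the session's own code) of how the three recurrent backbones are built in PyTorch; the vocabulary size, dimensions, and toy batch are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """A toy language-model backbone: embedding -> recurrent cell -> vocab logits."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128, cell="lstm"):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        rnn_cls = {"rnn": nn.RNN, "gru": nn.GRU, "lstm": nn.LSTM}[cell]
        self.rnn = rnn_cls(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):               # tokens: (batch, seq_len) int64
        h, _ = self.rnn(self.embed(tokens))  # h: (batch, seq_len, hidden_dim)
        return self.head(h)                  # logits over the vocabulary

tokens = torch.randint(0, 1000, (8, 32))     # toy batch of token ids
for cell in ["rnn", "gru", "lstm"]:
    print(cell, TinyLM(cell=cell)(tokens).shape)  # (8, 32, 1000) for each cell
```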
An introduction to Implicit Neural Representation (INR) methods and their applications, the basic concepts of physics-informed neural networks (PINNs), and low-rank tensor function decomposition. Finally, we will explore applications of INR to inverse problems.
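To fix ideas, here is a minimal sketch of an INR, assuming a toy 1-D signal and a small coordinate MLP (these choices are illustrative, not the session's code): a network maps coordinates to signal values, giving a continuous representation that inverse-problem methods can query anywhere.

```python
import torch
import torch.nn as nn

inr = nn.Sequential(                          # coordinate -> value network
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

x = torch.linspace(-1, 1, 256).unsqueeze(1)   # sampled coordinates
y = torch.sin(3.14159 * x)                    # toy signal to represent implicitly

opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
for step in range(2000):                      # fit the continuous representation
    loss = ((inr(x) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# After fitting, inr(x_query) evaluates the signal at arbitrary coordinates.
```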
Session 3: Computer Vision Fundamentals
Speakers:
An introduction to the basics of Convolutional Neural Networks (CNNs) for beginners, covering the hierarchical structure of CNNs, basic concepts, C++ and Python implementations of the convolution operation, and an analysis of the classic models LeNet and AlexNet. If time permits, we will also discuss how to design lightweight CNN architectures. The seminar will be intuitive and easy to follow, avoiding heavy theory in favor of animations and examples, and is suitable for beginners interested in CNNs as well as developers who want to quickly pick up the fundamentals.
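For a taste of the convolution operation itself, here is a minimal direct implementation in Python (a sketch with simplified "valid" padding and stride 1; the edge-detection kernel is an illustrative assumption). As in most deep-learning frameworks, it is technically a cross-correlation.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image and take a weighted sum at each position."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

img = np.random.rand(6, 6)
edge = np.array([[1., 0., -1.],
                 [1., 0., -1.],
                 [1., 0., -1.]])        # simple vertical-edge filter
print(conv2d(img, edge).shape)          # (4, 4): "valid" convolution output
```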
Starting from experimental observations, we will explore the problems that arise as networks deepen, and naturally introduce a new structure: the residual connection. We will then offer several perspectives on why residual connections succeed and why they are credited with enabling today's flourishing of deep learning, including forward information flow, backward information flow, dynamical systems, and others.
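The structure itself is small; here is a minimal PyTorch sketch of a residual block (the conv-BN-ReLU layout is a common simplification, not any specific paper's architecture). The identity path lets both activations and gradients bypass the block, which is the intuition behind the forward/backward information-flow views.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # Identity shortcut: output = x + F(x), so the block only learns a residual.
        return torch.relu(x + self.body(x))

x = torch.randn(2, 16, 8, 8)
print(ResidualBlock(16)(x).shape)   # (2, 16, 8, 8)
```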
Session 4: Transformers and Beyond
Speakers:
We will analyze the Transformer model in depth, from the basic attention mechanism to its application in computer vision as the Vision Transformer (ViT). We will discuss how linear attention accelerates computation and examine the role of the softmax. Finally, we will introduce the emerging Mamba model and how it performs on related tasks: is it "MambaOut", or the start of a new trend?
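As background, here is a minimal sketch of scaled dot-product attention, the building block the session starts from (shapes and dimensions are illustrative assumptions). Linear-attention variants replace softmax(QK^T)V with kernelized forms such as phi(Q)(phi(K)^T V), reducing the cost from quadratic to linear in sequence length.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # q, k, v: (batch, seq_len, d); softmax over keys yields a weighted sum of values
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 10, 64)
print(attention(q, k, v).shape)   # (2, 10, 64)
```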
Session 5: Neural Network Initialization
Speakers:
How does the scale of weight initialization affect the behavior of a neural network? In this session we will share research results from Professor Zhiqin Xu's group on this question. "Phase diagram analysis" characterizes the training dynamics of neural networks under different initialization regimes; the "condensation phenomenon" describes how neuron weights cluster during training, which helps improve generalization; and the "embedding principle of loss landscapes" explains how wide networks contain the critical points of narrower networks, guiding models to converge to low-complexity solutions. Together these results deepen our understanding of neural network training and generalization.
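To make the knob concrete, here is a minimal sketch (an assumption for illustration, not the team's code) of rescaling the standard deviation of the initialization by a factor gamma, the kind of parameter such phase-diagram analyses vary.

```python
import torch
import torch.nn as nn

def init_scaled(model, gamma=1.0):
    # gamma rescales the usual 1/sqrt(fan_in) initialization; sweeping it
    # moves the network between qualitatively different training regimes.
    for p in model.parameters():
        if p.dim() > 1:                       # weight matrices only, skip biases
            nn.init.normal_(p, std=gamma / p.shape[1] ** 0.5)
    return model

net = init_scaled(nn.Sequential(nn.Linear(1, 100), nn.Tanh(), nn.Linear(100, 1)),
                  gamma=0.1)
print(net[0].weight.std())   # roughly gamma / sqrt(fan_in)
```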
Session 6: Understanding Diffusion Models
Speakers:
This session will explore and derive the mathematical principles of diffusion models from multiple perspectives, focusing on how DDPM (Denoising Diffusion Probabilistic Models), score matching, stochastic differential equations (SDEs), ordinary differential equations (ODEs), and related methods help us understand the intrinsic mechanisms of the diffusion process. Toward the end, we will examine physical equations related to diffusion models, in particular the Fokker-Planck equation and the Langevin equation, and analyze how they describe the diffusion process and its relationship to thermodynamic systems. Through these equations we will further deepen our understanding of the theoretical foundations of diffusion. We will also discuss the latest research directions for diffusion models, as well as open questions about diffusion.
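As a concrete anchor, here is a minimal sketch of the DDPM forward (noising) process in its closed form, x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps; the linear beta schedule and tensor shapes are illustrative assumptions.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # noise schedule beta_1..beta_T
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative product of (1 - beta_t)

def q_sample(x0, t):
    """Sample x_t from q(x_t | x_0) in one step using the closed form."""
    eps = torch.randn_like(x0)
    a = alphas_bar[t]
    return a.sqrt() * x0 + (1 - a).sqrt() * eps, eps

x0 = torch.randn(4, 3, 32, 32)                  # stand-in for a batch of images
xt, eps = q_sample(x0, t=500)
print(xt.shape)   # (4, 3, 32, 32); the denoising network is trained to predict eps
```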
Session 7: PAC Learning Framework
Speakers:
A comprehensive introduction to the PAC learning framework and its extensions, starting from basic concepts such as generalization error, empirical error, and PAC learnability. Through concrete examples, we analyze sample complexity and algorithm performance, and discuss the Bayes error rate, noise, and the decomposition into estimation error and approximation error in agnostic PAC learning. We then introduce optimization strategies such as structural risk minimization and regularization, aiming to provide systematic guidance for understanding both the theoretical foundations and practical application of learning algorithms.
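As a small worked example, here is the standard sample-complexity bound for a finite hypothesis class in the realizable PAC setting, m >= (1/eps)(ln|H| + ln(1/delta)); the numbers plugged in below are illustrative assumptions.

```python
import math

def pac_sample_bound(hypothesis_count, eps, delta):
    # Number of samples sufficient so that, with probability at least 1 - delta,
    # empirical risk minimization returns a hypothesis with true error at most eps.
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / eps)

print(pac_sample_bound(hypothesis_count=10**6, eps=0.05, delta=0.01))  # 369
```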