Interdisciplinary Artificial Intelligence Lecture Series, No. 16: Solving the Vanishing/Exploding Gradients Problem via High-Dimensional Probability Theory

Published: 2023-09-27

Speaker: Lu Yao (路遥)

Postdoctoral researcher

Peking University

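Host: Prof. Zhouchen Lin (林宙辰)

School of Intelligence Science and Technology and Institute for Artificial Intelligence, Peking University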

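Time: 2023/10/12 10:00 - 11:00

Venue: Room 115, Teaching Building, Changping Campus, Peking University / Room 2736, Science Building No. 2, Yanyuan Campus, Peking University

Tencent Meeting: 440-684-733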

Title: Solving the Vanishing/Exploding Gradients Problem via High-Dimensional Probability Theory

Abstract:

The problem of vanishing and exploding gradients is a long-standing obstacle to the effective training of neural networks. Although various tricks and techniques have been employed to alleviate the problem in practice, satisfactory theories and provable solutions are still lacking. In this paper, we address the problem from the perspective of high-dimensional probability theory. We provide a rigorous result showing that, under mild conditions, the vanishing/exploding gradients problem disappears with high probability if the neural network is sufficiently wide. Our main idea is to constrain both forward and backward signal propagation in a nonlinear neural network through orthogonal weight matrices and a new class of activation functions, namely Gaussian-Poincaré normalized functions. Experiments on both synthetic and real-world data validate our theory and confirm its effectiveness on very deep neural networks when applied in practice.
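
The two ingredients named in the abstract can be explored numerically. The following is a minimal sketch, not the speaker's implementation. It assumes, as the announcement itself does not state, that a Gaussian-Poincaré normalized function φ is one satisfying E[φ(x)²] = 1 and E[φ'(x)²] = 1 for x ~ N(0, 1). The helper names (`random_orthogonal`, `gaussian_moments`, `norm_ratio`) and the width and depth values are hypothetical, and the propagation check uses a purely linear chain for simplicity.

```python
import numpy as np


def random_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs using diag(r) so the sample does not depend on
    # the sign convention of the QR routine; the result stays orthogonal.
    return q * np.sign(np.diag(r))


def gaussian_moments(phi, dphi, degree=60):
    """Return (E[phi(x)^2], E[phi'(x)^2]) for x ~ N(0, 1).

    Uses Gauss-Hermite quadrature (probabilists' convention).
    The two target moments are an assumed reading of "Gaussian-Poincare
    normalized"; the announcement does not give the definition.
    """
    nodes, weights = np.polynomial.hermite_e.hermegauss(degree)
    weights = weights / np.sqrt(2.0 * np.pi)  # normalize to the N(0, 1) density
    second = float(weights @ (phi(nodes) ** 2))
    deriv = float(weights @ (dphi(nodes) ** 2))
    return second, deriv


def norm_ratio(matrices, x):
    """Ratio ||W_L ... W_1 x|| / ||x|| for a deep linear chain (no activation)."""
    h = x
    for w in matrices:
        h = w @ h
    return float(np.linalg.norm(h) / np.linalg.norm(x))


rng = np.random.default_rng(0)
width, depth = 64, 200  # illustrative sizes only
x = rng.standard_normal(width)

orth = [random_orthogonal(width, rng) for _ in range(depth)]
gauss = [rng.standard_normal((width, width)) / np.sqrt(width) for _ in range(depth)]

orth_ratio = norm_ratio(orth, x)    # orthogonal maps preserve the Euclidean norm
gauss_ratio = norm_ratio(gauss, x)  # scaled Gaussian maps preserve it only on average

identity_moments = gaussian_moments(lambda t: t, lambda t: np.ones_like(t))
tanh_moments = gaussian_moments(np.tanh, lambda t: 1.0 - np.tanh(t) ** 2)

print("orthogonal chain norm ratio:", orth_ratio)
print("Gaussian chain norm ratio:  ", gauss_ratio)
print("identity moments:", identity_moments)
print("tanh moments:    ", tanh_moments)
```

Under the assumed definition, the identity map meets both moment conditions exactly, while plain tanh falls short of 1 on both, which suggests why a standard activation would need to be rescaled or shifted before it could qualify. The orthogonal chain keeps the signal norm fixed at every depth, whereas the Gaussian chain's norm may drift as depth grows.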

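Speaker bio:

Lu Yao is currently a postdoctoral researcher at the School of Psychological and Cognitive Sciences, Peking University, and received a PhD in computer science from the Australian National University in 2021. Lu's research focuses on learning algorithms for neural networks.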
