Title: Lifted Proximal Operator Machines
Time: 9:00-9:45 a.m., Saturday, May 23, 2020
Speaker: Prof. Zhouchen Lin
Venue: Zoom meeting (ID: 937 6354 7091, password: nanxinda60)
Abstract: We propose a new optimization method for training feed-forward neural networks. By rewriting the activation function as an equivalent proximal operator, we approximate a feed-forward neural network by adding the proximal operators to the objective function as penalties; hence we call it the lifted proximal operator machine (LPOM). LPOM is block multi-convex in all layer-wise weights and activations, which allows us to use block coordinate descent to update the layer-wise weights and activations. Most notably, we only use the mapping of the activation function itself, rather than its derivative, thus avoiding the gradient vanishing or blow-up issues in gradient-based training methods. Our method is therefore applicable to various non-decreasing Lipschitz continuous activation functions, which can be saturating and non-differentiable. LPOM does not require more auxiliary variables than the layer-wise activations, thus using roughly the same amount of memory as stochastic gradient descent (SGD). Its parameter tuning is also much simpler. We further prove the convergence of updating the layer-wise weights and activations, and propose an asynchronously parallel version of LPOM that is faster than SGD. Experiments on benchmark datasets testify to the advantages of LPOM.
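To make the rewriting concrete, here is a minimal sketch in our own notation, inferred from the abstract (the talk's exact formulation may differ). For a non-decreasing, 1-Lipschitz activation $\phi$ applied entry-wise, the layer recursion is exactly a proximal step:

\[
X^{i} = \phi\big(W^{i-1} X^{i-1}\big)
\;\Longleftrightarrow\;
X^{i} = \operatorname{prox}_{f}\big(W^{i-1} X^{i-1}\big)
= \operatorname*{arg\,min}_{X}\; f(X) + \tfrac{1}{2}\big\|X - W^{i-1} X^{i-1}\big\|_{F}^{2},
\qquad
f(x) = \int_{0}^{x} \big(\phi^{-1}(t) - t\big)\, dt,
\]

where $f$ is convex (its derivative $\phi^{-1}(x) - x$ is non-decreasing) and is summed over the entries of its matrix argument. Adding one such subproblem objective per layer to the training loss as a penalty, rather than enforcing the recursion as a hard constraint, yields an objective that is convex in each weight block $W^{i-1}$ and each activation block $X^{i}$ when the other blocks are fixed, which is what permits the block coordinate descent updates mentioned above.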
All faculty and students are warmly welcome to attend!
School of Mathematics and Statistics
May 22, 2020
Attachment: Speaker biography
Zhouchen Lin (林宙辰) is a professor with the Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University. His research interests include computer vision, image processing, machine learning, pattern recognition, and numerical optimization. He has served as an area chair of CVPR 2014/16/19/20/21, ICCV 2015, NeurIPS 2015/18/19/20, AAAI 2019/20, IJCAI 2020, and ICML 2020. He is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence and International Journal of Computer Vision, a Fellow of IAPR and IEEE, and a recipient of the National Science Fund for Distinguished Young Scholars.