Advanced Machine Learning: Online Learning and Optimization, Course Details

Course number 04833340 Credits 2
English title Advanced Machine Learning: Online Learning and Optimization
Prerequisites
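Chinese description Designing automated systems that adapt to changes in their environment has long been an important goal of computer science and engineering. In some situations the external environment is too complex to model directly, and the best approach is then a robust one: continuously optimizing the system through interaction with the environment. Online learning, a subfield of machine learning, provides a solid theoretical foundation for such problems. This course mainly covers online learning, including its basic techniques and ideas. We will also discuss its connections and applications to other areas of machine learning, and how such techniques can be used to design efficient large-scale optimization methods.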
English description Designing autonomous systems that can adapt to their environments is arguably one of the most important goals in computer science and engineering. In some cases the environment is too complex to be modeled, and the best option is to take a robust approach: continuously optimizing the system as it interacts with its environment. Online learning, a subfield of machine learning, provides the theoretical foundations for solving such problems. The course will provide an introduction to online learning, covering the basic techniques and ideas. We will also discuss its connections and applications to other areas of machine learning, as well as how the same techniques lead to efficient methods for large-scale optimization.
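Department School of Information Science and Technology
General elective category
Counts toward arts and aesthetic education
Platform course nature
Platform course type
Language of instruction English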
Textbooks Introduction to Online Convex Optimization, E. Hazan, 2016;
Online learning with predictable sequences, A. Rakhlin and K. Sridharan;
Relax and randomize: from value to algorithms, A. Rakhlin, O. Shamir, and K. Sridharan;
Prediction, Learning, and Games, N. Cesa-Bianchi and G. Lugosi, Cambridge University Press, 2006;
Online learning and online convex optimization, S. Shalev-Shwartz, 2012;
Adaptive subgradient methods for online learning and stochastic optimization, J. Duchi, E. Hazan, and Y. Singer;
Achieving all with no parameters: AdaNormalHedge, H. Luo and R. E. Schapire;
A unified modular analysis of online and stochastic optimization: Adaptivity, optimism, non-convexity, P. Joulani, A. Gyorgy, and Cs. Szepesvari;
MetaGrad: Multiple learning rates in online learning, T. van Erven and W. M. Koolen;
Online learning, A. Gyorgy, D. Pal, and Cs. Szepesvari;
Coin Betting and Parameter-Free Online Learning, F. Orabona and D. Pal;
Regret analysis of stochastic and nonstochastic multi-armed bandit problems, S. Bubeck and N. Cesa-Bianchi;
A Proximal Stochastic Gradient Method with Progressive Variance Reduction, L. Xiao and T. Zhang;
(Bandit) convex optimization with biased noisy gradient oracles, X. Hu, L. A. Prashanth, A. Gyorgy, and Cs. Szepesvari;
On the complexity of best arm identification in multi-armed bandit models, E. Kaufmann, O. Cappe, and A. Garivier;
Bandit based Monte-Carlo planning, L. Kocsis and Cs. Szepesvari
Reference books
Course outline Online learning and optimization

Evaluation: final exam


Syllabus


1. Introduction
2. A warmup example

Expert framework
3. Mistake bounds for the zero-one loss
4. Continuous predictions and convex losses
5. Randomized exponential weights algorithm
6. Lower bounds for prediction with expert advice
7. Follow the perturbed leader
8-9. Large expert classes with structure
10. Boosting
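
As a rough illustration of the expert framework topics listed above, the following is a minimal sketch of the exponential weights (Hedge) forecaster. It assumes losses in [0, 1] and a fixed learning rate; the function name `hedge`, the toy loss sequence, and the tuning of `eta` are illustrative assumptions and are not taken from the course materials.

```python
import math

def hedge(loss_rounds, eta):
    """Run exponential weights over a sequence of per-expert loss vectors.

    loss_rounds[t][i] is the loss in [0, 1] of expert i at round t.
    eta is the learning rate (> 0).
    Returns (expected loss of the learner, cumulative loss of each expert).
    """
    n = len(loss_rounds[0])
    cum = [0.0] * n          # cumulative loss of each expert so far
    learner_loss = 0.0
    for losses in loss_rounds:
        # Weights proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials well scaled.
        m = min(cum)
        w = [math.exp(-eta * (c - m)) for c in cum]
        total = sum(w)
        p = [wi / total for wi in w]
        # Expected loss when an expert is drawn from the distribution p.
        learner_loss += sum(pi * li for pi, li in zip(p, losses))
        cum = [c + l for c, l in zip(cum, losses)]
    return learner_loss, cum

# Toy sequence (assumed): expert 0 is always right, expert 1 is always
# wrong, expert 2 alternates between right and wrong.
T = 100
rounds = [[0.0, 1.0, float(t % 2)] for t in range(T)]
eta = math.sqrt(8 * math.log(3) / T)      # standard tuning for a known horizon T
learner_loss, cum = hedge(rounds, eta)
regret = learner_loss - min(cum)
bound = math.sqrt(T * math.log(3) / 2)    # classical regret bound for this tuning
print(regret <= bound)
```

With this tuning the expected regret against the best expert is guaranteed to be at most sqrt(T ln N / 2), which is the quantity the last lines compare against.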

Online convex optimization
11. Continuous exponential weights
12. Follow the regularized leader
13. Mirror descent
14. Lower bounds
15. Linear classification
16. Linear least squares
17-18. Fast rates and adaptivity
19. Connection to statistical learning theory
20-21. Learnability, from value to algorithms
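
To give a flavor of the online convex optimization topics listed above, here is a minimal sketch of projected online gradient descent, which can be viewed as mirror descent with a Euclidean regularizer. The helper names, the toy squared losses, and the constants `D` and `G` are assumptions chosen for illustration, not something specified by the course.

```python
import math

def project_to_ball(x, radius):
    """Euclidean projection of x onto the origin-centred ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]

def online_gradient_descent(gradients, dim, radius, scale):
    """Projected online gradient descent with step sizes scale / sqrt(t).

    gradients[t](x) returns the gradient of the round-t loss at x.
    Returns the list of points played, one per round.
    """
    x = [0.0] * dim
    played = []
    for t, grad in enumerate(gradients, start=1):
        played.append(x)                 # commit to x before seeing the loss
        g = grad(x)
        eta = scale / math.sqrt(t)
        x = project_to_ball([xi - eta * gi for xi, gi in zip(x, g)], radius)
    return played

# Toy problem (assumed): squared losses f_t(x) = (x - z_t)^2 on [-1, 1].
T = 300
targets = [1.0 if t % 3 == 0 else 0.0 for t in range(T)]
gradients = [lambda x, z=z: [2.0 * (x[0] - z)] for z in targets]

D, G = 2.0, 4.0  # diameter of [-1, 1] and a bound on |f_t'(x)| over the domain
played = online_gradient_descent(gradients, dim=1, radius=1.0, scale=D / G)

learner_loss = sum((x[0] - z) ** 2 for x, z in zip(played, targets))
best_fixed = sum(targets) / T            # minimiser of the cumulative squared loss
comparator_loss = sum((best_fixed - z) ** 2 for z in targets)
regret = learner_loss - comparator_loss
print(regret, 1.5 * G * D * math.sqrt(T))
```

The second printed number is the standard (3/2) G D sqrt(T) regret guarantee for step sizes D / (G sqrt(t)); the measured regret on this toy sequence should fall below it.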

Partial monitoring
22. Multi-armed bandits
23. Lower bounds
24. Linear bandits
25. Bandit convex optimization
26. Optimization for machine learning and variance reduction methods
27. Stochastic multi-armed bandits
28. Best arm identification
29. Monte-Carlo tree search & games
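Classroom lectures, literature reading
In-class presentations, exam
Teaching evaluation Wang Liwei (王立威):
Academic year and term: 16-17-3, class: Advanced Machine Learning: Online Learning and Optimization 1, course recommendation score: 4.06, instructor recommendation score: 4.17, course grade band: 85-90;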