Optimizing Models for Machine Learning | ADAM's Algorithm

Machine Learning and AI Academy
#MachineLearning #MachineLearningAlgorithms #DataScience #womeninmachinelearning #AI #empowerment #education #womeninAI #mathformachinelearning #math #deeplearning #deepneuralnetworks #convnets

We present the first lecture from the optimisation series, detailing various aspects of loss functions (e.g., convex and non-convex types), telling the story of ADAM, and presenting the algorithm. The lecture then continues with a broad proof sketch of convergence to a stationary point and presents the initial steps needed. This is part I of a 2 (or 3) part series discussing ADAM and its variants. In the next lecture, which will also be uploaded this week, we will review what has been detailed here, finalise the proof, and attempt an empirical implementation (we might reserve a separate video for that to ensure comparisons across ADAM's variants).

Main topics we cover in this lecture include:

1. Types of loss functions: Convex vs. Non-Convex
2. A brief survey of the types of points one would hope to converge to in the non-convex case
3. A brief survey of the types of optimisation algorithms (i.e., zeroth-, first-, and second-order)
4. Telling the story of ADAM by following the literature from its original proposal to its later improvements
5. Explaining how ADAM works, based on the 2018 NeurIPS paper by Zaheer et al. (a minimal sketch of the update rule follows this list)
6. Commencing the convergence proof of the algorithm by presenting: 1) the proof sketch and 2) L-smoothness bounds on the loss function (the starting bound is written out after the closing paragraph below)
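For reference before watching, here is a minimal NumPy sketch of a single ADAM step as described in the original Kingma and Ba paper; the function name, the way the moment estimates are threaded through, and the defaults shown (which follow the paper's suggested values) are our own illustrative choices, not code taken from the lecture or its slides.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update of parameters `theta` given the stochastic gradient `grad`.

    `m` and `v` are the running first- and second-moment estimates, and `t`
    is the 1-based step counter. Returns the updated (theta, m, v).
    """
    # Exponential moving averages of the gradient and its elementwise square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction compensates for initialising m and v at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-coordinate step: the effective learning rate shrinks where the
    # squared-gradient history is large.
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

A training loop would initialise m and v to zeros of the same shape as theta and call this once per minibatch with t = 1, 2, 3, ...; the variants discussed in the papers referenced below largely differ in how v is accumulated.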

Of course, we will detail the remaining proof in the 2nd part and demonstrate implementations and empirical behaviour. We hope, however, that this gives a nice introduction to the field, allowing us to build the knowledge needed to dig deeper.
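For reference, the two ingredients named in topic 6 are standard; written in our own notation (the lecture's symbols may differ), the non-convex goal is an approximate stationary point, and the L-smoothness assumption gives the quadratic upper bound that such proofs start from:

```latex
% Goal in the non-convex case: an (approximate) stationary point,
% i.e. a point x whose gradient norm is small.
\[
\|\nabla f(x)\| \le \epsilon .
\]
% L-smoothness of the loss f (the gradient is L-Lipschitz):
\[
\|\nabla f(x) - \nabla f(y)\| \le L \,\|x - y\| \qquad \text{for all } x, y,
\]
% which implies the descent lemma; for an update x_{t+1} = x_t + \Delta_t,
\[
f(x_{t+1}) \le f(x_t) + \langle \nabla f(x_t),\, \Delta_t \rangle + \frac{L}{2}\,\|\Delta_t\|^2 .
\]
```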

Some Paper References on ADAM used in this lecture:
Adaptive Methods for Non-Convex Optimisation (Main Paper): https://papers.nips.cc/paper/8186-ada...
On the Convergence of ADAM and Beyond (Proof Correcting Paper): https://arxiv.org/pdf/1904.09237.pdf
ADAM: a Method for Stochastic Optimisation (Original ADAM approach): https://arxiv.org/abs/1412.6980

Other interesting Optimisation Books worth reading:
Convex Analysis and Optimisation (Book by Bertsekas and Colleagues): http://www.athenasc.com/convexity.html
Convex Optimisation (Book by Boyd and Vandenberghe): https://web.stanford.edu/~boyd/cvxbook/
We also follow the awesome Michael Jordan for amazing papers on optimisation (https://scholar.google.com/citations?...). Please check his team's papers if interested, and have a look at the references to other amazing researchers therein.

Also, we do not own most of the figures in the slides. Here are the links (other figures were hand-made by rearranging or changing the style of those from the links below):
Conv. Net: https://leonardoaraujosantos.gitbooks...
RNN: https://towardsdatascience.com/unders...
Alpha-Go: https://dylandjian.github.io/alphago-...
MuJoCo (Re-Arranged): https://github.com/DartEnv/dart-env/w...
Flying Mario: https://www.amazon.co.uk/CARRERA-RC-3...
Classification: https://helloacm.com/a-short-introduc...
Non-Negative Matrix Factorisation: Twitter: 1163211758821097472
Convex with Minima depiction: https://www.semanticscholar.org/paper...
GP Fig: https://www.researchgate.net/figure/I...
Paraboloid: https://en.wikipedia.org/wiki/Paraboloid
Dirichlet Function: http://mathworld.wolfram.com/Dirichle...
Non-Convex Function: https://plus.maths.org/content/convexity

We also thank the Reddit ML community, who guided us in improving our slides and gave us valuable advice on topics to consider. Please make sure to join: r/MachineLearning

Finally, we would like to say that we will split this channel's content into short videos for broader audiences and in-depth detailed ones. Thanks! Please make sure to like, share and subscribe (https://www.seevid.ir/fa/result?ytch=UC4lM...). See you soon!