Gradient Boost Part 4 (of 4): Classification Details

StatQuest with Josh Starmer
At last, part 4 in our series of videos on Gradient Boost. This time we dive deep into the details of how it is used for classification, going through the algorithm, and the math behind it, one step at a time. Specifically, we derive the loss function from the log(likelihood) of the data, and we derive the functions used to calculate the output values from the leaves in each tree. This one is long, but well worth it if you want to know how Gradient Boost works.
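For reference, here is a sketch of the two key results the video derives, written out in LaTeX-style notation (this summary is mine, not part of the original description; y is the observed class, 0 or 1, and p is the predicted probability):

% The loss function, i.e. the negative log(likelihood) of one observation:
L(y, p) = -[\, y \log(p) + (1 - y)\log(1 - p) \,]

% The same loss rewritten in terms of the log(odds), which is what the trees predict:
L(y, \log(odds)) = -y \cdot \log(odds) + \log(1 + e^{\log(odds)})

% The output value for a leaf, summing over the residuals r_i and the previously
% predicted probabilities p_i of the observations that end up in that leaf:
\gamma = \frac{\sum_i r_i}{\sum_i p_i (1 - p_i)}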

NOTE: There is a minor error at 7:01. It should just say log(p) - log(1-p) = log(p/(1-p)). And at 19:10 I forgot to put "L" in front of some of the loss functions. However, it should be clear what they are, since I point to them and say, "This is the loss function".

This StatQuest assumes that you have already watched Parts 1, 2 and 3 in this series:
Part 1, Regression Main Ideas: Gradient Boost Part 1 (of 4): Regress...
Part 2, Regression Details: Gradient Boost Part 2 (of 4): Regress...
Part 3, Classification Main Ideas: Gradient Boost Part 3 (of 4): Classif...

...and it also assumes that you understand odds, the log(odds), and Logistic Regression pretty well (a quick refresher follows the links below). Here are the links for...

The odds: Odds and Log(Odds), Clearly Explained!!!

A general overview of Logistic Regression: StatQuest: Logistic Regression
How to interpret the coefficients: Logistic Regression Details Pt1: Coef...
How to estimate the coefficients: Logistic Regression Details Pt 2: Max...
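
As a quick refresher (my summary, not from the original description): for an event with probability p,

% The odds and the log(odds):
odds = \frac{p}{1 - p}, \qquad \log(odds) = \log\!\left(\frac{p}{1 - p}\right)

Logistic Regression fits the log(odds) with a straight line, and Gradient Boost for classification works on the log(odds) scale for the same reason: unlike p, which is squeezed between 0 and 1, the log(odds) can be any value from -\infty to +\infty.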

Lastly, if you want to learn more about using different probability thresholds for classification, check out the StatQuest on ROC and AUC: THIS VIDEO HAS BEEN UPDATED SEE LINK ...

For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/

This StatQuest is based on the following sources:

A 1999 manuscript by Jerome Friedman that introduced Stochastic Gradient Boost: https://statweb.stanford.edu/~jhf/ftp...

The Wikipedia article on Gradient Boosting: https://en.wikipedia.org/wiki/Gradien...

The scikit-learn implementation of Gradient Boosting: https://scikit-learn.org/stable/modul...

If you'd like to support StatQuest, please consider...

Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - https://statquest.gumroad.com/l/wvtmc
Paperback - https://www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - https://www.amazon.com/dp/B09ZG79HXC

Patreon: statquest
...or...
YouTube Membership: @statquest

...a cool StatQuest t-shirt or sweatshirt:
https://shop.spreadshirt.com/statques...

...buying one or two of my songs (or go large and get a whole album!)
https://joshuastarmer.bandcamp.com/

...or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
Twitter: joshuastarmer

Corrections:
6:58 log(p) - log(1-p) is not equal to log(p)/log(1-p), but is equal to log(p/(1-p)), since log(a) - log(b) = log(a/b). In other words, the result at 7:07, log(p) - log(1-p) = log(odds), is correct, and thus the error does not propagate beyond its short, but embarrassing, moment.
26:53 My indexing of the variables gets off track. This is unfortunate, but you should still be able to follow the concepts.

#statquest #gradientboost