How do I encode categorical features using scikit-learn?

Data School
137.7K views - 5 years ago
In order to include categorical features in your Machine Learning model, you have to encode them numerically using "dummy" or "one-hot" encoding. But how do you do this correctly using scikit-learn?
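
As a rough illustration of what "one-hot" encoding produces, here is a minimal sketch using OneHotEncoder on a single column. The `embarked` values are hypothetical toy data, not the dataset used in the video.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy column (one categorical feature, five rows)
embarked = np.array([["S"], ["C"], ["S"], ["Q"], ["C"]])

# OneHotEncoder returns a sparse matrix by default; convert it for display
encoder = OneHotEncoder()
encoded = encoder.fit_transform(embarked).toarray()

# Categories are learned in sorted order, so the columns are C, Q, S
categories = encoder.categories_[0].tolist()

print(categories)
print(encoded)
```

Each row ends up with a single 1 in the column for its category and 0 elsewhere, which is the numeric form a model can consume.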

In this video, you'll learn how to use OneHotEncoder and ColumnTransformer to encode your categorical features and prepare your feature matrix in a single step. You'll also learn how to include this step within a Pipeline so that you can cross-validate your model and preprocessing steps simultaneously. Finally, you'll learn why you should use scikit-learn (rather than pandas) for preprocessing your dataset.
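
The workflow described above might look roughly like the following sketch. It is not the code from the video: the toy DataFrame, its column names, the choice of LogisticRegression, and `cv=2` are all assumptions made so the example is small and self-contained.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data (assumed for illustration only)
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male",
            "female", "male", "female", "female", "male"],
    "Embarked": ["S", "C", "S", "Q", "S", "C", "Q", "S", "C", "S"],
    "Parch": [0, 1, 0, 2, 0, 1, 0, 0, 2, 1],
    "Survived": [0, 1, 1, 0, 0, 1, 0, 1, 1, 0],
})
X = df[["Sex", "Embarked", "Parch"]]
y = df["Survived"]

# One-hot encode the categorical columns; pass the numeric column through.
# handle_unknown="ignore" avoids errors when a validation fold contains a
# category that was absent from the training fold.
column_trans = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["Sex", "Embarked"]),
    remainder="passthrough",
)

# Two-step Pipeline: preprocessing, then the model
pipe = make_pipeline(column_trans, LogisticRegression())

# Preprocessing is re-fit inside each fold, so it is cross-validated too
scores = cross_val_score(pipe, X, y, cv=2, scoring="accuracy")

# Fit on all data, then predict on new, un-encoded data
pipe.fit(X, y)
X_new = pd.DataFrame({"Sex": ["female"], "Embarked": ["C"], "Parch": [1]})
pred = pipe.predict(X_new)
```

Because the encoder lives inside the Pipeline, new data can be passed in its raw form and is encoded the same way as the training data.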

AGENDA:
0:00 Introduction
0:22 Why should you use a Pipeline?
2:30 Preview of the lesson
3:35 Loading and preparing a dataset
6:11 Cross-validating a simple model
10:00 Encoding categorical features with OneHotEncoder
15:01 Selecting columns for preprocessing with ColumnTransformer
19:00 Creating a two-step Pipeline
19:54 Cross-validating a Pipeline
21:44 Making predictions on new data
23:43 Recap of the lesson
24:50 Why should you use scikit-learn (rather than pandas) for preprocessing?
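
The description does not spell out the argument behind the last agenda item. One commonly cited reason, offered here as an assumption rather than as the video's own reasoning, is that a fitted OneHotEncoder keeps the same output columns for new data, whereas `pandas.get_dummies` derives its columns from whatever categories happen to be present. A minimal sketch with hypothetical toy data:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data and new data (assumed for illustration)
train = pd.DataFrame({"Embarked": ["S", "C", "Q", "S"]})
new = pd.DataFrame({"Embarked": ["S", "S"]})

# pandas: the columns depend on the categories present in each frame
train_dummies = pd.get_dummies(train)
new_dummies = pd.get_dummies(new)
print(train_dummies.shape[1], new_dummies.shape[1])

# scikit-learn: categories are learned once, from the training data
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)
new_encoded = encoder.transform(new).toarray()
print(new_encoded.shape)
```

Here the two pandas frames have different numbers of columns, while the encoder's output for the new data keeps the three-column layout learned from the training data.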

CODE FROM THIS VIDEO: https://github.com/justmarkham/scikit...

WANT TO JOIN MY NEXT LIVE WEBCAST? Become a member ($5/month):
Patreon: dataschool


=== RELATED RESOURCES ===

OneHotEncoder documentation: https://scikit-learn.org/stable/modul...
ColumnTransformer documentation: https://scikit-learn.org/stable/modul...
Pipeline documentation: https://scikit-learn.org/stable/modul...

My video on cross-validation: Selecting the best model in scikit-le...
My video on grid search: How to find the best model parameters...
My lesson notebook on StandardScaler: https://nbviewer.jupyter.org/github/j...


=== WANT TO GET BETTER AT MACHINE LEARNING? ===

1) WATCH my scikit-learn video series: Machine learning in Python with sciki...

2) SUBSCRIBE for more videos: https://www.youtube.com/dataschool?su...

3) ENROLL in my Machine Learning course: https://www.dataschool.io/learn/

4) LET'S CONNECT!
- Newsletter: https://www.dataschool.io/subscribe/
- Twitter: justmarkham
- Facebook: DataScienceSchool
- LinkedIn: justmarkham
Published 5 years ago, on 1398/08/21 (Solar Hijri calendar; 12 November 2019).
137,795 views