XGBoost: Complete classification steps with Python | Data Analysis | Supervised Learning | Titanic

RVStats Consulting
1.5K views · 4 years ago
In this episode, we walk step by step through a complete machine learning classification analysis with the extreme gradient boosting (XGBoost) classifier, using the Titanic dataset in a Python Jupyter notebook. We use the pandas library for data manipulation, matplotlib for graphics, scikit-learn for performance metrics, and XGBoost for the classifier.
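As a rough sketch of the preprocessing steps described above (imputing missing values and encoding nominal features with pandas), here is a minimal example on a tiny hand-made frame with Titanic-like columns; the values are illustrative, not the real dataset:

```python
import pandas as pd

# Tiny illustrative frame mimicking a few Titanic columns (made-up rows,
# not the actual Titanic data used in the video)
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1],
    "Pclass":   [3, 1, 3, 2, 1],
    "Sex":      ["male", "female", "female", "male", "female"],
    "Age":      [22.0, 38.0, None, 35.0, None],   # missing values to impute
    "Fare":     [7.25, 71.28, 7.92, 8.05, 53.10],
})

# Impute missing Age with the median, a common simple strategy
df["Age"] = df["Age"].fillna(df["Age"].median())

# One-hot encode the nominal feature Sex (dummy variables)
df = pd.get_dummies(df, columns=["Sex"], drop_first=True)

print(df.isna().sum().sum())   # no missing values remain
print(list(df.columns))        # Sex replaced by a Sex_male dummy column
```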

- The data: where to get the dataset and what it contains
- Exploratory analysis and visualization
- Feature selection: choosing the variables to use
- Imputing missing values
- Encoding variables: dummy variables, categorical and nominal features
- Dropping variables
- Splitting into train and test sets
- Decision trees and boosting
- Hyperparameters and parameters: learning rate, max depth, gpu_id,
  number of estimators
- Building and fitting the model
- Interpreting variable importance
- Prediction for new values
- Performance measurement using accuracy
- The confusion matrix, false positives and false negatives
- Overfitting and underfitting
- How to improve performance: feature engineering
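The modeling steps in the list above can be sketched in a few lines. This is a minimal, self-contained example using scikit-learn's GradientBoostingClassifier as a stand-in (so it runs without xgboost installed) on synthetic data rather than the Titanic set; note that xgboost.XGBClassifier accepts the same core hyperparameters shown here (learning_rate, max_depth, n_estimators):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data (stand-in for the Titanic features)
X, y = make_classification(n_samples=400, n_features=6, random_state=0)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Gradient boosting with the hyperparameters discussed in the video;
# xgboost.XGBClassifier takes these same parameter names
model = GradientBoostingClassifier(
    learning_rate=0.1, max_depth=3, n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predict and measure performance
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)   # overall accuracy
cm = confusion_matrix(y_test, y_pred)  # rows: true class, cols: predicted
importances = model.feature_importances_  # variable importance

print(f"accuracy: {acc:.3f}")
print(cm)
```

The off-diagonal entries of the confusion matrix are the false positives and false negatives discussed in the video.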

Regression with XGBoost: XGBoost: Regression step by step with...

Clustering in Python
V-1 Hierarchical clustering with Pyth...

Clustering in R
Hierarchical Clustering | Agrupamient...

Any comments or suggestions are welcome.

Contact: [email protected]

My statistics channel in Spanish:
@rvstats_es

Machine learning
Easy data science
Supervised and unsupervised learning
Statistical analysis
Survival analysis
Factors
Independent and dependent
input and output

Python for beginners
Python from zero
Python basics
Python data analysis
Published 4 years ago, on 1399/09/28 (18 December 2020).
1,551 views