ML Production Pipelines: A Classification Model

Databricks
In this talk, we will present how we tied Python together with Databricks and MLflow to productionalize a machine learning pipeline.

Through the deployment of a fairly standard classification model, we will show what a machine learning pipeline in production could look like. The project consists of two pipelines: training and prediction. An S3 bucket serves as the data source. The training pipeline trains various models on the data, registers them in MLflow, and stores all metrics and hyperparameters. Using grid search, the best model is chosen and moved to the Production stage in MLflow. The Production model can then be deployed using Flask, or as a UDF if we want to process data in batch. The prediction pipeline then uses the deployed model to make predictions, either on demand or in batch.
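The training flow described above might be sketched roughly as follows. This is not the code from the talk: it is a minimal illustration that assumes scikit-learn and MLflow (2.x API) are installed and that a tracking server with a model registry is configured. The function names, the `classifier-demo` model name, and the choice of `LogisticRegression` are all hypothetical. The grid is expanded by hand, rather than with scikit-learn's `GridSearchCV`, so that every candidate gets its own MLflow run.

```python
from itertools import product


def expand_grid(param_grid):
    """Yield one dict per combination in the hyperparameter grid."""
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))


def pick_best(results):
    """Return the result tuple with the highest score (second element)."""
    return max(results, key=lambda item: item[1])


def train_and_register(X, y, param_grid, model_name="classifier-demo"):
    """Train one model per grid point, log each run, promote the best.

    Hypothetical helper: model_name and the estimator are assumptions.
    Third-party imports are kept local so the pure-Python helpers above
    can be used without MLflow or scikit-learn installed.
    """
    import mlflow
    import mlflow.sklearn
    from mlflow.tracking import MlflowClient
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    results = []
    for params in expand_grid(param_grid):
        with mlflow.start_run() as run:
            model = LogisticRegression(**params)
            # Mean cross-validated accuracy is the selection metric here.
            score = float(cross_val_score(model, X, y, cv=3).mean())
            model.fit(X, y)
            mlflow.log_params(params)
            mlflow.log_metric("cv_accuracy", score)
            mlflow.sklearn.log_model(model, "model")
            # Register this run's model as a new version in the registry.
            version = mlflow.register_model(
                f"runs:/{run.info.run_id}/model", model_name
            )
        results.append((params, score, version.version))

    best_params, best_score, best_version = pick_best(results)
    # Stage transitions are deprecated in recent MLflow releases in
    # favour of aliases, but they match the workflow in the abstract.
    MlflowClient().transition_model_version_stage(
        name=model_name, version=best_version, stage="Production"
    )
    return best_params, best_score
```

Under the same assumptions, the prediction side could load the promoted model with `mlflow.pyfunc.load_model("models:/classifier-demo/Production")` and call it from a Flask route, or wrap it with `mlflow.pyfunc.spark_udf` for batch scoring.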

The whole project is packaged as a library, which can be installed anywhere, and the pipelines can easily be configured through configuration files.
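As a rough illustration of configuring the pipelines through files, the sketch below parses an INI-style configuration with Python's standard `configparser`. The talk does not specify the file format or the settings involved, so the section names, keys, bucket path, and the `PipelineConfig` class are all assumptions made for this example.

```python
import configparser
from dataclasses import dataclass

# Hypothetical configuration; in practice this would live in a file
# shipped alongside the installed library.
EXAMPLE_CONFIG = """
[data]
bucket = s3://example-bucket/training-data

[training]
model_name = classifier-demo
cv_folds = 3

[prediction]
mode = batch
"""


@dataclass
class PipelineConfig:
    bucket: str
    model_name: str
    cv_folds: int
    mode: str


def load_config(text):
    """Parse INI text into a typed PipelineConfig."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    mode = parser.get("prediction", "mode")
    # The abstract mentions two ways of predicting: on demand or in batch.
    if mode not in ("batch", "on-demand"):
        raise ValueError(f"unsupported prediction mode: {mode!r}")
    return PipelineConfig(
        bucket=parser.get("data", "bucket"),
        model_name=parser.get("training", "model_name"),
        cv_folds=parser.getint("training", "cv_folds"),
        mode=mode,
    )


config = load_config(EXAMPLE_CONFIG)
```

Keeping such settings outside the code means the same installed package can be pointed at a different bucket or prediction mode without modification.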

About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unifie...

See all the previous Summit sessions:

Connect with us:
Website: https://databricks.com
Facebook: databricksinc
Twitter: databricks
LinkedIn: databricks
Instagram: databricksinc

Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-nam...
Published 4 years ago, on 1399/09/28 (Solar Hijri calendar; 18 December 2020).
Viewed 3,539 times.