Finding training inefficiencies with CentML DeepView

MLOps World: Machine Learning in Production
36 views - 4 months ago
Speaker: Yubo Gao, Research Software Development Engineer at CentML Inc. and PhD student at the University of Toronto.
Performance bottlenecks and resource underutilization are common problems for deep learning researchers and developers. They slow down the workflows of ML developers and waste computational resources. The current ecosystem of DL profilers does not offer a developer-friendly way to understand DL model training performance, nor methods to reduce underutilization and improve performance. In this presentation, we showcase DeepView, an open source visual profiler developed by CentML and tailored specifically to ML developers. DeepView provides intuitive and convenient performance visualizations and offers hints that help ML developers make their training jobs more efficient. Furthermore, DeepView uses performance predictions to help choose deployment targets that meet both budget and time constraints. DeepView integrates with PyTorch and Visual Studio Code, and we are actively working on expanding its support to other popular code editors. Through an interactive demo, we walk through optimizing the training of a real model with DeepView, achieving a manifold increase in training throughput.
Published 4 months ago, on 1403/02/27 (Solar Hijri calendar).