How to test your Python ETL pipelines | Data pipeline | Pytest

BI Insights Inc
In this tutorial we are going to cover how to test ETL pipelines. I have received a number of inquiries about testing, and especially about testing the data pipelines we build using Python. Testing is an important aspect of ETL pipelines. It ensures we are delivering accurate information to our stakeholders. We want to make sure our data is current, consistent and accurate. Therefore, it is always a good idea to put test cases in place to catch data anomalies.

A failing test can tell us that:
• An assumption about our source data is incorrect. For example, a column we expected never to be null contains nulls, or a column we expected to contain unique values contains duplicates.
• Our transformation logic is flawed.

Errata in the tests: One of the viewers pointed out that the null check was always returning true. It has been revised to return false when nulls are present. The test_null_check function is updated as follows:

    def test_null_check(df):
        assert df['ProductKey'].notnull().all()

Link to GitHub repo (code & data): github.com/hnawaz007/pythondataanalysis/tree/main/…
Link to article on this topic: blog.devgenius.io/how-to-test-python-etl-pipelines…
Pytest Docs: docs.pytest.org/en/7.2.x/

#pytest #etl #python

Subscribe to our channel: youtube.com/c/HaqNawaz
---------------------------------------------
Follow me on social media!
Github: github.com/hnawaz007
Instagram: www.instagram.com/bi_insights_inc
LinkedIn: www.linkedin.com/in/haq-nawaz/
---------------------------------------------
Topics covered in this video:
0:00 - Introduction to ETL testing
0:56 - Benefit of testing
1:32 - Pytest testing library overview
2:26 - Pytest setup
3:05 - Import Data
3:36 - First test - column check
6:08 - Primary key column tests
7:22 - Pytest features
8:15 - Data Type check
9:36 - Expected Values check
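The checks listed in the topics above (column check, primary key tests, data type check, expected values check) could look roughly like the sketch below. This is a minimal illustration, not the code from the video or the linked repo. Only the ProductKey column and the test_null_check function come from the description; the load_products helper, the ProductName and Color columns, the sample rows and the allowed color set are all assumptions made up for the example.

```python
import pandas as pd
import pytest


def load_products() -> pd.DataFrame:
    # Hypothetical stand-in for the pipeline's extract step. The tutorial
    # reads its data from the linked repo; this sample is invented.
    return pd.DataFrame(
        {
            "ProductKey": [1, 2, 3],                      # name from the errata
            "ProductName": ["Bike", "Helmet", "Gloves"],  # assumed column
            "Color": ["Red", "Black", "Black"],           # assumed column
        }
    )


@pytest.fixture
def df() -> pd.DataFrame:
    # pytest injects this DataFrame into every test that takes a `df` argument.
    return load_products()


def test_column_exists(df):
    # Column check: the expected column is present.
    assert "ProductKey" in df.columns


def test_null_check(df):
    # Primary key check: fails if any ProductKey is null (the errata version).
    assert df["ProductKey"].notnull().all()


def test_unique_check(df):
    # Primary key check: fails if any ProductKey value is duplicated.
    assert df["ProductKey"].is_unique


def test_datatype_check(df):
    # Data type check: the key column should hold integers.
    assert pd.api.types.is_integer_dtype(df["ProductKey"])


def test_expected_values(df):
    # Expected values check: Color must come from an assumed allowed set.
    allowed = {"Red", "Black", "Silver"}
    assert df["Color"].isin(allowed).all()
```

Saved as a file whose name starts with test_ (for example test_products.py, a name chosen here for illustration), these tests can be run with the pytest command from the same directory.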
Published 2 years ago, on 1401/10/11.
13,537 views