How to build and automate an ETL pipeline with AWS Airflow | AWS End-To-End Data Engineering Project

Data Tech
In this data engineering project, we're creating a data pipeline on Amazon Web Services (AWS) using Airflow, Python, Spark, Glue, Redshift, and other AWS services. We will learn how to build and automate an ETL process that extracts weather data from the OpenWeatherMap API, transforms it using Spark, and loads it into Redshift, with Apache Airflow orchestrating the steps.
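
To make the extract and transform steps concrete, here is a minimal sketch in plain Python. It is not the project's code: the video does the transform in Spark, and the function names, the chosen output columns, and the sample payload are all assumptions made for illustration. The endpoint and field names follow OpenWeatherMap's current-weather response, which reports temperature in Kelvin by default.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Current-weather endpoint of the OpenWeatherMap API.
BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_request_url(city: str, api_key: str) -> str:
    """Extract step (request only): build the URL for one city's current weather."""
    return f"{BASE_URL}?{urlencode({'q': city, 'appid': api_key})}"


def flatten_weather(payload: dict) -> dict:
    """Transform step: flatten the nested JSON into one row for a warehouse table."""
    main = payload["main"]
    observed = datetime.fromtimestamp(payload["dt"], tz=timezone.utc)
    return {
        "city": payload["name"],
        "description": payload["weather"][0]["description"],
        # The API returns Kelvin unless a units parameter is passed.
        "temp_celsius": round(main["temp"] - 273.15, 2),
        "humidity_pct": main["humidity"],
        "wind_speed_ms": payload["wind"]["speed"],
        "observed_at": observed.isoformat(),
    }


# Hand-written sample payload (not real API output) so the sketch runs offline.
sample = {
    "name": "Toronto",
    "dt": 1700000000,
    "weather": [{"description": "light rain"}],
    "main": {"temp": 293.15, "humidity": 81},
    "wind": {"speed": 4.1},
}

row = flatten_weather(sample)
print(build_request_url("Toronto", "YOUR_API_KEY"))
print(row)
```

In the actual pipeline, the flattened rows would be written out (for example to S3) and loaded into Redshift, with Airflow scheduling each step.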

Here, the necessary infrastructure is set up using code-based deployment.
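
As a rough illustration of what code-based deployment can look like, the sketch below builds a tiny CloudFormation template as a Python dictionary and prints it as JSON. It assumes CloudFormation is the tool in use (the hashtags mention it, but the description does not say so explicitly). The resource name `WeatherDataBucket`, the bucket name, and the `build_template` helper are hypothetical, and the project's real template defines far more than a single bucket.

```python
import json


def build_template(bucket_name: str) -> dict:
    """Return a minimal CloudFormation template that declares one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative stack: one S3 bucket for raw weather data",
        "Resources": {
            # Logical ID is hypothetical; any alphanumeric name works.
            "WeatherDataBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {"BucketName": bucket_name},
            }
        },
        "Outputs": {
            # Expose the bucket ARN so other stacks or scripts can reference it.
            "BucketArn": {"Value": {"Fn::GetAtt": ["WeatherDataBucket", "Arn"]}}
        },
    }


template = build_template("my-weather-etl-bucket")  # hypothetical bucket name
template_json = json.dumps(template, indent=2)
print(template_json)
```

Keeping the infrastructure in a template like this means the same stack can be created, reviewed, and torn down repeatably instead of being clicked together in the console.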

This is a hands-on project, so I strongly recommend watching the entire video once without following along. That way, you'll grasp the concepts and the overall process more effectively. After that, you have two options:
Independent Attempt: Try to recreate the example I demonstrated without referring to the video. If you get stuck, consult the video to work through the problem.
Guided Follow-Along: Watch the video again, this time performing each step as it is shown. Doing the steps yourself as you watch reinforces your understanding.

Remember, the goal is to comprehend the project thoroughly while also gaining hands-on experience. The choice between trying on your own or following along with the video again depends on your learning style and comfort level.

Project Github Link - https://github.com/AnandDedha/aws-air...

00:00 Introduction
01:05 Understand the Project Architecture & Prerequisites
04:00 Open weather API
09:17 Deployment of the Infrastructure using code
15:30 Establish variables, create connections, and transfer DAGs to the DAGs directory
32:07 Understand the project code


AWS services explained (video links):

Amazon S3 (Simple Storage Service): AWS S3 Tutorial (Part1) - Introductio...

AWS Glue: AWS Glue tutorial for beginners| AWS ...
Learn how to perform ETL & Cataloging...

Amazon Redshift: Amazon Redshift - A Beginner's Guide ...

#aws
#awsdataengineer
#awsdataanalytics
#awsbigdata
#AWSDataEngineering
#awstraining
#awscloudpractitioner
#awsclouddataengineer
#awscloudformation
#pyspark
#awsglue
#awss3
#redshift
#airflow


AWS Big Data
AWS Data Engineer
AWS Data Analytics
AWS
AWS Data Engineering
Data Engineering Architecture
AWS Glue
AWS Redshift
AWS S3
AWS Datapipeline
AWS Airflow

Email: [email protected]
Published 12 months ago, on 1402/05/24 (Solar Hijri calendar).
Viewed 11,942 times.