Working PySpark with JSON file | How to work with JSON file using Spark | dr.dataspark

dr.dataspark
Hey Folks,

In this video, I show how to work with JSON files using Spark. Most of the time, the data you deal with arrives in an unstructured format, and you have no choice but to work with dictionary-like data. As a data engineer, you should always have experience handling structured, semi-structured, and unstructured data.

I demonstrate one such example: handling data that arrives in JSON format. In fact, the data is nested JSON, which is a common source of trouble in production, because nested JSON often causes problems when Spark infers the schema.

To help a larger audience follow along, I spend more time on the background of Spark, a little on Hadoop, the different languages you can use to write Spark code, and why one should work on Databricks when building pipelines and cloud infrastructure. I also explain the difference between on-premises and cloud architecture, and why modern projects store data in file formats rather than in structured tables.

Do like, share, and comment with your thoughts; it motivates me to contribute more to the community and create more videos like this.

My handles are below:
Follow me on Instagram: dr.dataspark
Connect with me on LinkedIn: bigdatamanoj

#bigdata #pyspark #json #dataspark
Published 2 years ago, on 1401/01/15 (Solar Hijri calendar).
8,594 views