Big Data Engineer Mock Interview | Questions on Data Skewness | Salting | Out of Memory Error
7.7K views - 3 months ago
To enhance your career as a Cloud Data Engineer, check trendytech.in/?src=youtube&sub=mockdec for curated courses developed by me.
Want to master SQL? Learn SQL the right way through the most sought-after course, the SQL Champions Program!
"An 8-week program designed to help you crack the interviews of top product-based companies by developing a thought process and an approach to solve an unseen problem."
Here is how you can register for the program:
Registration link (course access from India): rzp.io/l/SQLINR
Registration link (course access from outside India): rzp.io/l/SQLUSD
I have trained over 20,000 professionals in the field of Data Engineering in the last 5 years.
BIG DATA INTERVIEW SERIES
This mock interview series is launched as a community initiative under the Data Engineers Club, aimed at aiding the community's growth and development.
Our highly experienced guest interviewer, Chandrali Sarkar (www.linkedin.com/in/chandrali-sarkar-4570a1102/), shares invaluable insights and practical guidance drawn from her extensive expertise in the Big Data domain.
Our expert guest interviewee, Soumya Ranjan Parida (www.linkedin.com/in/soumya-parida/), has an interesting approach to answering the interview questions on Apache Spark, SQL and Azure Cloud Services.
Links to the free SQL and Python series developed by me are given below:
SQL Playlist - • SQL tutorial for everyone by Sumit Si...
Python Playlist - • Complete Python By Sumit Mittal Sir
Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!
Social Media Links :
LinkedIn - www.linkedin.com/in/bigdatabysumit/
Twitter - twitter.com/bigdatasumit
Instagram - www.instagram.com/bigdatabysumit/
Student Testimonials - trendytech.in/#testimonials
TIMESTAMPS : Questions Discussed
00:35 Introduction
01:40 Explain your project's end-to-end pipeline and overview.
03:17 What is the data source for your project?
03:36 Where does the data get ingested?
04:36 What types of data are being processed?
05:04 How do you capture incremental data in an OLTP environment?
07:52 What is the frequency and volume of the incoming data?
08:28 Which file formats have you worked with?
09:00 What is predicate pushdown?
10:14 What optimizations have you applied in Spark?
10:45 Define broadcast join.
11:10 List some transformations you've used in Spark.
11:27 Explain narrow and wide transformations.
12:03 What is the difference between reduceByKey and groupByKey?
12:56 Have you encountered "out of memory" errors in Spark? How did you resolve them?
14:22 How does salting help in resolving out-of-memory errors?
14:46 What is data skewness?
15:22 Explain cache and persist in Spark.
16:57 What happens if both memory and disk are full?
17:40 When would you use coalesce and repartition?
18:00 Provide a scenario where coalesce and repartition can be used.
18:38 Does repartition happen at the driver or the executor level?
19:30 What is the difference between rank, dense rank, and row number functions?
22:06 Describe the internal process of submitting a Spark job.
Music track: Retro by Chill Pulse
Source: freetouse.com/music
Background Music for Video (Free)
Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs
Published on 1403/03/08 (Solar Hijri calendar).
7,721 views