26. Time Travel/Versioning in Delta Table

CloudFitness
8.9K views - 2 years ago
Follow me on LinkedIn
LinkedIn: bhawna-bedi-540398102

Instagram
https://www.instagram.com/bedi_foreve...
What is Delta Lake?

Delta Lake is an open source storage layer that brings reliability to data lakes.

Delta Lake stores table data as Apache Parquet files and adds a transaction log on top. It provides ACID transactions, scalable metadata handling, and unified streaming and batch data processing.
Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs.

Delta features overview

ACID transactions on Spark - Note that this alone does not turn Delta into a full RDBMS.
Scalable metadata handling
Streaming and batch unification - A table in Delta Lake is a batch table as well as a streaming source and sink.
Schema enforcement - Automatically handles schema variations to prevent insertion of bad records during ingestion.
Time travel - Data versioning enables rollbacks and historical audit trails, going back as far as your retention settings allow.
Upserts and deletes - Supports merge, update and delete operations to enable complex use cases like change data capture (CDC), slowly changing dimension (SCD) operations, streaming upserts, and so on.
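
To make the time travel and upsert/delete features above more concrete, here is a minimal sketch in plain Python. It is a toy model, not how Delta Lake is implemented: the class `ToyVersionedTable` and all of its methods are hypothetical names invented for this illustration, and it keeps a full in-memory copy of the table per commit, whereas Delta Lake keeps Parquet data files plus a transaction log. The only idea it tries to show is that every write creates a new table version, so older versions stay readable and can be restored.

```python
from copy import deepcopy


class ToyVersionedTable:
    """Toy in-memory table that keeps one snapshot per commit.

    Hypothetical illustration only: real Delta Lake stores Parquet data
    files plus a transaction log rather than full copies of the data.
    """

    def __init__(self):
        # Version 0 is the empty table created here.
        self._snapshots = [{}]      # list index == version number
        self._history = ["CREATE"]  # operation name per version

    def _commit(self, rows, operation):
        # Every write appends a new snapshot instead of changing an old one.
        self._snapshots.append(rows)
        self._history.append(operation)
        return len(self._snapshots) - 1  # the new version number

    def upsert(self, key, row):
        """Insert the row, or overwrite it if the key already exists."""
        rows = deepcopy(self._snapshots[-1])
        rows[key] = row
        return self._commit(rows, "MERGE")

    def delete(self, key):
        """Remove the row with this key, if present."""
        rows = deepcopy(self._snapshots[-1])
        rows.pop(key, None)
        return self._commit(rows, "DELETE")

    def read(self, version_as_of=None):
        """Return the latest snapshot, or the snapshot at a given version."""
        if version_as_of is None:
            version_as_of = len(self._snapshots) - 1
        return deepcopy(self._snapshots[version_as_of])

    def restore(self, version):
        """Roll back by committing an old snapshot as a new version."""
        return self._commit(self.read(version), f"RESTORE to v{version}")

    def history(self):
        """Return (version, operation) pairs, oldest first."""
        return list(enumerate(self._history))


table = ToyVersionedTable()
v1 = table.upsert(1, {"name": "alice", "plan": "basic"})    # insert
v2 = table.upsert(1, {"name": "alice", "plan": "premium"})  # update
v3 = table.delete(1)                                        # delete

old_rows = table.read(version_as_of=v1)  # time travel to the first write
latest_rows = table.read()               # current state: row is gone
v4 = table.restore(v2)                   # rollback, recorded as a new version
restored_rows = table.read()
```

In Delta Lake itself, the comparable operations are a versioned read such as `spark.read.format("delta").option("versionAsOf", 1).load(path)` (where `path` is a placeholder for your table location), or in SQL `SELECT * FROM my_table VERSION AS OF 1`, `DESCRIBE HISTORY my_table`, and `RESTORE TABLE my_table TO VERSION AS OF 2`, with `my_table` standing in for your own table name.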

Databricks hands-on tutorials
Databricks hands on tutorial(Pyspark/...

Azure Event Hubs
Azure Event Hubs

Azure Data Factory Interview Questions
Azure Data Factory Interview Questions

SQL LeetCode Questions
SQL Interview Questions(LeetCode/Hack...

Azure Synapse tutorials
Azure Synapse Analytics Hands-on Tuto...

Azure Event Grid
Event Grid

Azure Data Factory CI-CD
CI-CD in Azure Data Factory

Azure Basics
Azure Basics

Databricks interview questions
DataBricks Interview Questions
Published 2 years ago, on 1400/12/21 (Solar Hijri calendar).
Viewed 8,998 times