Kafka Connect in Action: Loading a CSV file into Kafka

Robin Moffatt
Ingesting data from a CSV file into a Kafka topic can be done easily using Kafka Connect. Kafka Connect is part of Apache Kafka, and only requires a JSON file to configure - no coding!
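As a rough illustration of what "only a JSON file" can look like, the sketch below builds a configuration for the Spooldir CSV source connector and prepares (without sending) the REST call that would submit it to Kafka Connect. The connector name, topic, directory paths, file pattern, and the `localhost:8083` address are assumptions chosen for illustration, not values taken from the video.

```python
import json
import urllib.request

# Assumption: the Kafka Connect REST API is reachable on its default port.
CONNECT_URL = "http://localhost:8083"


def csv_source_config(topic, input_path, finished_path, error_path, pattern):
    """Build a config dict for the Spooldir CSV source connector."""
    return {
        "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
        "topic": topic,
        "input.path": input_path,        # directory watched for new files
        "finished.path": finished_path,  # processed files are moved here
        "error.path": error_path,        # files that failed are moved here
        "input.file.pattern": pattern,   # regex matched against file names
        # Use the header row for field names and let the connector infer a schema.
        "csv.first.row.as.header": "true",
        "schema.generation.enabled": "true",
    }


def put_connector_request(name, config):
    """Prepare, but do not send, PUT /connectors/<name>/config."""
    return urllib.request.Request(
        url=f"{CONNECT_URL}/connectors/{name}/config",
        data=json.dumps(config).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical names and paths, for illustration only.
config = csv_source_config(
    topic="orders_spooldir_00",
    input_path="/data/unprocessed",
    finished_path="/data/processed",
    error_path="/data/error",
    pattern=r".*\.csv",
)
request = put_connector_request("source-csv-spooldir-00", config)

print(request.get_method(), request.full_url)
print(json.dumps(config, indent=2))
```

Passing `request` to `urllib.request.urlopen` would submit the configuration to a running Connect worker; the sketch stops short of that so it can be run without a cluster.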

This video shows you how to stream CSV files into Kafka and how to manage the schema. As a bonus, it also shows how to stream the same data out to a database such as Postgres and how to process it with ksqlDB.
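For the database part, a configuration along these lines could be used with the Kafka Connect JDBC sink connector. This is a minimal sketch rather than the exact configuration from the video: the connection URL, credentials, topic name, and the `order_id` key field are hypothetical, and it assumes the messages on the topic carry a schema.

```python
import json


def jdbc_sink_config(topic, connection_url, user, password, pk_field):
    """Build a config dict for the Kafka Connect JDBC sink connector."""
    return {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "connection.url": connection_url,
        "connection.user": user,
        "connection.password": password,
        "topics": topic,
        # Create the target table from the record schema if it does not exist.
        "auto.create": "true",
        # Update existing rows in place instead of appending duplicates.
        "insert.mode": "upsert",
        # Take the primary key from a field in the message value.
        "pk.mode": "record_value",
        "pk.fields": pk_field,
    }


# Hypothetical connection details and field name, for illustration only.
sink = jdbc_sink_config(
    topic="orders_spooldir_00",
    connection_url="jdbc:postgresql://postgres:5432/",
    user="postgres",
    password="postgres",
    pk_field="order_id",
)

print(json.dumps(sink, indent=2))
```

With `insert.mode` set to `upsert`, re-ingesting a CSV row whose key already exists updates that database row rather than inserting a second copy.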

To learn more about Kafka Connect see https://docs.confluent.io/current/con...

Code used in this video: https://github.com/confluentinc/demo-...
--

ℹ️ Table of contents

00:00:44 Brief introduction to Kafka Connect
00:01:39 Checking that the correct Kafka Connect plugin is installed
00:02:46 Kafka, Bytes, and Schemas
00:05:06 Creating a connector to ingest data from CSV file and set the schema based on header row
00:09:43 File metadata stored in the Kafka message header
00:10:18 Setting the message key
00:14:35 Manipulating the schema / changing data types
00:19:22 Ingesting raw CSV into a Kafka topic without a Schema
00:22:43 Streaming data from CSV into Kafka into a Database
00:28:27 Updating database rows in-place from CSV file ingest
00:28:42 Impromptu Kafka Connect troubleshooting ;-)
00:32:34 Filtering and aggregating CSV data in Kafka using ksqlDB
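Several of the chapters above (setting the message key, changing data types, ingesting raw CSV) come down to a few extra configuration properties. The sketch below shows one way these might be expressed; the field names `order_id` and `cost`, the transform alias `castTypes`, and the topic and paths are hypothetical.

```python
import json


def with_key_and_types(config, key_field, casts):
    """Return a copy of a Spooldir CSV config with a message key and a Cast transform.

    `casts` maps field names to Kafka Connect types such as int32 or float32.
    """
    spec = ",".join(f"{field}:{type_}" for field, type_ in casts.items())
    updated = dict(config)
    updated.update({
        # Field from the generated schema to use as the message key.
        "schema.generation.key.fields": key_field,
        # Single Message Transform that casts value fields to the given types.
        "transforms": "castTypes",
        "transforms.castTypes.type": "org.apache.kafka.connect.transforms.Cast$Value",
        "transforms.castTypes.spec": spec,
    })
    return updated


def raw_lines_config(topic, input_path, finished_path, error_path, pattern):
    """Config that ingests each line of a file as a plain string, with no schema."""
    return {
        "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirLineDelimitedSourceConnector",
        "value.converter": "org.apache.kafka.connect.storage.StringConverter",
        "topic": topic,
        "input.path": input_path,
        "finished.path": finished_path,
        "error.path": error_path,
        "input.file.pattern": pattern,
    }


# Hypothetical base config, for illustration only.
base = {
    "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
    "topic": "orders_spooldir_02",
    "csv.first.row.as.header": "true",
    "schema.generation.enabled": "true",
}
typed = with_key_and_types(base, "order_id", {"order_id": "int32", "cost": "float32"})
raw = raw_lines_config(
    "orders_raw", "/data/unprocessed", "/data/processed", "/data/error", r".*\.csv"
)

print(json.dumps(typed, indent=2))
print(json.dumps(raw, indent=2))
```

The first variant keeps the inferred schema and adjusts it; the second skips schema handling entirely and leaves parsing to whatever consumes the topic.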

--

💾 Download Kafka Connect Spooldir plugin: https://www.confluent.io/hub/jcustenb...

🎈 Kafka Connect Spooldir docs: https://docs.confluent.io/current/con...
🎈 Single Message Transforms (SMT): https://docs.confluent.io/current/con...
🎈 Cast SMT: https://docs.confluent.io/current/con...
🎈 Kafka Connect JDBC Source connector: https://www.confluent.io/blog/kafka-c...
🎈 Kafka Connect JDBC Sink connector docs: https://docs.confluent.io/current/con...

🎥 Installing JDBC Driver for Kafka Connect: https://rmoff.dev/fix-jdbc-driver-video
🎥 Streaming data from Kafka to a database: https://rmoff.dev/kafka-jdbc-video
🎥 JDBC sink and schemas: https://rmoff.dev/jdbc-sink-schemas
🎥 Introduction to ksqlDB: https://rmoff.dev/ksqldb-introduction

--

☁️ Confluent Cloud ☁️
Confluent Cloud is a managed Apache Kafka and Confluent Platform service. It scales to zero and lets you get started with Apache Kafka at the click of a mouse. You can sign up at https://confluent.cloud/signup?utm_so... and use code 60DEVADV for $60 towards your bill (small print: https://www.confluent.io/confluent-cl...)
Published 4 years ago, on 1399/03/30 (Solar Hijri calendar).
Viewed 24,506 times