Solution Environment Setup
The solution environment assumes that the following components are configured and running:
Configure Dell ECS Object Store as the primary storage cluster for the Dell Data Lakehouse. To do so, log in to the Dell Data Lakehouse system software UI and configure Dell ECS under Storage.
Configure the Hive catalog to store Hive tables on the Dell ECS object storage of the Dell Data Lakehouse. To do so, log in to the Dell Data Lakehouse system software UI, go to Catalogs > Connect Catalog, click +Add, set Type to Hive, and provide the configuration parameters.
Configure the Iceberg catalog to store Iceberg open tables on the Dell ECS object storage of the Dell Data Lakehouse. To do so, log in to the Dell Data Lakehouse system software UI, go to Catalogs > Connect Catalog, click +Add, set Type to Iceberg, and provide the configuration parameters.
On the utility node designated to run the Debezium server, clone, build, and unpack the Debezium Iceberg server:
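git clone https://github.com/memiiso/debezium-server-iceberg.git
cd debezium-server-iceberg
mvn -Passembly -Dmaven.test.skip package
unzip debezium-server-iceberg-dist/target/debezium-server-iceberg-dist*.zip -d cdcapp
cd cdcapp
Set the following properties in the Debezium Server configuration file (Debezium Server normally reads conf/application.properties):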
# Use iceberg sink
debezium.sink.type=iceberg
# iceberg sink config
debezium.sink.iceberg.table-prefix=debeziumcdc_
debezium.sink.iceberg.upsert=true
debezium.sink.iceberg.upsert-keep-deletes=true
debezium.sink.iceberg.write.format.default=parquet
debezium.sink.iceberg.catalog-name=iceberg
# Config with hive metastore catalogs
debezium.sink.iceberg.type=hive
debezium.sink.iceberg.uri=thrift://DDAE_THRIFT_FQDN:9083
debezium.sink.iceberg.clients=5
debezium.sink.iceberg.warehouse=s3a://lakehouse/iceberg_warehouse
debezium.sink.iceberg.catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO
debezium.sink.iceberg.s3.access-key-id=MY_ACCESS_KEY
debezium.sink.iceberg.s3.secret-access-key=MY_SECRET_KEY
debezium.sink.iceberg.engine.hive.enabled=true
debezium.sink.iceberg.iceberg.engine.hive.enabled=true
debezium.sink.hive.metastore.sasl.enabled=false
debezium.sink.iceberg.hive.metastore.sasl.enabled=false
# enable event schemas - mandatory
debezium.format.value.schemas.enable=true
debezium.format.value=json
debezium.transforms=unwrap
debezium.transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
debezium.transforms.unwrap.add.fields=op,table,source.ts_ms,db
debezium.transforms.unwrap.delete.handling.mode=rewrite
debezium.transforms.unwrap.drop.tombstones=true
# mysql source
debezium.source.connector.class=io.debezium.connector.mysql.MySqlConnector
debezium.source.offset.flush.interval.ms=0
debezium.source.database.hostname=utility_node
debezium.source.database.port=3306
debezium.source.database.user=mysql
debezium.source.database.password=mysql
debezium.source.database.dbname=company
debezium.source.database.server.name=mysql80
debezium.source.database.server.id=1234
debezium.source.schema.include.list=orders
debezium.source.topic.prefix=dbz_
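To make the effect of the unwrap transform settings more concrete, the following Python sketch imitates how a change event envelope might be flattened when add.fields=op,table,source.ts_ms,db and delete.handling.mode=rewrite are set. It is a simplified illustration and not Debezium's actual code: the unwrap function, the sample event, and its values are hypothetical, and the field naming (a double-underscore prefix, with the struct name kept for qualified fields such as source.ts_ms) follows our reading of the Debezium documentation for ExtractNewRecordState.

```python
# Simplified illustration of the "unwrap" settings; not Debezium's actual code.
ADD_FIELDS = ["op", "table", "source.ts_ms", "db"]  # mirrors transforms.unwrap.add.fields


def unwrap(event):
    """Flatten a Debezium-style change event envelope into a single row dict."""
    op = event["op"]
    # delete.handling.mode=rewrite: keep the deleted row's last values and flag it
    row = dict(event["before"] if op == "d" else event["after"])
    row["__deleted"] = "true" if op == "d" else "false"
    for field in ADD_FIELDS:
        if field == "op":
            row["__op"] = op
        elif field.startswith("source."):
            # struct-qualified names become __source_<field>
            name = field.split(".", 1)[1]
            row["__source_" + name] = event["source"][name]
        else:
            # unqualified names such as table and db are read from the source block
            row["__" + field] = event["source"][field]
    return row


# Hypothetical delete event for one row of the orders table (values simplified)
delete_event = {
    "op": "d",
    "before": {"order_id": 2, "customer_id": 2, "total_amount": "200.00"},
    "after": None,
    "source": {"db": "company", "table": "orders", "ts_ms": 1700000000000},
}

flat = unwrap(delete_event)
print(flat)
```

Under these assumptions, a delete no longer arrives as an empty record: the row keeps its last known column values and carries a __deleted marker, which is what allows the sink to retain deleted rows when upsert-keep-deletes is enabled.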
Log in to the MySQL database, create the company database and the orders table, and insert some records.
CREATE DATABASE company;
USE company;
CREATE TABLE orders (
order_id INT AUTO_INCREMENT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL,
total_amount DECIMAL(10, 2) NOT NULL
);
INSERT INTO orders (customer_id, order_date, total_amount) VALUES (1, '2023-01-15', 150.50);
INSERT INTO orders (customer_id, order_date, total_amount) VALUES (2, '2023-02-20', 200.00);
INSERT INTO orders (customer_id, order_date, total_amount) VALUES (3, '2023-03-25', 320.75);
INSERT INTO orders (customer_id, order_date, total_amount) VALUES (4, '2023-04-10', 450.00);
INSERT INTO orders (customer_id, order_date, total_amount) VALUES (5, '2023-05-05', 500.25);
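Once rows like these are captured, later updates and deletes on the orders table are expected to be merged into the Iceberg table by primary key, because the sink is configured with upsert=true and upsert-keep-deletes=true. The Python sketch below models that merge under stated assumptions: apply_upsert is a hypothetical helper, the change rows are invented for illustration (only a subset of columns is shown, and the first two rows are assumed to be captured as inserts), and the real sink writes Iceberg data files rather than updating a dictionary.

```python
def apply_upsert(target, rows, key="order_id", keep_deletes=True):
    """Merge flattened change rows into target, a dict keyed by primary key."""
    for row in rows:
        if row["__deleted"] == "true" and not keep_deletes:
            target.pop(row[key], None)  # hard delete: the row disappears
        else:
            target[row[key]] = row  # insert, update, or soft delete
    return target


# Hypothetical change rows: two inserts, then an update and a delete
rows = [
    {"order_id": 1, "total_amount": "150.50", "__op": "c", "__deleted": "false"},
    {"order_id": 2, "total_amount": "200.00", "__op": "c", "__deleted": "false"},
    {"order_id": 1, "total_amount": "175.00", "__op": "u", "__deleted": "false"},
    {"order_id": 2, "total_amount": "200.00", "__op": "d", "__deleted": "true"},
]

table = apply_upsert({}, rows)
for order_id in sorted(table):
    print(order_id, table[order_id]["total_amount"], table[order_id]["__deleted"])
```

In this model, order 1 ends up with its updated amount and order 2 stays in the table flagged as deleted. Calling apply_upsert with keep_deletes=False would instead drop order 2, which corresponds to setting upsert-keep-deletes=false.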