Hi,
We are currently exploring Kafka Connect to read data from Kafka (specifically, the Azure Event Hubs Kafka endpoint) and write it to Iceberg tables, using Polaris as the REST catalog and Azure as the object store.
I created a catalog in Polaris using the following payload:
{
  "catalog": {
    "type": "INTERNAL",
    "name": "catalogName",
    "properties": {
      "default-base-location": "abfss://container@storage-account.dfs.core.windows.net"
    },
    "storageConfigInfo": {
      "storageType": "AZURE",
      "tenantId": "xxxxxxxxxxxxx",
      "allowedLocations": ["abfss://container@storage-account.dfs.core.windows.net/warehouse"]
    }
  }
}
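For reference, this is roughly how I send that payload to the Polaris management API. This is a sketch: the host, port, and bearer token are placeholders, and the management endpoint path is my assumption about the default Polaris deployment.

```shell
# Sketch: create the catalog via the Polaris management API.
# POLARIS_URL and TOKEN are placeholders; adjust to your deployment.
POLARIS_URL="http://apache-polaris:8181"
TOKEN="xxxxxx"
PAYLOAD='{"catalog":{"type":"INTERNAL","name":"catalogName","properties":{"default-base-location":"abfss://container@storage-account.dfs.core.windows.net"},"storageConfigInfo":{"storageType":"AZURE","tenantId":"xxxxxxxxxxxxx","allowedLocations":["abfss://container@storage-account.dfs.core.windows.net/warehouse"]}}}'

# Validate the payload locally before sending it.
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload ok"

# Requires a running Polaris instance, so shown commented out:
# curl -s -X POST "$POLARIS_URL/api/management/v1/catalogs" \
#   -H "Authorization: Bearer $TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "$PAYLOAD"
```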
Next, I created a namespace in Polaris and set up a sink connector in Kafka Connect with the following configuration:
{
  "connector.class": "io.tabular.iceberg.connect.IcebergSinkConnector",
  "tasks.max": "2",
  "topics": "metric",
  "iceberg.tables": "feed.replay-messages",
  "iceberg.tables.auto-create-enabled": "true",
  "iceberg.tables.schema-force-optional": "true",
  "iceberg.catalog.type": "rest",
  "iceberg.catalog.uri": "http://apache-polaris:8181/api/catalog",
  "iceberg.catalog.io-impl": "org.apache.iceberg.azure.adlsv2.ADLSFileIO",
  "iceberg.catalog.include-credentials": "true",
  "iceberg.catalog.warehouse": "catalogName",
  "iceberg.catalog.token": "xxxxxx",
  "name": "sink-feed",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "key.converter.schemas.enable": "false",
  "value.converter.schemas.enable": "false"
}
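I apply this configuration through the standard Kafka Connect REST API (`PUT /connectors/<name>/config`). A sketch, with the Connect host as a placeholder:

```shell
# Sketch: register/update the sink via the Kafka Connect REST API.
CONNECT_URL="http://kafka-connect:8083"   # placeholder host/port
CONFIG='{
  "connector.class": "io.tabular.iceberg.connect.IcebergSinkConnector",
  "tasks.max": "2",
  "topics": "metric",
  "iceberg.tables": "feed.replay-messages",
  "iceberg.tables.auto-create-enabled": "true",
  "iceberg.tables.schema-force-optional": "true",
  "iceberg.catalog.type": "rest",
  "iceberg.catalog.uri": "http://apache-polaris:8181/api/catalog",
  "iceberg.catalog.io-impl": "org.apache.iceberg.azure.adlsv2.ADLSFileIO",
  "iceberg.catalog.include-credentials": "true",
  "iceberg.catalog.warehouse": "catalogName",
  "iceberg.catalog.token": "xxxxxx",
  "name": "sink-feed",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "key.converter.schemas.enable": "false",
  "value.converter.schemas.enable": "false"
}'

# Validate the config locally before sending it.
echo "$CONFIG" | python3 -m json.tool > /dev/null && echo "config ok"

# Requires a running Connect cluster, so shown commented out:
# curl -s -X PUT "$CONNECT_URL/connectors/sink-feed/config" \
#   -H "Content-Type: application/json" \
#   -d "$CONFIG"
```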
I also set the environment variables AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_CLIENT_SECRET in both the Kafka Connect and Polaris pods.
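These are the standard variables picked up by Azure's default credential chain. For anyone reproducing this, a minimal sketch (the values are placeholders; in Kubernetes they would typically come from a Secret mounted into both pods):

```shell
# Placeholder Azure service-principal credentials.
export AZURE_CLIENT_ID="<client-id>"
export AZURE_TENANT_ID="<tenant-id>"
export AZURE_CLIENT_SECRET="<client-secret>"

# Quick check that all three are set.
[ -n "$AZURE_CLIENT_ID" ] && [ -n "$AZURE_TENANT_ID" ] && [ -n "$AZURE_CLIENT_SECRET" ] && echo "credentials set"
```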
The table gets created successfully, and I can see both the metadata and data folders. The metadata folder contains a JSON file, and the data folder has a few Parquet files.
However, when I attempt to read the table from Trino (after setting up the catalog with the correct configuration), I only see the column names, but no rows are returned. I’ve checked the Parquet files in Azure Blob Storage, and they do contain data.
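Since Trino only reads data files referenced by the table's current snapshot, I suspect the Parquet files exist but no snapshot commits them. These are the metadata-table queries I would run from the Trino CLI to check that (sketch only; `iceberg` is an assumed Trino catalog name, and the table part is quoted because of the hyphen):

```shell
# Requires a live Trino cluster, so shown commented out:
# Check whether any snapshots have been committed for the table:
# trino --execute 'SELECT snapshot_id, committed_at, operation FROM iceberg.feed."replay-messages$snapshots"'
# List the data files the current snapshot actually references:
# trino --execute 'SELECT file_path, record_count FROM iceberg.feed."replay-messages$files"'
```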
Could someone help me resolve this issue?