Hello,
I think I encountered a bug while evaluating the Google Cloud BigTable Sink Connector in a Docker Compose setup based on cp-all-in-one-community/docker-compose.yml at commit dd2eb847f183e65b6a1cca4045649ba0d48cf51d of confluentinc/cp-all-in-one on GitHub.
In keys, byte array fields are serialized improperly when they appear as a field of a nested Struct. From my reading of the Kafka Connect source code, the cause looks like an overly optimistic call to Struct#toString() when building the row key.
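For illustration, here is a minimal sketch of the symptom using the public Kafka Connect data API; it reproduces the toString() behavior, though it is not necessarily the connector's exact code path:

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;

public class StructToStringDemo {
    public static void main(String[] args) {
        Schema inner = SchemaBuilder.struct()
                .field("b", Schema.BYTES_SCHEMA)
                .build();
        Struct struct = new Struct(inner)
                .put("b", new byte[] {0x78, 0x44, 0x0A}); // "eEQK" base64-decoded
        // Struct#toString() appends the field value directly, so a byte[] falls
        // back to Object#toString() and prints its identity hash, not its bytes:
        System.out.println(struct); // e.g. Struct{b=[B@7c0d7049}
    }
}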
Connector config:
{
  "name": "confluent_nothing",
  "config": {
    "connector.class": "io.confluent.connect.gcp.bigtable.BigtableSinkConnector",
    "row.key.definition": "",
    "table.name.format": "kafka_nothing",
    "confluent.topic.bootstrap.servers": "kafka:29092",
    "gcp.bigtable.credentials.path": "/gcp_key.json",
    "tasks.max": "1",
    "topics": "topic_confluent_nothing",
    "gcp.bigtable.project.id": "unoperate-test",
    "confluent.license": "",
    "row.key.delimiter": "#",
    "confluent.topic.replication.factor": "1",
    "name": "confluent_nothing",
    "gcp.bigtable.instance.id": "prawilny-dataflow",
    "auto.create.tables": "true",
    "auto.create.column.families": "true",
    "insert.mode": "upsert"
  },
  "tasks": [
    {
      "connector": "confluent_nothing",
      "task": 0
    }
  ],
  "type": "sink"
}
Key and value converters are org.apache.kafka.connect.json.JsonConverter with schemas enabled (schemas.enable=true).
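As a sketch (assuming those worker converter settings), this is how the key shown below deserializes into a Connect Struct whose "b" field becomes a raw byte[]:

import org.apache.kafka.connect.data.SchemaAndValue;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.json.JsonConverter;

import java.nio.charset.StandardCharsets;
import java.util.Map;

public class KeyRoundTrip {
    public static void main(String[] args) {
        JsonConverter converter = new JsonConverter();
        converter.configure(Map.of("schemas.enable", "true"), true); // true = key converter

        // The exact key JSON shown below:
        String keyJson = "{\"schema\":{\"type\":\"struct\",\"fields\":[{\"type\":\"struct\","
                + "\"fields\":[{\"type\":\"bytes\",\"optional\":false,\"field\":\"b\"}],"
                + "\"optional\":false,\"field\":\"a\"}],\"optional\":false,\"name\":\"record\"},"
                + "\"payload\":{\"a\":{\"b\":\"eEQK\"}}}";

        SchemaAndValue key = converter.toConnectData("topic_confluent_nothing",
                keyJson.getBytes(StandardCharsets.UTF_8));
        Struct outer = (Struct) key.value();
        Struct inner = (Struct) outer.get("a");
        // Stringifying the nested struct yields the broken row key seen in Bigtable:
        System.out.println(inner); // Struct{b=[B@<hash>}
    }
}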
Key:
{
  "schema": {
    "type": "struct",
    "fields": [
      {
        "type": "struct",
        "fields": [
          {
            "type": "bytes",
            "optional": false,
            "field": "b"
          }
        ],
        "optional": false,
        "field": "a"
      }
    ],
    "optional": false,
    "name": "record"
  },
  "payload": {
    "a": {
      "b": "eEQK"
    }
  }
}
Value:
{
  "schema": {
    "type": "int64",
    "optional": false,
    "name": "record"
  },
  "payload": 1
}
cbt read output:
----------------------------------------
Struct{b=[B@7c0d7049}
topic_confluent_nothing:KAFKA_VALUE @ 2025/01/17-16:17:14.768000
"\x00\x00\x00\x00\x00\x00\x00\x01"
The row key, by contrast, is the literal Struct#toString() rendering, with the byte array printed as [B@7c0d7049 (Java's default Object#toString() for arrays) rather than its raw bytes. I can provide a reproducer in the form of a docker-compose.yml and related scripts if needed.