Issue with Kafka Connect (Debezium PostgreSQL) Replication Slots Becoming Inactive

Hello everyone,

I’m facing an issue while streaming data between PostgreSQL databases using Kafka Connect 7.5.0 with the Debezium PostgreSQL connector. The problem is that replication slots enter the extended state and eventually become inactive, even though data streams successfully while a slot is in the extended state.


Environment Details:

  • Kafka Connect Version: 7.5.0
  • Connector Plugin: io.debezium.connector.postgresql.PostgresConnector
  • PostgreSQL Plugin: pgoutput
  • Database Configuration:
    • Multiple PostgreSQL databases hosted on a single VM node
    • Each source connector produces multiple Kafka topics

Observed Issue:

  • Replication slots transition to the extended state within a few hours.
  • After a couple of days, these slots turn inactive.
  • Despite this, data continues to stream while a slot is in the extended state.
  • I suspect this results from inefficient offset management or delayed WAL cleanup due to misconfiguration (see the status query sketched below).
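
For context, the "extended" state here is the wal_status column of pg_replication_slots: on PostgreSQL 13 and later it means the slot is holding more WAL than max_wal_size would normally retain. A minimal status query, assuming PostgreSQL 13+ (the wal_status and safe_wal_size columns do not exist on older versions):

SELECT
  slot_name,
  active,       -- false means no walsender process is attached
  wal_status,   -- reserved / extended / unreserved / lost
  pg_size_pretty(safe_wal_size) AS safe_wal_size  -- NULL when max_slot_wal_keep_size = -1
FROM pg_replication_slots;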

PostgreSQL Settings:

  • max_wal_size = 2GB
  • Using logical replication with pgoutput
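
A related safeguard worth knowing about (PostgreSQL 13+): max_slot_wal_keep_size caps how much WAL a slot may pin, so a stuck slot cannot fill the disk. The trade-off is that a slot pushed past the cap becomes lost, and the connector then needs a new slot and a fresh snapshot. A sketch, with '10GB' as an assumed placeholder value:

-- cap the WAL any replication slot may retain (PostgreSQL 13+); 10GB is a placeholder
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';
SELECT pg_reload_conf();  -- this setting takes effect on reload, no restart needed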

Sample Connector Configuration:

{
  "name": "mdm_source_tables_connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "xx.xx.xx.xxx",
    "database.port": 5432,
    "database.user": "xxx",
    "database.password": "xxxx",
    "database.dbname": "mdmdb",
    "database.server.name": "xx.xx.xx.xxx",
    "table.include.list": "mdm.vas_master,mdm.product_master,…",
    "snapshot": "incremental",
    "snapshot.mode": "initial",
    "plugin.name": "pgoutput",
    "slot.name": "slot_mdm_source_tables_small_trg2",
    "publication.name": "dbz_mdm_small",
    "publication.autocreate.mode": "filtered",
    "topic.prefix": "mdm",
    "database.history.kafka.bootstrap.servers": "localhost:9092",
    "database.history.kafka.topic": "schema-changes.mdm",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://schema-registry:8081",
    "key.converter.enhanced.avro.schema.support": true,
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
    "value.converter.schemas.enable": true,
    "transforms": "unwrap,removeFields",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    "transforms.removeFields.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
    "transforms.removeFields.blacklist": "before,source",
    "include.schema.changes": true,
    "poll.interval.ms": "1000",
    "offset.flush.interval.ms": "5000",
    "heartbeat.interval.ms": "5000",
    "max.batch.size": "4096",
    "max.queue.size": "16384",
    "slot.initialization.mode": "minimal",
    "max.queue.size.in.bytes": "10485760",
    "database.connection.timeout.ms": "30000"
  }
}
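
One point that often matters with several databases on one instance: WAL is shared across all databases, and Debezium only advances the slot's confirmed position when it receives events for the tables it captures. On a quiet database, the slot can therefore pin WAL written by the busier databases. heartbeat.interval.ms alone is not always enough; Debezium's companion setting is heartbeat.action.query, which makes the connector periodically write a small change into the source database so the slot's position keeps advancing. A sketch of the two settings, where mdm.debezium_heartbeat is a hypothetical single-row table (id int primary key, ts timestamptz) you would create yourself:

"heartbeat.interval.ms": "5000",
"heartbeat.action.query": "INSERT INTO mdm.debezium_heartbeat (id, ts) VALUES (1, now()) ON CONFLICT (id) DO UPDATE SET ts = EXCLUDED.ts"

With pgoutput and publication.autocreate.mode set to filtered, the heartbeat table typically also has to appear in table.include.list so that it is added to the publication and its events actually reach, and are acknowledged by, the connector.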


What I Have Tried:

  • Started the connector in always snapshot mode initially.
  • Switched to initial mode once the topics started getting created.
  • Verified the WAL size limits (max_wal_size = 2GB).
  • Monitoring shows the connector in the RUNNING state while the slot is extended, and data streams normally; eventually, after the slot goes inactive, the connector still shows RUNNING but data no longer reaches the Kafka topics (see the query after this list).
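
When a slot shows active = false, no walsender process is attached to it, which matches the "running but not streaming" symptom: the Connect worker can report RUNNING while the task's replication connection is gone. A quick cross-check from the database side, assuming you can query the source instance:

-- one row per attached walsender; an inactive slot has no matching row here
SELECT application_name, state, sent_lsn, replay_lsn
FROM pg_stat_replication;

It is also worth querying the task-level status from the Kafka Connect REST API (GET /connectors/<name>/status), since an individual task can fail while the connector itself still reports RUNNING.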

Questions:

  1. Why are replication slots becoming inactive even though streaming works for a while?
  2. Is there any Debezium/Kafka Connect tuning required to avoid this state?
  3. How can I ensure continuous streaming without hitting WAL overflow or slot inactivity?
  4. Is the connector setup optimal for large table sets across multiple databases?

Any suggestions, best practices, or configuration improvements are greatly appreciated.

Thank you in advance!

hey @sasidhar welcome :slight_smile:

did you check the database logs and activity?

what does the following SQL show you:


SELECT
  slot_name,
  pg_size_pretty(
    pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal,
  active,
  restart_lsn
FROM pg_replication_slots;

best,
michael

-- sessions connected to the instance; look for the Debezium replication connection
SELECT pid AS process_id,
  usename AS username,
  datname AS database_name,
  client_addr AS client_address,
  application_name,
  backend_start,
  state,
  state_change
FROM pg_stat_activity
WHERE usename IS NOT NULL;