MSK Redshift connector

Hello, this is an extended guide to the MSK Redshift connector in AWS, expanding on the instructions provided in the original guide.

The guide consists of the following chapters:

  1. Connector creation process
  2. Listing of all required permissions
  3. Errors you might encounter

1. Connector creation

To avoid duplicating the original article, I recommend that you first familiarize yourself with Confluent’s guide.

1.1 Plugin download

First, go to the plugin download page and download the ZIP archive, then upload it to any location in an S3 bucket so that it is accessible from within the AWS infrastructure.

Note that you should upload the ZIP archive itself, not its extracted contents, because during the connector creation process AWS will ask for the path to the archive.

We assume that you already have both a working MSK cluster and a Redshift instance, so go to the connector creation page and press the “Create connector” button.

If you are doing this for the first time and have no existing plugin instances, select “Create custom plugin” as the custom plugin type and provide the URI of the bucket location where you placed the plugin in step 1.1. Mind that you can get this URI by selecting the file and clicking “Copy S3 URI”.
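As an alternative to the console flow above, the same two steps — uploading the archive and registering it as a custom plugin — can be sketched with the AWS CLI. The bucket, archive, and plugin names below are hypothetical placeholders; substitute your own, and note that the commands require valid AWS credentials:

```shell
# Hypothetical bucket, archive, and plugin names -- replace with your own.
BUCKET="my-plugin-bucket"
ZIP="confluentinc-kafka-connect-aws-redshift.zip"

# Upload the plugin archive to S3 (step 1.1).
aws s3 cp "$ZIP" "s3://$BUCKET/$ZIP"

# Register the archive as an MSK Connect custom plugin.
aws kafkaconnect create-custom-plugin \
  --name redshift-sink-plugin \
  --content-type ZIP \
  --location "{\"s3Location\": {\"bucketArn\": \"arn:aws:s3:::$BUCKET\", \"fileKey\": \"$ZIP\"}}"
```

The `create-custom-plugin` call returns the plugin ARN, which you can then pick from the list on the connector creation page instead of creating the plugin through the console.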

1.2 Connector properties

Note that your connector can theoretically be created before all conditions from the Security section are met. However, this will most likely result in a failed state. We assume that either you are lucky and no additional security configuration is needed, or all required steps have already been performed.

It is also worth mentioning that there is currently no way to grant the connector access to Redshift via IAM and a fine-grained role, so we are forced to provide access using a database account and its credentials.
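Since database credentials have to be passed to the connector, it is a good idea to create a dedicated Redshift user for it rather than reusing the admin account. A rough sketch via psql — the endpoint, database, user names, and password below are all hypothetical placeholders:

```shell
# Connect to the Redshift cluster with psql and create a dedicated user.
# Endpoint, database, user names, and password are hypothetical placeholders.
psql "host=cluster-name.cluster-id.region.redshift.amazonaws.com port=5439 dbname=dev user=awsuser" <<'SQL'
CREATE USER connect_user PASSWORD 'Str0ngPassw0rd1';
-- Let the connector create and write its target tables in the public schema.
GRANT CREATE, USAGE ON SCHEMA public TO connect_user;
SQL
```

You would then put `connect_user` and its password into `aws.redshift.user` and `aws.redshift.password` instead of the admin credentials.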

aws.redshift.domain=< Required Configuration >
aws.redshift.port=< Required Configuration >
aws.redshift.database=< Required Configuration >
aws.redshift.user=< Required Configuration >
aws.redshift.password=< Required Configuration >

The entire list of properties can be accessed here, but a minimal setup will look like this:

    "confluent.topic.bootstrap.servers": "localhost:9092",
    "connector.class": "io.confluent.connect.aws.redshift.RedshiftSinkConnector",
    "tasks.max": "1",
    "topics": "orders",
    "aws.redshift.domain": "cluster-name.cluster-id.region.redshift.amazonaws.com",
    "aws.redshift.port": "5439",
    "aws.redshift.database": "dev",
    "aws.redshift.user": "awsuser",
    "aws.redshift.password": "your-password",
    "auto.create": "true",
    "pk.mode": "kafka"
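Mind that when you paste the configuration into the MSK Connect console, it is expected as key=value lines rather than JSON, so the same minimal setup would look like this (the values are the same placeholders as above):

```
confluent.topic.bootstrap.servers=localhost:9092
connector.class=io.confluent.connect.aws.redshift.RedshiftSinkConnector
tasks.max=1
topics=orders
aws.redshift.domain=cluster-name.cluster-id.region.redshift.amazonaws.com
aws.redshift.port=5439
aws.redshift.database=dev
aws.redshift.user=awsuser
aws.redshift.password=your-password
auto.create=true
pk.mode=kafka
```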

Mind that you should decide which port will be used for communication with the Kafka brokers:

  • To communicate with brokers in plaintext, use port 9092.
  • To communicate with brokers with TLS encryption, use port 9094 for access from within AWS and port 9194 for public access.

That is why your confluent.topic.bootstrap.servers value will consist of the exact broker names, which can be taken from the Brokers section of the cluster’s Properties tab, plus the chosen port:

"confluent.topic.bootstrap.servers": "first.broker.name:9094,other.broker.name:9094",
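The value above can be assembled with a couple of shell variables. The broker hostnames below are hypothetical; copy the real ones from the Brokers section of the cluster’s Properties tab:

```shell
# Hypothetical broker hostnames -- copy the real ones from the Brokers section.
BROKER1="b-1.mycluster.abc123.c2.kafka.us-east-1.amazonaws.com"
BROKER2="b-2.mycluster.abc123.c2.kafka.us-east-1.amazonaws.com"
PORT=9094   # 9092 plaintext, 9094 TLS within AWS, 9194 public TLS

# Comma-separated host:port pairs, no spaces.
BOOTSTRAP_SERVERS="${BROKER1}:${PORT},${BROKER2}:${PORT}"
echo "$BOOTSTRAP_SERVERS"
```

The resulting string goes into the confluent.topic.bootstrap.servers property as-is.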
