Good day.
I'm new to Confluent Cloud and am scraping metrics with the generated Prometheus config, which lists each cluster, connector, or ksqlDB instance by its ID. If any of those objects is removed for some reason, the scrape breaks with a 403 Forbidden error, as the specified object no longer exists.
Is there a way to scrape at the org level and not be dependent on listing each cluster ID?
Or, if an ID is not available, to continue without failing?
Thanks
Trevor
hey @trevorwebster
welcome
could you elaborate a bit more?
how did you configure prometheus?
I guess you are using the metrics api?
best,
michael
This is correct, using the metrics api with the generated config:

scrape_configs:
  - job_name: Confluent Cloud
    scrape_interval: 1m
    scrape_timeout: 1m
    honor_timestamps: true
    static_configs:
      - targets:
          - api.telemetry.confluent.cloud
    scheme: https
    basic_auth:
      username:
      password:
    metrics_path: /v2/metrics/cloud/export
    params:
      "resource.kafka.id":
        - lkc-r
        - lkc-y
        - lkc-d
        - lkc-3
        - lkc-6
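For reference, you can reproduce the failure outside Prometheus by calling the export endpoint directly; the API key/secret and cluster IDs below are placeholders:

curl -s -u "<API_KEY>:<API_SECRET>" \
  "https://api.telemetry.confluent.cloud/v2/metrics/cloud/export?resource.kafka.id=lkc-XXXXX&resource.kafka.id=lkc-YYYYY"

If any one of the listed IDs no longer exists, the whole request comes back with a 403.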
thanks for the input.
need to check for myself whether it’s possible.
will come back once tested
same error here
afaik there is no way to automate this out of the box
one thing that might work is to get the cluster ids via the confluent cli and generate the prometheus.yml by script or similar.
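a minimal sketch of what i mean, assuming the confluent cli is logged in, jq is available, and the file path and reload call match your setup (<API_KEY>/<API_SECRET> are placeholders):

#!/usr/bin/env bash
set -euo pipefail

# collect the current cluster ids via the confluent cli
# (assumes the json output carries an "id" field per cluster)
ids=$(confluent kafka cluster list -o json | jq -r '.[].id')

# write the scrape config, appending one param entry per cluster id
{
  cat <<'EOF'
scrape_configs:
  - job_name: Confluent Cloud
    scrape_interval: 1m
    scrape_timeout: 1m
    honor_timestamps: true
    static_configs:
      - targets:
          - api.telemetry.confluent.cloud
    scheme: https
    basic_auth:
      username: <API_KEY>
      password: <API_SECRET>
    metrics_path: /v2/metrics/cloud/export
    params:
      "resource.kafka.id":
EOF
  for id in $ids; do
    printf '        - %s\n' "$id"
  done
} > /etc/prometheus/prometheus.yml

# tell prometheus to reload its config
# (requires prometheus to run with --web.enable-lifecycle)
curl -s -X POST http://localhost:9090/-/reload

running it from cron or before each deploy would keep the id list in sync.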
will try this tomorrow
best,
michael
Hi Trevor,
I am facing the same issue.
If one "param" (a listed resource ID) is invalid, the whole scrape fails.
And the rate limiting makes it even more complicated. One could always create multiple scrape jobs (which would not make much sense anyway), but then we would just receive a 429 Too Many Requests.
Like you, I have yet to come up with a solution, so I am very eager to hear about your findings.
I was hoping for something like file-based SD, but I can't come up with a way to provide the "resource.kafka.id" part in a file_sd… As far as I can tell, __param_ labels could set URL parameters, but Prometheus label names can't contain dots, so "resource.kafka.id" can't be expressed that way.
Will follow this thread
Best regards
Oelsner