Add visual presentation of metrics in Cluster Status widget

Description

There is a new part in the /cluster-status endpoint response that may be worth presenting in the UI.

It appears inside each node, for example:

The meaning of each field is as follows:

  • last_check is a timestamp formatted as '%Y-%m-%dT%H:%M:%S.%fZ',

  • healthy is a boolean, the result of comparing the response from Prometheus with "1",

  • metric_type is the metric requested from Prometheus, or "unknown" if we don't have that information,

  • metric_name is the name of the exporter the metric comes from; it is always present.
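
As a rough illustration of the fields listed above, here is a minimal Python sketch that normalizes one metric entry for display. The helper name `parse_metric`, the sample values, and the treatment of the trailing "Z" as UTC are assumptions made for this example; only the four field names and the timestamp format come from the description.

```python
from datetime import datetime, timezone

# Timestamp format quoted in the description.
LAST_CHECK_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_metric(metric: dict) -> dict:
    """Normalize one metric entry (hypothetical helper, not part of any API)."""
    last_check = datetime.strptime(metric["last_check"], LAST_CHECK_FORMAT)
    return {
        # metric_name is described as always present
        "name": metric["metric_name"],
        # fall back to "unknown" if the key is missing (assumption)
        "type": metric.get("metric_type", "unknown"),
        "healthy": bool(metric["healthy"]),
        # "Z" is a literal in the format, so UTC is attached explicitly
        # (assumption: the timestamp is in UTC)
        "last_check": last_check.replace(tzinfo=timezone.utc),
    }


# Hypothetical sample entry; the values are made up for illustration.
sample = {
    "last_check": "2020-10-16T11:50:00.123456Z",
    "healthy": True,
    "metric_type": "unknown",
    "metric_name": "example_exporter",
}

parsed = parse_metric(sample)
print(parsed["name"], parsed["healthy"], parsed["last_check"].isoformat())
```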

Activity

Jakub Niezgoda
October 16, 2020, 11:50 AM

FYI. Maybe it is worth including some of the newly added part of the cluster-status API response in the Cluster Status widget?

Ofer Yarom
October 18, 2020, 6:17 AM

Thanks for this.
I assume that the healthy state is what we are sharing as part of the services list in the system health. The rest of the items are very specific, and I believe that they are available as part of the raw data JSON that the user can get.
If both assumptions above are accurate, then at this point I don't think there is much value in extracting more data to the UI.

Jakub Niezgoda
October 20, 2020, 12:18 PM

I assume that the healthy state is what we are sharing as part of the services list in the system health.

Can you confirm that the healthy field in metrics is mapped to status one level above (on the node level)?

Also, as metrics are provided as an array of objects, I assume that we can have more than one metric there.

the rest of the items are very specific, and I believe that they are available as part of the raw data JSON that the user can get.

Just checked that and the answer is no. The “Copy Raw Info” button provides an object with service details (the content of the services node), something like:

metrics are defined on the same level as services so they are not included in the raw output.
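
To make the point about levels concrete, here is a small Python sketch. The node layout, the keys inside `services`, and both helper functions are hypothetical and only illustrate that `services` and `metrics` are sibling keys; this is not the widget's actual code.

```python
# Hypothetical node object: "services" and "metrics" are siblings.
node = {
    "services": {"example-service": {"status": "Active"}},
    "metrics": [{"metric_name": "example_exporter", "healthy": True}],
}


def raw_info_current(node: dict) -> dict:
    """Mimics the described behavior: only the content of the services node."""
    return node.get("services", {})


def raw_info_with_metrics(node: dict) -> dict:
    """One possible alternative (assumption): expose both sibling keys."""
    return {
        "services": node.get("services", {}),
        "metrics": node.get("metrics", []),
    }


print(raw_info_current(node))
print(raw_info_with_metrics(node))
```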

Anyways, I’m OK with closing this one if you see no additional value in that data.

Mateusz Neumann
October 20, 2020, 12:40 PM

Can you confirm that the healthy field in metrics is mapped to status one level above (on the node level)?

Yes, it is. If any of the metrics is not healthy, the node is marked as Failed. See https://github.com/cloudify-cosmo/cloudify-manager/blob/cf9d3fa449cfe605a66fe6f5c9aee898de388952/rest-service/manager_rest/cluster_status_manager.py#L409-L411
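
The rule described here can be sketched in a few lines of Python. This is not the code behind the link above; the function name `node_status` and the status labels "OK" and "Fail" are placeholders chosen for this example.

```python
def node_status(metrics: list) -> str:
    """Return a failing status if any metric is not healthy.

    Assumptions: a metric without a "healthy" key counts as unhealthy,
    and an empty metrics list counts as healthy.
    """
    return "OK" if all(m.get("healthy") for m in metrics) else "Fail"


# Hypothetical metric lists for illustration.
all_good = [{"metric_name": "a", "healthy": True},
            {"metric_name": "b", "healthy": True}]
one_bad = [{"metric_name": "a", "healthy": True},
           {"metric_name": "b", "healthy": False}]

print(node_status(all_good))
print(node_status(one_bad))
```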

Also, as metrics are provided as an array of objects, I assume that we can have more than one metric there.

That is correct. Especially in the case of managers, we check a number of endpoints, so the number of metrics is about 5 or 6 (jobs http_200 and http_401).

Assignee

Unassigned

Reporter

Jakub Niezgoda

Labels

Target Version

future

QA Owner

None

Premium Only

no

Documentation Required

None

Why Blocked?

None

Release Notes

yes

Priority

None

Epic Link

Priority

Unprioritized