How to merge this query

Can someone help with PromQL?

Query A: k8s_pod_phase{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} == 2

Query B: sum(k8s_container_cpu_request{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."})

Query C: sum(k8s_node_allocatable_cpu{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."})

CPU reservation % = B/C * 100

But Query B pulls data for all pods, i.e. Running, Pending, or Failed. Because of this, the calculation goes above 100%.
In Query A, the value 2 corresponds to Running pods only.

How can I combine Query A and Query B to get k8s_container_cpu_request only for Running pods?

You can use the and operator. Combined with the on() modifier where needed, it filters one vector by the label sets of another: A and B returns the elements of A whose label sets have an exact match in B.

In your case it would be something like the following if k8s_container_cpu_request and k8s_pod_phase have the same labelset:

sum(k8s_container_cpu_request{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} and (k8s_pod_phase{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} == 2))
/
sum(k8s_node_allocatable_cpu{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."})
* 100

If you want to match on a specific label only, you can use and on (label_x).
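
As a rough illustration of the on() form, here is a sketch that matches on a single label. The label name k8s_pod_name is an assumption, since the thread does not show which labels these metrics carry; substitute whatever label both series actually share.

```promql
# Keep only the CPU-request series whose k8s_pod_name (assumed label)
# also appears on a k8s_pod_phase series with value 2 (Running).
k8s_container_cpu_request{k8s_cluster_name=~"$cluster_name"}
  and on (k8s_pod_name)
(k8s_pod_phase{k8s_cluster_name=~"$cluster_name"} == 2)
```

With on (...), only the listed labels are compared; every other label on either side is ignored for matching.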

PS: If you are using Grafana variables, you should use the =~ operator, which is required when the variable can hold more than one value.
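
A minimal sketch of the difference, assuming $cluster_name is a multi-value Grafana variable (for Prometheus data sources, Grafana expands such a variable to a regex alternation). These are two separate queries, not one expression:

```promql
# Exact match: only works when the variable expands to one literal value.
k8s_pod_phase{k8s_cluster_name="$cluster_name"}

# Regex match: also works when the variable expands to several values.
k8s_pod_phase{k8s_cluster_name=~"$cluster_name"}
```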

I tried the query above, but it returns no data!

Did you check whether the individual queries return results by themselves?

From what I can see there are potential issues in:

  • Filtering by cluster name. You seem to be using a variable, but with = instead of =~.
  • The node filter regex looks odd.

If the 3 separate queries (k8s_container_cpu_request, k8s_pod_phase, k8s_node_allocatable_cpu) return values on their own, then you need to make sure that the two result sets you want to and together have the same label set. If they do not, you need to add and on (<common label to join on>).
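
One way to check this, sketched with the metrics from the thread: run each side as its own query in a table view and compare the label names, then count how many series survive the and. The selectors are copied from the question, with =~ used for the cluster variable; these are three separate queries.

```promql
# Query 1: series on the left-hand side (inspect their labels).
k8s_container_cpu_request{k8s_cluster_name=~"$cluster_name", k8s_node_name=~".vm."}

# Query 2: series on the right-hand side (inspect their labels).
k8s_pod_phase{k8s_cluster_name=~"$cluster_name", k8s_node_name=~".vm."} == 2

# Query 3: number of left-hand series that survive the match.
# An empty result means no label sets matched exactly.
count(
  k8s_container_cpu_request{k8s_cluster_name=~"$cluster_name", k8s_node_name=~".vm."}
  and
  (k8s_pod_phase{k8s_cluster_name=~"$cluster_name", k8s_node_name=~".vm."} == 2)
)
```

If Query 1 shows a label that Query 2 lacks (a container name, for example), a plain and cannot match, and an on (...) clause listing only the shared labels is needed.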

Yes, the individual queries return values. I’m using this query to get a specific value:

sum(k8s_container_cpu_request{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} and on(k8s_node_name) k8s_pod_phase{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} == 2)

The output is 438, but the total allocatable CPU for those servers is 432.

Query C: sum(k8s_node_allocatable_cpu{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."})
Output: 432

I assume you checked that the time series summed to give 438 are correctly filtered. I think you may not be using the right label for the and. As far as I can tell, you are still summing every k8s_container_cpu_request time series on a node as long as at least one k8s_pod_phase series with that k8s_node_name has the value 2.

There needs to be something more specific to link the time-series in k8s_container_cpu_request and k8s_pod_phase than the node name. Isn’t there a pod name or something like that?
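
To make that suggestion concrete, here is a sketch of the reservation query matched at pod level instead of node level. The label names k8s_namespace_name and k8s_pod_name are assumptions (the thread does not list the labels on these metrics); use whichever labels identify a pod in both series.

```promql
sum(
  k8s_container_cpu_request{k8s_cluster_name=~"$cluster_name", k8s_node_name=~".vm."}
  # Match on pod identity (assumed labels), not on the node.
  and on (k8s_namespace_name, k8s_pod_name)
  (k8s_pod_phase{k8s_cluster_name=~"$cluster_name", k8s_node_name=~".vm."} == 2)
)
/
sum(k8s_node_allocatable_cpu{k8s_cluster_name=~"$cluster_name", k8s_node_name=~".vm."})
* 100
```

Because and is a set operator, several container series belonging to the same pod can all match one k8s_pod_phase series, so no group_left is needed here.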

It got resolved when I filtered on the following instead of k8s_pod_phase:

k8s_container_ready{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} == 1

The full query is now:

sum(k8s_container_cpu_request{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} and k8s_container_ready{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."} == 1) / sum(k8s_node_allocatable_cpu{k8s_cluster_name="$cluster_name", k8s_node_name=~".vm."}) * 100