# Cluster Metrics

> Visualize real-time CPU, memory, network, and disk metrics for your Kubernetes clusters.

<Note>
  The Cluster Metrics page visualizes your cluster's resource usage in real time. It requires a connected Prometheus data source.
</Note>

## Overview

Ankra's metrics visualization helps you understand your cluster's health and resource consumption at a glance. View CPU usage, memory consumption, network throughput, disk I/O, and pod restart patterns, all from a single dashboard.

***

## Accessing Cluster Metrics

1. Navigate to your cluster from the sidebar.
2. Click **Metrics** in the cluster navigation.
3. View real-time charts and summary cards.

<Info>
  Metrics require a configured Prometheus data source. See [Prometheus Integration](/integrations/prometheus) or [Cluster Settings](/platform/cluster-settings) to configure your data source.
</Info>

***

## Summary Cards

At the top of the metrics page, summary cards show current resource utilization:

| Card             | Description                                      |
| ---------------- | ------------------------------------------------ |
| **CPU Usage**    | Current CPU utilization vs total capacity        |
| **Memory Usage** | Current memory consumption vs total capacity     |
| **Network Rate** | Combined inbound and outbound network throughput |
| **Pod Status**   | Running pods vs total pod count                  |

These cards provide instant visibility into cluster health without scrolling through detailed charts.
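
As a rough illustration, values like the ones on these cards could be derived from PromQL expressions such as those below. This is a sketch, not the queries Ankra actually runs: the metric names assume `node-exporter` and `kube-state-metrics` are being scraped, and `SUMMARY_QUERIES` and `format_percent` are hypothetical names introduced for this example.

```python
# Hypothetical PromQL expressions for the four summary cards.
# Metric names come from node-exporter and kube-state-metrics;
# the queries Ankra actually runs may differ.
SUMMARY_QUERIES = {
    # Fraction of CPU time not spent idle, averaged over all cores.
    "cpu_usage": '1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m]))',
    # Fraction of total memory that is not available.
    "memory_usage": (
        "1 - sum(node_memory_MemAvailable_bytes)"
        " / sum(node_memory_MemTotal_bytes)"
    ),
    # Inbound plus outbound bytes per second, excluding loopback.
    "network_rate": (
        'sum(rate(node_network_receive_bytes_total{device!="lo"}[5m]))'
        ' + sum(rate(node_network_transmit_bytes_total{device!="lo"}[5m]))'
    ),
    # kube_pod_status_phase is 1 for a pod's current phase and 0 otherwise.
    "pods_running": 'sum(kube_pod_status_phase{phase="Running"})',
    "pods_total": "sum(kube_pod_status_phase)",
}


def format_percent(fraction):
    """Render a 0..1 fraction as a percentage string for display."""
    return f"{fraction * 100:.1f}%"
```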

***

## Metrics Charts

### CPU Usage

Visualize CPU consumption across your cluster nodes over time.

**What it shows:**

* Per-node CPU usage lines
* Total cluster CPU trend
* Usage patterns over the selected time range

**Use cases:**

* Identify nodes with consistently high CPU usage
* Spot CPU spikes that correlate with deployments or traffic
* Plan capacity based on usage trends
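
To make the idea of "consistently high CPU usage" concrete, here is a minimal sketch of how a per-node usage fraction can be computed from an idle-time counter and how nodes above a threshold could be flagged. The function names and the 80% threshold are assumptions for illustration, not values taken from Ankra.

```python
def cpu_usage_fraction(idle_start, idle_end, elapsed_seconds, cores):
    """Fraction of CPU capacity used on one node over a sampling interval.

    idle_start / idle_end are idle CPU seconds summed across all cores,
    the kind of counter node-exporter exposes as
    node_cpu_seconds_total{mode="idle"}.
    """
    if elapsed_seconds <= 0 or cores <= 0:
        raise ValueError("elapsed_seconds and cores must be positive")
    capacity = elapsed_seconds * cores
    idle = idle_end - idle_start
    # Clamp to [0, 1] to absorb small timing jitter between samples.
    return min(1.0, max(0.0, 1.0 - idle / capacity))


def hot_nodes(usage_by_node, threshold=0.8):
    """Names of nodes whose usage is at or above threshold in every sample."""
    return sorted(
        node
        for node, samples in usage_by_node.items()
        if samples and all(u >= threshold for u in samples)
    )
```

For example, a 4-core node whose idle counter advances by 120 seconds over a 60-second interval used half of its capacity.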

### Memory Usage

Track memory consumption across your cluster.

**What it shows:**

* Per-node memory usage
* Memory pressure indicators
* Historical memory trends

**Use cases:**

* Detect memory leaks in applications
* Identify nodes approaching memory limits
* Plan memory allocation for new workloads
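
One way to reason about a suspected memory leak is to fit a straight line through memory samples and look at the slope. The sketch below does this with an ordinary least-squares fit. It is an illustration only: `memory_growth_rate` and `seconds_until_limit` are hypothetical helpers, and a steady positive slope suggests a leak but does not prove one.

```python
def memory_growth_rate(samples):
    """Least-squares slope of (timestamp_seconds, used_bytes) samples.

    Returns growth in bytes per second.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        raise ValueError("timestamps must not all be equal")
    return sum((t - mean_t) * (v - mean_v) for t, v in samples) / denom


def seconds_until_limit(samples, limit_bytes):
    """Projected seconds until usage reaches limit_bytes, or None if flat or shrinking."""
    slope = memory_growth_rate(samples)
    if slope <= 0:
        return None
    latest = samples[-1][1]
    return max(0.0, (limit_bytes - latest) / slope)
```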

### Network I/O

Monitor network traffic flowing in and out of your cluster.

**What it shows:**

* **Receive**: Inbound network traffic per node
* **Transmit**: Outbound network traffic per node
* Throughput rates in bytes/second

**Use cases:**

* Identify network-intensive workloads
* Detect unusual traffic patterns
* Track data transfer volume that may contribute to egress costs
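
Throughput in bytes/second is typically derived from monotonically increasing byte counters. The sketch below shows the basic arithmetic, including handling of a counter reset. It is a simplification: Prometheus's own `rate()` also extrapolates to the window boundaries, and `counter_rate` is a hypothetical helper name.

```python
def counter_rate(samples):
    """Average per-second rate from (timestamp_seconds, counter_value) samples.

    A drop in value is treated as a counter reset (for example after a
    node reboot), so the post-reset value is counted as new traffic.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += curr - prev if curr >= prev else curr
    return increase / elapsed
```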

### Disk I/O

Track disk read and write operations across nodes.

**What it shows:**

* Disk read throughput
* Disk write throughput
* I/O patterns per node

**Use cases:**

* Identify storage bottlenecks
* Monitor database disk activity
* Plan storage capacity
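
For reference, per-node disk throughput is commonly queried from the `node-exporter` disk counters. The helper below builds two such PromQL expressions. Treat it as a sketch under assumptions: `disk_throughput_queries` is a hypothetical name, the 5-minute window is arbitrary, and the queries Ankra uses may differ.

```python
def disk_throughput_queries(window="5m"):
    """PromQL expressions for per-node disk read and write bytes/second.

    Grouping is by the `instance` label, which identifies the scraped
    node-exporter target and is not necessarily the Kubernetes node name.
    """
    return {
        "read": f"sum by (instance) (rate(node_disk_read_bytes_total[{window}]))",
        "write": f"sum by (instance) (rate(node_disk_written_bytes_total[{window}]))",
    }
```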

### Pod Restarts

Monitor pod restart frequency to detect stability issues.

**What it shows:**

* Pod restart counts over time
* Restart patterns by namespace or workload
* Correlation with other events

**Use cases:**

* Detect crashlooping pods
* Identify unstable deployments
* Troubleshoot OOMKilled containers
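
A simple way to think about crashloop detection is to count how many restarts each pod accumulated within the selected window. The sketch below assumes restart counter samples such as those from `kube_pod_container_status_restarts_total` (a `kube-state-metrics` metric). The helper names and the threshold of 3 are illustrative assumptions.

```python
def restart_increase(counter_values):
    """Restarts added across a window of counter samples.

    Negative steps (counter resets) are ignored rather than subtracted.
    """
    return sum(
        max(0, curr - prev)
        for prev, curr in zip(counter_values, counter_values[1:])
    )


def crashloop_candidates(restarts_by_pod, threshold=3):
    """Pods with at least `threshold` restarts in the window, most restarts first."""
    increases = {
        pod: restart_increase(values)
        for pod, values in restarts_by_pod.items()
    }
    flagged = [pod for pod, inc in increases.items() if inc >= threshold]
    return sorted(flagged, key=lambda pod: (-increases[pod], pod))
```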

***

## Time Range Selection

Control the time window for metrics data:

| Range               | Use Case               |
| ------------------- | ---------------------- |
| **Last 15 minutes** | Real-time monitoring   |
| **Last hour**       | Recent activity review |
| **Last 6 hours**    | Shift-based monitoring |
| **Last 24 hours**   | Daily patterns         |
| **Last 7 days**     | Weekly trends          |
| **Custom**          | Specific time windows  |

### Changing Time Range

1. Click the **Time Range** picker in the top-right corner.
2. Select a preset range or define a custom window.
3. Charts automatically update to show the selected period.
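
Under the hood, a time range maps naturally onto the `start`, `end`, and `step` parameters of the Prometheus `/api/v1/query_range` endpoint. The sketch below shows one way to derive them from a preset. The `max_points` and `min_step` defaults and the `range_query_url` helper are assumptions for illustration, not Ankra's actual behavior.

```python
import math
from urllib.parse import urlencode

# Preset labels from the table above, expressed in seconds.
PRESET_SECONDS = {
    "Last 15 minutes": 15 * 60,
    "Last hour": 60 * 60,
    "Last 6 hours": 6 * 60 * 60,
    "Last 24 hours": 24 * 60 * 60,
    "Last 7 days": 7 * 24 * 60 * 60,
}


def range_query_url(base_url, query, preset, now, max_points=300, min_step=15):
    """Build a query_range URL covering `preset`, ending at Unix time `now`.

    The step grows with the range so that each series returns at most
    about `max_points` samples, and never drops below `min_step` seconds.
    """
    span = PRESET_SECONDS[preset]
    step = max(min_step, math.ceil(span / max_points))
    params = {"query": query, "start": now - span, "end": now, "step": step}
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"
```

Scaling the step with the range keeps long windows such as 7 days from returning far more points than a chart can display.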

***

## Auto-Refresh

Keep metrics up to date with automatic refresh:

| Interval       | Description             |
| -------------- | ----------------------- |
| **Off**        | Manual refresh only     |
| **10 seconds** | Near real-time updates  |
| **30 seconds** | Balanced refresh rate   |
| **1 minute**   | Low-overhead monitoring |
| **5 minutes**  | Background monitoring   |

### Manual Refresh

Click the **Refresh** button at any time to fetch the latest data immediately.
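
Conceptually, auto-refresh is a polling loop: fetch once, then fetch again after each interval, with **Off** meaning only the initial or manual fetch happens. The sketch below models that behavior. `run_refresh` and its parameters are hypothetical, and the `sleep` argument is injectable so the loop can be exercised without waiting.

```python
import time

# Interval labels from the table above, in seconds. None means "Off".
REFRESH_INTERVALS = {
    "Off": None,
    "10 seconds": 10,
    "30 seconds": 30,
    "1 minute": 60,
    "5 minutes": 300,
}


def run_refresh(fetch, interval_label, max_refreshes, sleep=time.sleep):
    """Call fetch() once, then up to max_refreshes more times on the interval.

    Returns the list of values fetch() produced, in order.
    """
    interval = REFRESH_INTERVALS[interval_label]
    results = [fetch()]  # initial load, equivalent to one manual refresh
    if interval is None:
        return results  # "Off": nothing further happens automatically
    for _ in range(max_refreshes):
        sleep(interval)
        results.append(fetch())
    return results
```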

***

## Prometheus Configuration

Metrics require a connected Prometheus instance.

### Setting Up Prometheus

<Steps>
  <Step title="Install Prometheus">
    Deploy Prometheus to your cluster by adding the `kube-prometheus-stack` add-on to a stack, or connect to an existing Prometheus instance.
  </Step>

  <Step title="Configure Data Source">
    Go to cluster **Settings** → **Metrics** and enter your Prometheus URL.
  </Step>

  <Step title="Test Connection">
    Verify that the connection succeeds before saving.
  </Step>

  <Step title="View Metrics">
    Navigate to the **Metrics** page to see your cluster data.
  </Step>
</Steps>

### Default Prometheus URL

If you deployed `kube-prometheus-stack` via Ankra, the default in-cluster URL is:

```
http://kube-prometheus-stack-prometheus.prometheus.svc:9090
```
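
If you want to check this URL yourself, Prometheus exposes a `/-/healthy` endpoint. The sketch below probes it with the Python standard library. Note that a `.svc` hostname only resolves inside the cluster, so this assumes the script runs in a pod or that you substitute a port-forwarded address. `health_url` and `is_healthy` are hypothetical helper names.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Default URL from this page; only resolvable from inside the cluster.
DEFAULT_URL = "http://kube-prometheus-stack-prometheus.prometheus.svc:9090"


def health_url(base_url=DEFAULT_URL):
    """URL of the Prometheus health endpoint for a given base URL."""
    return base_url.rstrip("/") + "/-/healthy"


def is_healthy(base_url=DEFAULT_URL, timeout=5):
    """True if the health endpoint answers with HTTP 200 within the timeout."""
    try:
        with urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP errors.
        return False
```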

***

## Troubleshooting

### "Prometheus Not Configured"

**Cause:** No Prometheus data source has been set up.

**Solution:**

1. Go to cluster **Settings** → **Metrics**.
2. Configure your Prometheus URL.
3. Return to the Metrics page.

### "Unable to Load Metrics"

**Cause:** Connection to Prometheus failed.

**Solutions:**

* Verify Prometheus is running in your cluster
* Check that the Prometheus URL is correct
* Ensure network connectivity between the Ankra agent and Prometheus
* Review Prometheus service account permissions

### Missing Data for Some Metrics

**Cause:** Prometheus may not be scraping all required metrics.

**Solutions:**

* Verify `node-exporter` is deployed for node metrics
* Check that `kube-state-metrics` is running for Kubernetes metrics
* Review Prometheus scrape configurations
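
One way to run these checks is to send the instant query `up` to the Prometheus `/api/v1/query` endpoint and look for the expected scrape jobs in the result. The sketch below inspects an already-parsed JSON response. The job names `node-exporter` and `kube-state-metrics` are assumptions based on common `kube-prometheus-stack` defaults and may differ in your setup, and `missing_jobs` is a hypothetical helper.

```python
# Assumed job label values; adjust to match your scrape configuration.
REQUIRED_JOBS = ("node-exporter", "kube-state-metrics")


def missing_jobs(up_response, required=REQUIRED_JOBS):
    """Required jobs that have no target reporting up == 1.

    up_response is the parsed JSON body of an instant query for `up`.
    Prometheus encodes each sample as [timestamp, "value"], with the
    value as a string.
    """
    healthy = {
        series["metric"].get("job")
        for series in up_response.get("data", {}).get("result", [])
        if series["value"][1] == "1"
    }
    return [job for job in required if job not in healthy]
```

An empty list means every required job has at least one healthy target. Any names returned point to the exporter to investigate first.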

***

## Best Practices

<Tip>
  **Use appropriate time ranges**: For real-time debugging, use 15-minute windows. For capacity planning, use 7-day views.
</Tip>

<Tip>
  **Set up alerts**: Combine metrics visualization with [Alerts](/guides/alerts) to get notified when thresholds are exceeded.
</Tip>

<Tip>
  **Monitor during deployments**: Watch metrics during rollouts to catch performance regressions early.
</Tip>

<Tip>
  **Correlate with events**: Use the time range selector to align metrics with known incidents or changes.
</Tip>

***

## Related

<CardGroup cols={2}>
  <Card title="Prometheus Integration" icon="chart-line" href="/integrations/prometheus">
    Set up Prometheus for your cluster.
  </Card>

  <Card title="Alerts" icon="bell" href="/guides/alerts">
    Configure alerts based on metrics.
  </Card>

  <Card title="Cluster Settings" icon="gear" href="/platform/cluster-settings">
    Configure metrics data sources.
  </Card>

  <Card title="AI Troubleshooting" icon="robot" href="/platform/kubernetes-ai-troubleshooting">
    Use AI to analyze performance issues.
  </Card>
</CardGroup>
