Kubernetes Infra Metrics and Logs Collection

Overview

To export Kubernetes metrics, you can enable different receivers in the OpenTelemetry Collector. Deployed as agents in your cluster, these collectors send metrics about your Kubernetes infrastructure to SigNoz.

The OtelCollector agent can also tail and parse logs generated by containers using the filelog receiver and send them to the desired destination.

The K8s-Infra Helm chart mainly does the following:

  • Tails and parses logs generated by containers in the Kubernetes cluster and sends them to SigNoz
  • Collects kubelet metrics and host metrics from each node of the Kubernetes cluster
  • Collects cluster-level metrics from the Kubernetes API server
  • Acts as a gateway to send any incoming OTLP telemetry data to the SigNoz OtelCollector

Depending on how you run SigNoz (for example SigNoz Cloud, an independent VM, or a Kubernetes cluster), you must provide the address to which the above receivers send their data.

Install K8s-Infra chart

To add the SigNoz Helm repository to your helm client, run the following command:

helm repo add signoz https://charts.signoz.io

If the repository is already present, update it to fetch the latest chart versions:

helm repo update

For generic Kubernetes clusters, you can use the following configuration:

override-values.yaml

global:
  cloud: others
  clusterName: <CLUSTER_NAME>
  deploymentEnvironment: <DEPLOYMENT_ENVIRONMENT>
otelCollectorEndpoint: ingest.{region}.signoz.cloud:443
otelInsecure: false
signozApiKey: <SIGNOZ_INGESTION_KEY>
presets:
  otlpExporter:
    enabled: true
  loggingExporter:
    enabled: false

The ingestion endpoint depends on the region you chose for SigNoz Cloud. Replace {region} in otelCollectorEndpoint with the matching value from this table.

Region    Endpoint
US        ingest.us.signoz.cloud:443
IN        ingest.in.signoz.cloud:443
EU        ingest.eu.signoz.cloud:443

📝 Note
  • Replace <SIGNOZ_INGESTION_KEY> with the ingestion key provided by SigNoz.
  • Replace <CLUSTER_NAME> with the name of the Kubernetes cluster or a unique identifier of the cluster.
  • Replace <DEPLOYMENT_ENVIRONMENT> with the deployment environment of your application. Example: "staging", "production", etc.
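
As an illustration of the substitutions described above, the lines that change for a cluster in the US region might look like the following sketch. The cluster name demo-cluster and the staging environment are hypothetical values chosen for the example. The ingestion key is left as a placeholder, and the presets section of override-values.yaml stays as shown earlier.

```yaml
# Sketch only: the values below are illustrative assumptions, not defaults.
global:
  cloud: others
  clusterName: demo-cluster           # hypothetical cluster name
  deploymentEnvironment: staging      # one of the example environments
otelCollectorEndpoint: ingest.us.signoz.cloud:443   # US row of the region table
otelInsecure: false
signozApiKey: <SIGNOZ_INGESTION_KEY>  # still to be replaced with your key
```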

To install the k8s-infra chart with the above configuration, run the following command:

helm install my-release signoz/k8s-infra -f override-values.yaml
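
The configuration above targets SigNoz Cloud. If SigNoz runs in your own VM or Kubernetes cluster instead, the same keys would presumably point at your own SigNoz OtelCollector. The following is a sketch under stated assumptions, not a verified configuration: <SIGNOZ_OTEL_COLLECTOR_ADDRESS> is a hypothetical placeholder for that collector's address, port 4317 assumes it accepts OTLP on that port, otelInsecure: true assumes the connection is not secured with TLS, and leaving out signozApiKey assumes no ingestion key is needed in that setup.

```yaml
# Sketch for a self-hosted SigNoz; every value marked "assumed" must be
# checked against your own installation.
global:
  cloud: others
  clusterName: <CLUSTER_NAME>
  deploymentEnvironment: <DEPLOYMENT_ENVIRONMENT>
otelCollectorEndpoint: <SIGNOZ_OTEL_COLLECTOR_ADDRESS>:4317   # assumed address and port
otelInsecure: true                                            # assumed: no TLS
presets:
  otlpExporter:
    enabled: true
  loggingExporter:
    enabled: false
```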

Send Data from Instrumented Applications

Data flow from your application to SigNoz

An OpenTelemetry-instrumented application sends data to the OtelAgent daemon deployed in your Kubernetes infrastructure. The OtelAgent daemon then sends the collected data to SigNoz.

📝 Note

On GKE Autopilot, you cannot send data to the OtelAgent daemon via the host port. Use either the SigNoz ingestion endpoint directly or the OtelAgent service name.

With the OtelAgent service name, the endpoint looks like my-release-k8s-infra-otel-agent.default.svc:4317. Replace my-release with your Helm release name and default with your namespace.
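
As a sketch of the service-name option, the endpoint variable in an application manifest could be set as follows. It reuses the OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_INSECURE variables that appear in the manifest example later in this section, and assumes the release is named my-release and installed in the default namespace.

```yaml
env:
  - name: OTEL_EXPORTER_OTLP_INSECURE
    value: "true"
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    # Service-name form instead of the host IP; adjust the release name
    # (my-release) and namespace (default) to match your installation.
    value: my-release-k8s-infra-otel-agent.default.svc:4317
```

Because the endpoint no longer depends on the node address, the HOST_IP variable is not needed for this variant.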

To send data from your applications, you must first instrument them with OpenTelemetry. You can find instrumentation instructions for your specific language here.

Once your application is instrumented, add the following environment variables to its manifest file so that it starts sending data to the OtelCollector agents running as a DaemonSet. For example:

env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: K8S_POD_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.podIP
  - name: K8S_POD_UID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid
  - name: OTEL_EXPORTER_OTLP_INSECURE
    value: "true"
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: $(HOST_IP):4317
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: service.name=APPLICATION_NAME,k8s.pod.ip=$(K8S_POD_IP),k8s.pod.uid=$(K8S_POD_UID)
📝 Note
  • Replace APPLICATION_NAME with the application name you wish to see in SigNoz.
  • Some SDKs require an http:// or https:// prefix in OTEL_EXPORTER_OTLP_ENDPOINT.
  • You can also include deployment.environment as an attribute in the OTEL_RESOURCE_ATTRIBUTES environment variable. This attribute takes precedence over the global.deploymentEnvironment configuration of the k8s-infra chart.
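
The last two notes can be combined in one manifest fragment. This is a minimal sketch: it assumes an SDK that expects the http:// prefix, and staging is only an example value for deployment.environment.

```yaml
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    # Scheme prefix added for SDKs that require it (assumed here).
    value: http://$(HOST_IP):4317
  - name: OTEL_RESOURCE_ATTRIBUTES
    # deployment.environment set per application; "staging" is an example.
    value: service.name=APPLICATION_NAME,deployment.environment=staging
```

HOST_IP is declared before the variables that reference it so that the $(HOST_IP) expansion resolves.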

Disable Logs Collection

If you do not want to collect logs from your Kubernetes cluster, you can disable log collection using presets in the k8s-infra chart.

presets:
  logsCollection:
    enabled: false

Disable Metrics Collection

If you do not want to collect metrics from your Kubernetes cluster, you can disable metrics collection using presets in the k8s-infra chart.

presets:
  hostMetrics:
    enabled: false
  kubeletMetrics:
    enabled: false
  clusterMetrics:
    enabled: false

otelDeployment:
  enabled: false
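
If only one category of metrics should be dropped, the presets can presumably be toggled individually. The sketch below assumes that each preset works independently of the others, which the configuration above suggests but does not state. It turns off host metrics while leaving kubelet and cluster metrics on.

```yaml
# Sketch: disable only host metrics (assumes presets are independent).
presets:
  hostMetrics:
    enabled: false
  kubeletMetrics:
    enabled: true
  clusterMetrics:
    enabled: true
```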

Plot Metrics in SigNoz UI

To plot metrics generated from the k8s-infra chart, follow the instructions given in the docs here.

Check out the list of metrics from the Kubernetes receivers.

Here are some example metrics dashboards.

  1. Import Dashboard with PVC Metrics

    You can import a dashboard with the PVC metrics of your Kubernetes cluster from here.

  2. Import Dashboard with Overall Kubernetes Pods Metrics

    You can import a dashboard with the general Kubernetes pod metrics of your K8s cluster from here.

  3. Import Dashboard with Detailed Kubernetes Pods Metrics

    You can import a dashboard with more granular Kubernetes pod metrics of your K8s cluster from here.

  4. Import Dashboard with Overall Kubernetes Node Metrics

    You can import a dashboard with the general Kubernetes node metrics of your K8s cluster from here.

  5. Import Dashboard with Detailed Kubernetes Node Metrics

    You can import a dashboard with more granular Kubernetes node metrics of your K8s cluster from here.

On the Dashboard page of the SigNoz UI, you can create your own widgets as needed using metrics from the list below.


List of metrics

Kubernetes Metrics - kubeletstats and k8s_cluster

  • container_cpu_time
  • container_cpu_utilization
  • container_filesystem_available
  • container_filesystem_capacity
  • container_filesystem_usage
  • container_memory_available
  • container_memory_major_page_faults
  • container_memory_page_faults
  • container_memory_rss
  • container_memory_usage
  • container_memory_working_set
  • k8s_container_cpu_limit
  • k8s_container_cpu_request
  • k8s_container_memory_limit
  • k8s_container_memory_request
  • k8s_container_ready
  • k8s_container_restarts
  • k8s_daemonset_current_scheduled_nodes
  • k8s_daemonset_desired_scheduled_nodes
  • k8s_daemonset_misscheduled_nodes
  • k8s_daemonset_ready_nodes
  • k8s_deployment_available
  • k8s_deployment_desired
  • k8s_job_active_pods
  • k8s_job_desired_successful_pods
  • k8s_job_failed_pods
  • k8s_job_max_parallel_pods
  • k8s_job_successful_pods
  • k8s_namespace_phase
  • k8s_node_condition_memory_pressure
  • k8s_node_condition_ready
  • k8s_node_cpu_time
  • k8s_node_cpu_utilization
  • k8s_node_filesystem_available
  • k8s_node_filesystem_capacity
  • k8s_node_filesystem_usage
  • k8s_node_memory_available
  • k8s_node_memory_major_page_faults
  • k8s_node_memory_page_faults
  • k8s_node_memory_rss
  • k8s_node_memory_usage
  • k8s_node_memory_working_set
  • k8s_node_network_errors
  • k8s_node_network_io
  • k8s_pod_cpu_time
  • k8s_pod_cpu_utilization
  • k8s_pod_filesystem_available
  • k8s_pod_filesystem_capacity
  • k8s_pod_filesystem_usage
  • k8s_pod_memory_available
  • k8s_pod_memory_major_page_faults
  • k8s_pod_memory_page_faults
  • k8s_pod_memory_rss
  • k8s_pod_memory_usage
  • k8s_pod_memory_working_set
  • k8s_pod_network_errors
  • k8s_pod_network_io
  • k8s_pod_phase
  • k8s_replicaset_available
  • k8s_replicaset_desired
  • k8s_statefulset_current_pods
  • k8s_statefulset_desired_pods
  • k8s_statefulset_ready_pods
  • k8s_statefulset_updated_pods
  • k8s_volume_available
  • k8s_volume_capacity
  • k8s_volume_inodes
  • k8s_volume_inodes_free
  • k8s_volume_inodes_used
  • k8s_node_allocatable_cpu
  • k8s_node_allocatable_memory

Hostmetrics

  • system_network_connections
  • system_disk_weighted_io_time
  • system_disk_merged
  • system_disk_operation_time
  • system_disk_pending_operations
  • system_disk_io_time
  • system_disk_operations
  • system_disk_io
  • system_filesystem_inodes_usage
  • system_filesystem_usage
  • system_cpu_time
  • system_memory_usage
  • system_network_packets
  • system_network_dropped
  • system_network_io
  • system_network_errors
  • system_cpu_load_average_5m
  • system_cpu_load_average_15m
  • system_cpu_load_average_1m
