Troubleshoot

Diagnose and troubleshoot problems with your StackRox Kubernetes Security Platform deployment.

If you run into issues with your StackRox Kubernetes Security Platform deployment, you might be able to resolve them through troubleshooting. Use the information in this section to diagnose and troubleshoot those issues.

Diagnose the problem

The first step in troubleshooting is to identify the problem. Start by checking the status of your pods.

  1. Use the get pod command with the -o wide option. This option lists more information, including each pod’s status, the node on which the pod runs, and the pod’s IP address.

    On Kubernetes:
    $ kubectl -n stackrox get pod -o wide
    NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE                         NOMINATED NODE
    central-6fc475c84c-gpmjz   1/1     Running   0          5h14m   10.44.1.28   default-pool-aaee890e-xqn1   <none>
    collector-5zdkf            1/1     Running   0          5h13m   10.44.0.30   default-pool-aaee890e-p96z   <none>
    collector-fq5mk            1/1     Running   0          5h13m   10.44.2.32   default-pool-aaee890e-r5m3   <none>
    collector-x8mbd            1/1     Running   0          5h13m   10.44.1.29   default-pool-aaee890e-xqn1   <none>
    scanner-594997dc9f-msfss   1/1     Running   0          5h14m   10.44.1.27   default-pool-aaee890e-xqn1   <none>
    sensor-75cd875c86-57lrf    1/1     Running   0          5h13m   10.44.2.31   default-pool-aaee890e-r5m3   <none>
    On OpenShift:
    $ oc -n stackrox get pod -o wide
    NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE                         NOMINATED NODE
    central-6fc475c84c-gpmjz   1/1     Running   0          5h14m   10.44.1.28   default-pool-aaee890e-xqn1   <none>
    collector-5zdkf            1/1     Running   0          5h13m   10.44.0.30   default-pool-aaee890e-p96z   <none>
    collector-fq5mk            1/1     Running   0          5h13m   10.44.2.32   default-pool-aaee890e-r5m3   <none>
    collector-x8mbd            1/1     Running   0          5h13m   10.44.1.29   default-pool-aaee890e-xqn1   <none>
    scanner-594997dc9f-msfss   1/1     Running   0          5h14m   10.44.1.27   default-pool-aaee890e-xqn1   <none>
    sensor-75cd875c86-57lrf    1/1     Running   0          5h13m   10.44.2.31   default-pool-aaee890e-r5m3   <none>

    The STATUS column lists the status of the pods.
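
    As a rough aid for scanning this output, the following sketch filters the pod list down to the pods that need attention. The function name unhealthy_pods is hypothetical, and the sketch assumes the default table layout shown above, with NAME, READY, and STATUS as the first three columns.

```shell
# Print "<pod-name> <status>" for every pod that is not healthy.
# Reads the table printed by `get pod` on stdin, so it works with kubectl or oc.
# A pod counts as healthy here if it is Completed, or if it is Running
# with all of its containers ready (for example, READY 1/1).
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Completed" {
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2])
      print $1, $3
  }'
}

# Usage (run against your own cluster):
#   kubectl -n stackrox get pod -o wide | unhealthy_pods
#   oc -n stackrox get pod -o wide | unhealthy_pods
```

    If the function prints nothing, every pod in the namespace is running and ready.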

The following sections give suggestions for troubleshooting some common issues based on the status of the pods. For more complex problems, follow the instructions in the Next steps section and reach out to StackRox support.

Pod status is ErrImagePull or ImagePullBackOff

The ErrImagePull and ImagePullBackOff statuses are related. Both statuses indicate that your Kubernetes or OpenShift cluster can’t retrieve the images for the StackRox Kubernetes Security Platform components.

  • ErrImagePull: Kubernetes (or OpenShift) tried to pull the image from the registry and encountered an error.
  • ImagePullBackOff: after a failed pull, Kubernetes (or OpenShift) keeps retrying, with increasing intervals between attempts.

To troubleshoot this issue:

  1. Use the describe command to get more details about a specific pod:

    On Kubernetes:
    $ kubectl -n stackrox describe pod <pod-name>
    On OpenShift:
    $ oc -n stackrox describe pod <pod-name>

    Look at the Events section of the output to get more information about the problem.

  2. Check for the common causes of these errors:

    • The image tag is incorrect. Verify that the image tag you are using is correct.
    • The image doesn’t exist in the registry. Verify that you are using the correct image registry and that you can pull the image locally.
    • Kubernetes (or OpenShift) couldn’t authenticate with the registry. Verify that you can pull images from the stackrox.io registry.
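
The checks above can be sketched as two small helpers. The function names pod_images and pull_failures and the KUBE_CLI variable are hypothetical, and the sketch assumes the stackrox namespace used throughout this section. The first helper prints the exact image references that the pod is configured to pull, so you can compare the registry and tag against what you expect. The second keeps only the pull-related lines from the Events section of the describe output.

```shell
# CLI to use: kubectl by default; set KUBE_CLI=oc on OpenShift.
KUBE_CLI="${KUBE_CLI:-kubectl}"

# Print each container image configured for the pod, one per line.
pod_images() {
  "$KUBE_CLI" -n stackrox get pod "$1" \
    -o jsonpath='{range .spec.containers[*]}{.image}{"\n"}{end}'
}

# Read `describe pod` output on stdin and print only the lines in the
# Events section that mention an image pull problem.
pull_failures() {
  awk '/^Events:/ { in_events = 1; next }
       in_events && /ErrImagePull|ImagePullBackOff|Failed to pull image|Back-off pulling image/'
}

# Usage (run against your own cluster):
#   pod_images <pod-name>
#   kubectl -n stackrox describe pod <pod-name> | pull_failures
```

If an image reference looks correct, try pulling that same reference from a machine that has your registry credentials to separate authentication problems from a wrong tag or registry.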

Pod status is CrashLoopBackOff

The CrashLoopBackOff status occurs when a container in a pod keeps crashing (often during startup), and Kubernetes (or OpenShift) restarts it repeatedly, with increasing intervals between restarts.
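
To illustrate what "increasing intervals" means in practice, here is a minimal sketch of the restart delay. It assumes the default behavior described in the Kubernetes documentation: the delay starts at 10 seconds, doubles after each crash, and is capped at five minutes. Your cluster may be configured differently, and the function name backoff_schedule is hypothetical.

```shell
# Print the assumed delay in seconds before each of the first N restarts:
# starts at 10 seconds, doubles each time, and is capped at 300 seconds.
backoff_schedule() {
  delay=10
  i=0
  while [ "$i" -lt "$1" ]; do
    echo "$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then
      delay=300
    fi
    i=$((i + 1))
  done
}

# Usage:
#   backoff_schedule 7
# prints 10, 20, 40, 80, 160, 300, 300 (one value per line)
```

This is why a pod in CrashLoopBackOff can appear idle for several minutes between restart attempts.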

To troubleshoot this issue:

  1. Use the describe command to get more details about a specific pod:

    On Kubernetes:
    $ kubectl -n stackrox describe pod <pod-name>
    On OpenShift:
    $ oc -n stackrox describe pod <pod-name>

    The output of the describe pod command shows the state of each container in the pod, along with its exit code.
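
    As a follow-up to the describe output, the sketch below reads the exit code of each container's last terminated run, fetches the logs from the crashed run, and gives a rough reading of the exit code. The function names and the KUBE_CLI variable are hypothetical. The interpretation relies on the general convention that an exit code above 128 means the process was terminated by a signal (code minus 128), so treat it as a hint rather than a diagnosis.

```shell
# CLI to use: kubectl by default; set KUBE_CLI=oc on OpenShift.
KUBE_CLI="${KUBE_CLI:-kubectl}"

# Print "<container> <exit-code>" for each container's last terminated run.
# The exit code is empty for a container that has not terminated yet.
last_exit_codes() {
  "$KUBE_CLI" -n stackrox get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{" "}{.lastState.terminated.exitCode}{"\n"}{end}'
}

# Print the logs from the previous (crashed) run of the pod's container.
previous_logs() {
  "$KUBE_CLI" -n stackrox logs "$1" --previous
}

# Give a rough interpretation of a container exit code.
explain_exit_code() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exited normally"
  elif [ "$code" -gt 128 ]; then
    echo "terminated by signal $((code - 128))"
  else
    echo "application error (exit code $code)"
  fi
}

# Usage (run against your own cluster):
#   last_exit_codes <pod-name>
#   previous_logs <pod-name>
#   explain_exit_code 137
```

    For example, exit code 137 maps to signal 9 (SIGKILL), which is commonly seen when a container is killed for exceeding its memory limit.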

Next steps

If none of the above troubleshooting steps helps:

  1. Follow the instructions in the Generate a diagnostic bundle section to create a diagnostic bundle for the StackRox Kubernetes Security Platform.
  2. Send us the diagnostic data when you create a support ticket. Always use an encrypted channel (Slack file send or Zendesk file upload) when you send this data to us.


© 2021 StackRox Inc. All rights reserved.