Hybrid Data Pipeline provides logging capabilities to help administrators monitor Hybrid Data Pipeline services and troubleshoot issues. By default, the logs directory and the log files themselves are written to a persistent volume named logs. Administrators may also specify a logs directory and configure log levels for several services using the hdplogging.properties file. The following sections provide information about working with logs in a Kubernetes environment.

Configure the persistent volume

A persistent volume for storing logs is mounted to the Kubernetes container node by default. You may change the default configuration by modifying the relevant parameters in the Helm chart manifest file. As this section of the manifest file shows, you may enable or disable the feature (enabled), change the mount path (mountPath), change the allocated disk space (size), and specify a storage class (storageClassName).


    logs:
      enabled: true
      mountPath: /logs
      size: 1Gi
      storageClassName: azurefile-csi

By default, 1 GiB of node disk space is allocated for the persistent volume. However, the allocated disk space may need to be increased depending on server load, logging levels, and log cleanup.
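
To judge whether the allocation needs to grow, it can help to check how full the volume is. The following is a minimal sketch, not part of the product tooling: check_logs_volume is a hypothetical helper name, the pod name hdp-hdpserver-0 and the /logs mount path are the defaults used in this document, and the container image is assumed to provide the df utility.

```shell
#!/usr/bin/env bash
# Report disk usage of the logs volume from inside a running pod.
# Assumptions: the default pod name and /logs mount path from this document,
# and a container image that includes the df utility.
check_logs_volume() {
  local namespace="$1"
  local pod="${2:-hdp-hdpserver-0}"
  local mount_path="${3:-/logs}"
  kubectl exec "$pod" --namespace "$namespace" -- df -h "$mount_path"
}

# Example: check_logs_volume namespace-value
```

If usage stays close to the allocated size, consider raising the size parameter or lowering log levels.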

Warning: Enabling a persistent volume for logs is generally the best option for isolation and log processing. However, you may disable this feature. When the deployment of a persistent volume is disabled, logs are not persisted and therefore may be lost when a node is shut down or terminated. In addition, logs are written separately to a logs directory (ddcloud-home/logs) in each container.
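
If the persistent volume is disabled, one way to keep logs before a pod goes away is to copy them out of the container. The sketch below is illustrative only: copy_container_logs is a hypothetical helper, the relative path ddcloud-home/logs is taken from the warning above and is assumed to resolve from the container's working directory, and kubectl cp requires the tar utility in the container image.

```shell
#!/usr/bin/env bash
# Copy the per-container logs directory to the local machine.
# Assumptions: ddcloud-home/logs resolves relative to the container's
# working directory, and the container image includes tar (needed by kubectl cp).
copy_container_logs() {
  local namespace="$1"
  local pod="$2"
  local dest="${3:-./logs-$pod}"
  kubectl cp "$pod:ddcloud-home/logs" "$dest" --namespace "$namespace"
}

# Example: copy_container_logs namespace-value hdp-hdpserver-0
```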

Update the hdplogging.properties file

You may specify a logs location and configure log levels with the hdplogging.properties file. In a Kubernetes deployment, logs are written to the persistent volume mounted at /logs by default.

Take the following steps to update the hdplogging.properties file in a Kubernetes deployment.

  1. Copy the hdplogging.properties file from the shared directory to a local machine.

    kubectl cp hdp-hdpserver-0:hdpshare/hdplogging.properties hdplogging.properties --namespace namespace-value

  2. Edit the hdplogging.properties file with your preferred settings. See Log management for details.
  3. Push the hdplogging.properties file to the shared location.

    kubectl cp hdplogging.properties hdp-hdpserver-0:hdpshare --namespace namespace-value

  4. Restart the cluster by running the kubectl delete command on each pod. After it is deleted, each pod is restarted based on the Helm chart configuration. For example:
    
    # Restart the first pod
    kubectl delete pod hdp-hdpserver-0 --namespace namespace-value

    # Restart the second pod
    kubectl delete pod hdp-hdpserver-1 --namespace namespace-value

    # Restart the third pod
    kubectl delete pod hdp-hdpserver-2 --namespace namespace-value
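
The restart in step 4 can also be scripted so that each pod is ready again before the next one is deleted. This is a rough sketch under stated assumptions rather than a prescribed procedure: restart_pods is a hypothetical helper, the retry count and timeouts are arbitrary, and the pod names are passed in by the caller.

```shell
#!/usr/bin/env bash
# Delete pods one at a time and wait for each replacement to become ready.
# Assumption: a controller recreates each deleted pod under the same name.
restart_pods() {
  local namespace="$1"
  shift
  local pod
  for pod in "$@"; do
    # Delete the pod; it is recreated based on the Helm chart configuration.
    kubectl delete pod "$pod" --namespace "$namespace" || return 1

    # The replacement pod may not exist yet, so retry the readiness
    # check a bounded number of times before giving up.
    local attempt=0
    until kubectl wait --for=condition=Ready "pod/$pod" \
        --namespace "$namespace" --timeout=60s; do
      attempt=$((attempt + 1))
      if [ "$attempt" -ge 10 ]; then
        echo "Pod $pod did not become ready" >&2
        return 1
      fi
      sleep 5
    done
  done
}

# Example:
# restart_pods namespace-value hdp-hdpserver-0 hdp-hdpserver-1 hdp-hdpserver-2
```

Restarting pods sequentially keeps the remaining pods available while one is being recreated.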
    

Access system logs in the deployment

You may obtain system logs for Hybrid Data Pipeline services using the Web UI or the Nodes API, as described in Obtaining system logs.

You may also view container logs directly by opening a shell on a running container. For example, you would use the following kubectl command to open a shell on the hdp-hdpserver-0 pod in the hdp-k8s-cluster namespace.

    kubectl --namespace hdp-k8s-cluster exec -it hdp-hdpserver-0 -- bash

After the Bash shell opens, navigate to the logs directory to view system logs. In a Kubernetes deployment, logs are written to the persistent volume mounted at /logs by default.
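
For a quick look without an interactive shell, the same information can be read with kubectl exec. The following sketch assumes the default /logs location and that the container image provides ls and tail. list_logs and tail_log are hypothetical helper names, and the log file path is supplied by the caller because file names vary by service.

```shell
#!/usr/bin/env bash
# List the contents of the logs directory in a running pod.
# Assumption: logs are written to the default /logs location.
list_logs() {
  local namespace="$1"
  local pod="${2:-hdp-hdpserver-0}"
  kubectl exec "$pod" --namespace "$namespace" -- ls -l /logs
}

# Print the last 100 lines of a log file chosen by the caller.
tail_log() {
  local namespace="$1"
  local pod="$2"
  local log_file="$3"
  kubectl exec "$pod" --namespace "$namespace" -- tail -n 100 "$log_file"
}

# Example: list_logs hdp-k8s-cluster
```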