Important: The instructions in this topic describe a general deployment with Docker. For instructions on the trial deployment, refer to the Trial Docker Deployment Getting Started Guide.

The Hybrid Data Pipeline server can be deployed using a Docker image. Before you begin, ensure all prerequisites have been met. Then, use the step-by-step procedure to configure and deploy Hybrid Data Pipeline.

Prerequisites

Before proceeding with a Docker deployment, you must have the following:

  • The Hybrid Data Pipeline Docker Deployment Package. The deployment package may be downloaded from the Progress Customer Downloads Portal or the Trial Download page.
  • Docker. Docker must be installed on the host machine because you load and run the Docker image to deploy the server. For download and installation, refer to Install Docker Engine in Docker Docs.
  • External system database. Docker deployment requires an external system database. For details, see External system databases.
  • Private network. You must have a network on which to run the Docker container.
  • Available ports. Ports must be specified in the docker run command. They must also be made available on the Docker host. The ports you specify depend on whether you are deploying the server behind a load balancer, and whether you are configuring the server for SSL or for on-premises connectivity. For more information, see Access ports, SSL configuration, and On-Premises Connector deployment configuration. Note that the On-Premises Connector requires SSL configuration of the Hybrid Data Pipeline server.
  • SSL certificates. The certificate files you provide depend, in part, on whether you are deploying the service with or without a load balancer. In addition to referring to the hdpdeploy.properties file, see SSL configuration for details.
  • Load balancer. A load balancer is required for cluster deployments. A load balancer may also be used to handle requests to a single node running the server. In either case, the load balancer should be configured to manage requests for the service. See Load balancer configuration for details.
  • Log management. A log location and log levels can be configured using the hdplogging.properties file. If a log location is not configured, a default centralized location is used to generate and store logs. See Log management for details.

Step-by-step

To deploy the service, you must provide configuration information either as configuration properties in the hdpdeploy.properties file or as environment variables in the docker run command. These properties are comprehensively documented in the hdpdeploy.properties file. Additional details may also be found in Deployment configurations.

Take the following steps to deploy the Docker image in your environment.

  1. Download the Hybrid Data Pipeline Docker Deployment Package from the Progress Customer Downloads Portal or the Trial Download page.
    Note: The deployment package is available as a tar.gz file. For example: PROGRESS_DATADIRECT_HDP_SERVER_version_DOCKER.tar.gz, where version is a three-part version number denoting major, minor, and service pack numbers.
  2. Extract the Hybrid Data Pipeline Docker Deployment Package to a package directory.

    The hdp-docker-deploy directory includes the following folders and files.

    • demo: The demo folder includes example batch files to demonstrate how deployment may be automated for a number of deployment scenarios.
    • hdp: The hdp folder includes hdpdeploy.properties that is used to set deployment configuration options.
    • hdp_docker_version.build.tar.gz: This is the Hybrid Data Pipeline Docker image file, where version is the three-part version number of the server and build is the server build number.
    • hdp-docker-deploy-readme.txt: The readme includes general information and a detailed directory structure of the deployment package.
    • hdplogging.properties: This is the properties file, where custom locations for logs and logging levels can be configured.
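
As a minimal sketch of step 2, assuming a Linux or macOS shell with tar available, the package might be extracted as follows. The target directory is a hypothetical location, the extract_package helper is introduced only for illustration, and version is left as the placeholder from the file name above.

```shell
# Hypothetical paths: adjust both to your environment.
PACKAGE="${PACKAGE:-PROGRESS_DATADIRECT_HDP_SERVER_version_DOCKER.tar.gz}"
PACKAGE_DIR="${PACKAGE_DIR:-$HOME/hdp-package}"

extract_package() {
    # $1 = tar.gz archive, $2 = target (package) directory
    if [ ! -f "$1" ]; then
        echo "Package not found: $1" >&2
        return 1
    fi
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# List the extracted contents only when extraction succeeded.
if extract_package "$PACKAGE" "$PACKAGE_DIR"; then
    ls "$PACKAGE_DIR/hdp-docker-deploy"
fi
```

If the extraction succeeds, the listing should show the folders and files described above.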
  3. Run the docker load command to load the Docker image. For example:
    docker load -i hdp_docker_version.build.tar.gz

    where version is the three-part version number of the server, and build is the server build number.

  4. Create a shared file location on the local file system. This shared file location will be mounted to the container and will be used in the deployment and operation of the server.

    Windows

    For Docker on Windows, you may simply create the shared file location on the host. For example:

    C:\hdpshare

    Linux

    For Docker on Linux, you must create the shared file location, create a non-root user, and then change the ownership of the shared file location directory to the non-root user. For step-by-step instructions, see Creating the shared file location on Linux.

    Note:
    • The shared file location must be mapped to the local file system when you execute the docker run command to deploy the image.
    • If running Docker on Windows in Hyper-V mode, the drive that has the directory to be mounted must be marked as a shared drive in the Docker settings.
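
The Linux requirements in step 4 might look like the following sketch. The user name hdpuser, the path, and the run and create_share helpers are hypothetical. The topic Creating the shared file location on Linux remains the authoritative procedure and may require details that are not covered here, such as specific user or group settings. Because the commands need root privileges, they are only printed unless DRY_RUN=0 is set.

```shell
SHARE_DIR="${SHARE_DIR:-/home/users/username/hdpshare}"  # hypothetical path
SHARE_USER="${SHARE_USER:-hdpuser}"                      # hypothetical non-root user
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode; otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_share() {
    run sudo mkdir -p "$SHARE_DIR"                 # create the shared file location
    run sudo useradd "$SHARE_USER"                 # create a non-root user
    run sudo chown -R "$SHARE_USER" "$SHARE_DIR"   # give the user ownership
}

create_share
```

Reviewing the printed commands before running them with DRY_RUN=0 is a cautious way to confirm the path and user name.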
  5. For an SSL configuration that does not use the self-signed certificate, copy the SSL certificate files to be used with the Hybrid Data Pipeline server to the shared file location. See SSL configuration.
  6. If using a MySQL Community Edition database as the system database, copy a MySQL Connector/J driver to the shared file location.
  7. For Kerberos authentication against SQL Server, specify the path to the Kerberos configuration file krb5.conf with the HDP_KERBEROS_CONF_PATH property in the hdpdeploy.properties file.
  8. Optional. Configure a log location and set logging levels.
    • Copy the hdplogging.properties file to the shared file location on the local file system.

      The hdplogging.properties file is located in the deployment package directory package_dir/hdp-docker-deploy/hdp, where package_dir is the directory into which you extracted the contents of the deployment package.

    • Edit the hdplogging.properties file as needed. The comments in this file provide general guidance. For additional information, see Log management.
  9. Set deployment configuration properties.
    • Option 1. Set configuration properties in the hdpdeploy.properties file.
      • Copy the hdpdeploy.properties file to the shared file location on the local file system.

        The hdpdeploy.properties file is located in the deployment package directory package_dir/hdp-docker-deploy/hdp, where package_dir is the directory into which you extracted the contents of the deployment package.

      • Edit the hdpdeploy.properties file to configure the deployment. The comments in this file provide detailed guidance. See Deployment configurations for additional information.
    • Option 2. Set configuration properties as environment variables in the docker run command. Refer to the comments in the hdpdeploy.properties file for details on the environment variables. All properties listed in this file can be set as environment variables. See Deployment configurations for additional information.
    Note:
    • You may configure one set of properties in the hdpdeploy.properties file and another set as environment variables.
    • If you are using the hdpdeploy.properties file, the file must be located in the shared file location before executing the docker run command.
    • If a property has been set both in the properties file and as an environment variable, then the environment variable value will override the value in the properties file.
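
The precedence rule in the last note can be illustrated with a small sketch. The effective_value helper is hypothetical and only mimics the documented behavior (an environment variable overrides the properties file). It is not how the server itself reads its configuration, it assumes simple KEY=value lines, and the krb5.conf paths are illustrative values.

```shell
effective_value() {
    # $1 = property name, $2 = path to a properties file
    # Prefer an environment variable that has the same name as the property.
    env_value=$(printenv "$1" || true)
    if [ -n "$env_value" ]; then
        printf '%s\n' "$env_value"
        return 0
    fi
    # Otherwise fall back to the first KEY=value line in the file.
    sed -n "s/^$1=//p" "$2" | head -n 1
}

# Demonstration with a throwaway properties file (illustrative values).
props=$(mktemp)
echo "HDP_KERBEROS_CONF_PATH=/hdpshare/krb5.conf" > "$props"

unset HDP_KERBEROS_CONF_PATH
effective_value HDP_KERBEROS_CONF_PATH "$props"   # prints /hdpshare/krb5.conf

export HDP_KERBEROS_CONF_PATH=/hdpshare/custom/krb5.conf
effective_value HDP_KERBEROS_CONF_PATH "$props"   # prints /hdpshare/custom/krb5.conf
```

In an actual deployment, the environment variable would be passed with the -e option of the docker run command rather than exported in the host shell.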
  10. Run the Docker image to deploy the service. The following example shows a docker run command for a cluster deployment configured for SSL and the On-Premises Connector.
    Note:
    • This example assumes the use of the hdpdeploy.properties file to set most deployment configuration properties, including system database properties and SSL configuration properties.
    • The Docker port mappings use default ports. These include ports that are required for a deployment that uses an On-Premises Connector. Note that the use of an On-Premises Connector requires an SSL configuration. For more information, see Access ports, On-Premises Connector deployment configuration, and SSL configuration.
    • Hybrid Data Pipeline administrator and user passwords are set as environment variables.
    • The shared file location on the Docker host is mounted to /hdpshare in the Docker container. The shared file location is the persistent volume used by the Docker container.
    docker run -dt -p 8443:8443 -p 8090:8090 -p 40501:40501 -p 11280:11280 -p 11443:11443 -e "ACCEPT_EULA=true" -e "HDP_ADMIN_PASSWORD=AdminSecret" -e "HDP_USER_PASSWORD=UserSecret" -v /home/users/username/hdpshare:/hdpshare --hostname DockerHost --name ContainerName hdp-docker-version:tag

    -p 8443:8443 Required for SSL deployment. Maps the Docker host port to the container port. Port 8443 is the HTTPS port for encrypted communication to the Hybrid Data Pipeline Web UI and API.

    -p 8090:8090 Required for load balancer deployment. Maps the Docker host port to the container port. Port 8090 is the HTTPS port for encrypted communication between individual nodes in a cluster deployment.

    -p 40501:40501 Required for the On-Premises Connector. Maps the Docker host port to the container port. Port 40501 is the port for communication between the On-Premises Connector and the Hybrid Data Pipeline server.

    -p 11280:11280 Required for the On-Premises Connector. Maps the Docker host port to the container port. Port 11280 is the port for communication from the On-Premises Connector to the Notification Server.

    -p 11443:11443 Required for the On-Premises Connector. Maps the Docker host port to the container port. Port 11443 is the TCP SSL port for encrypted communication from the On-Premises Connector to the Notification Server.

    -e "ACCEPT_EULA=true" Accepts the Hybrid Data Pipeline license agreement. For the agreement, refer to DataDirect License Agreement on the Progress website.

    -e "HDP_ADMIN_PASSWORD=AdminSecret" Sets the password for the default Hybrid Data Pipeline administrator d2cadmin. The AdminSecret is a user-specified password for the d2cadmin account.

    -e "HDP_USER_PASSWORD=UserSecret" Sets the password for the default Hybrid Data Pipeline user d2cuser. The UserSecret is a user-specified password for the d2cuser account.

    -v (Mount persistent volume)

    • Linux example

      -v /home/users/username/hdpshare:/hdpshare Mounts the shared location as a persistent volume to the Docker container's file system. The username is the host machine user account that is being used for the deployment.

    • Windows example

      -v C:\hdpshare:/hdpshare Mounts the shared location as a persistent volume to the Docker container's file system.

    --hostname DockerHost Specifies the name of the Docker container host where DockerHost is the fully qualified hostname that is externally visible to components such as the JDBC driver, the ODBC driver, and the On-Premises Connector.

    --name ContainerName Specifies the name of the Docker container where ContainerName is a user-specified value.

    hdp-docker-version:tag The name of the Hybrid Data Pipeline Docker image where version is the three-part version number of the Hybrid Data Pipeline image, and tag is the Hybrid Data Pipeline build number.
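
For readability, the same docker run command can be wrapped in a small shell function with one option per line. This is a sketch only: the hdp_run helper and the variable names SHARE_DIR, NODE_HOSTNAME, CONTAINER_NAME, and IMAGE are hypothetical, and all values are the placeholders from the example above, so each must be replaced before a real deployment.

```shell
# Placeholders copied from the example above; replace each with real values.
SHARE_DIR="${SHARE_DIR:-/home/users/username/hdpshare}"
NODE_HOSTNAME="${NODE_HOSTNAME:-DockerHost}"
CONTAINER_NAME="${CONTAINER_NAME:-ContainerName}"
IMAGE="${IMAGE:-hdp-docker-version:tag}"

hdp_run() {
    # Any arguments are placed in front of the command, so "hdp_run echo"
    # prints the command instead of executing it.
    "$@" docker run -dt \
        -p 8443:8443 \
        -p 8090:8090 \
        -p 40501:40501 \
        -p 11280:11280 \
        -p 11443:11443 \
        -e "ACCEPT_EULA=true" \
        -e "HDP_ADMIN_PASSWORD=AdminSecret" \
        -e "HDP_USER_PASSWORD=UserSecret" \
        -v "$SHARE_DIR:/hdpshare" \
        --hostname "$NODE_HOSTNAME" \
        --name "$CONTAINER_NAME" \
        "$IMAGE"
}

hdp_run echo   # dry run; call hdp_run with no arguments to deploy
```

Printing the command first makes it easier to confirm the port mappings and the volume mount before the container is started.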

  11. Optional. For cluster deployments, run the docker run command for each additional node in the cluster.
  12. Optional. If using a load balancer, configure the load balancer. See Load balancer configuration for details.
  13. Open the Web UI by entering the URL for your Hybrid Data Pipeline instance. For example:

    Non-Load Balancer

    https://dockerhost.example.com:8443/hdpui
    Note: For a non-load balancer deployment, the HTTPS port 8443 must be specified in the URL.

    Load Balancer

    https://load-balancer-host.example.com/hdpui

    Note: For a load balancer deployment, the port number must be either 80 for http or 443 for https. When port 80 or 443 is used, it is not necessary to include the port number in the URL.
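
The two URL forms can be sketched with a small helper. The hdp_ui_url function is hypothetical and simply omits the port when it is the HTTPS default of 443. The host names are the example values used above.

```shell
hdp_ui_url() {
    # $1 = host name, $2 = HTTPS port (defaults to 443)
    port="${2:-443}"
    if [ "$port" = "443" ]; then
        printf 'https://%s/hdpui\n' "$1"
    else
        printf 'https://%s:%s/hdpui\n' "$1" "$port"
    fi
}

hdp_ui_url dockerhost.example.com 8443      # non-load balancer: port is included
hdp_ui_url load-balancer-host.example.com   # load balancer on 443: port is omitted

# Optional reachability check. The -k option skips certificate verification,
# which is only appropriate for testing, for example with a self-signed
# certificate.
# curl -k -s -o /dev/null -w '%{http_code}\n' "$(hdp_ui_url dockerhost.example.com 8443)"
```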
  14. Log in to the default admin or user account.
    • d2cadmin is the name of the default admin account. The password must be specified in the hdpdeploy.properties file or as an environment variable in the docker run command.
    • d2cuser is the name of the default user account. The password must be specified in the hdpdeploy.properties file or as an environment variable in the docker run command.

Result

  • The Hybrid Data Pipeline server has been deployed based on the settings in the hdpdeploy.properties file, or the settings provided as environment variables.
  • The following four configuration and certificate files have been generated in the redist folder in the shared file location. These files must be used in the installation of the ODBC driver, the JDBC driver, and the On-Premises Connector.
    • config.properties
    • ddcloud.pem
    • ddcloudTrustStore.jks
    • OnPremise.properties
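
A quick way to confirm that the four files were generated is sketched below. The check_redist helper is hypothetical, and the default path is the placeholder shared file location from the docker run example. It assumes the redist folder sits directly under the shared file location on the host.

```shell
check_redist() {
    # $1 = shared file location on the host
    missing=0
    for f in config.properties ddcloud.pem ddcloudTrustStore.jks OnPremise.properties; do
        if [ -f "$1/redist/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    # Return non-zero when at least one file is absent.
    return "$missing"
}

check_redist "${SHARE_DIR:-/home/users/username/hdpshare}" || echo "Deployment files are incomplete."
```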

What to do next