When running Hybrid Data Pipeline behind a network load balancer with an On-Premises Connector, the load balancer must be configured to route requests for on-premises data sources to the correct server nodes.

There are two general steps involved in configuring your load balancer to support on-premises data access. First, a custom Access Control List must be created to direct requests for the On-Premises Connector to cluster nodes. Second, a backend notification pool that specifies the on-premises port for each cluster node must be created. The following instructions explain how an HAProxy load balancer can be configured to support Hybrid Data Pipeline access to backend data sources using the On-Premises Connector. These instructions may be adapted for other load balancers, such as NGINX and F5.

The Hybrid Data Pipeline installation program automatically generates an HAProxy configuration file for each installation of the server. These HAProxy configuration files are written to the HAProxy subdirectory in the key location directory specified during installation. These files must be merged to create a single HAProxy configuration file for a load balancer deployment of Hybrid Data Pipeline.

Take the following steps to create an HAProxy configuration file for a load balancer deployment using the On-Premises Connector.

Note: The following samples do not include the configuration of server-side SSL. For details about SSL configuration, see SSL configuration. For information about required access ports, see Access ports.
  1. Create an Access Control List (ACL) to direct requests for the On-Premises Connector to each Hybrid Data Pipeline server.
    Note: Options 1 and 2 below may be used in combination.
    • Option 1. Use a custom header to direct requests. Each entry should be prefaced with acl.

      In this example, the custom header X-DataDirect-OPC-Host is used to direct requests to the server service2.myserver.com through the default On-Premises Port 40501.

      acl is_opa_hdr_service2_myserver_com_40501 hdr(X-DataDirect-OPC-Host) -i opa_service2_myserver_com_40501
      use_backend opa_service2_myserver_com_40501 if is_opa_hdr_service2_myserver_com_40501
    • Option 2. Use URL routing to direct requests. Each entry should be prefaced with acl.

      In this example, URL routing is used to direct requests to the server service2.myserver.com through the default On-Premises Port 40501.

      acl is_opa_url_service2_myserver_com_40501 path_end -i /connect/opa_service2_myserver_com_40501
      use_backend opa_service2_myserver_com_40501 if is_opa_url_service2_myserver_com_40501
  2. Add each Hybrid Data Pipeline server to the backend notification pool section using the server keyword.

    In the following example, the server server2.myserver.com has been added to the backend hdp_notification_pool section, and health checks have been enabled at the root with the option httpchk property.

    backend hdp_notification_pool
        mode http
        option http-tunnel
        balance roundrobin
        option httpchk HEAD /
        http-check expect status 200
    	
        #HDP Notification Server Definitions
        server server1.myserver.com 11.22.111.105:11280 check
        server server2.myserver.com 11.22.111.106:11280 check
  3. Create a backend pool that specifies the On-Premises Port for each Hybrid Data Pipeline server that supports the On-Premises Connector by adding a backend section to the configuration file.

    For example, the following backend section is for a node on the service2.myserver.com server using the default On-Premises Port 40501. Health checks have been enabled at the root with the option httpchk property.

    backend opa_service2_myserver_com_40501
        mode http
        option http-tunnel
        option httpchk HEAD /
        http-check expect status 200
        server service2.myserver.com 11.22.111.106:40501 check
  4. Add each Hybrid Data Pipeline server to the default backend pool using the server keyword.

    In the following example, service2.myserver.com has been added to the backend hdp_default_backend pool, and health checks have been enabled by specifying the /api/healthcheck endpoint with the option httpchk property.

    backend hdp_default_backend
         mode http
         balance roundrobin
         option httpchk HEAD /api/healthcheck
         http-check expect status 200
         cookie HDP_SESSION insert nocache
    	
         #HDP Server Definitions
         server service1.myserver.com 11.22.111.105:8080 check cookie service1.myserver.com
         server service2.myserver.com 11.22.111.106:8080 check cookie service2.myserver.com
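
The ACL rules in step 1 and the backend section in step 3 follow a repeating naming pattern in the samples above: the host name with dots replaced by underscores, prefixed with opa_ and suffixed with the On-Premises Port. The following Python sketch, which is not part of the product, generates those fragments for one node so that they can be compared against the generated files. The function names opa_name, opc_frontend_rules, and opc_backend_section are hypothetical, and the naming pattern is inferred from the samples rather than taken from a specification.

```python
# Illustrative sketch only: the naming pattern is inferred from the samples
# in this topic, and all function names here are hypothetical.

def opa_name(host, port=40501):
    """Return the backend name used in the samples,
    for example opa_service2_myserver_com_40501."""
    return "opa_" + host.replace(".", "_") + "_" + str(port)


def opc_frontend_rules(host, port=40501):
    """Return the URL-routing and custom-header rules for one node.
    Each acl statement is a single line, as HAProxy requires."""
    name = opa_name(host, port)
    return [
        f"acl is_url_{name} path_end -i /connect/{name}",
        f"use_backend {name} if is_url_{name}",
        f"acl is_hdr_{name} hdr(X-DataDirect-OPC-Host) -i {name}",
        f"use_backend {name} if is_hdr_{name}",
    ]


def opc_backend_section(host, ip, port=40501):
    """Return the backend section for one node's On-Premises Port."""
    name = opa_name(host, port)
    return "\n".join([
        f"backend {name}",
        "    mode http",
        "    option http-tunnel",
        "    option httpchk HEAD /",
        "    http-check expect status 200",
        f"    server {host} {ip}:{port} check",
    ])


if __name__ == "__main__":
    # Host name and IP address are the sample values used in this topic.
    for rule in opc_frontend_rules("service2.myserver.com"):
        print(rule)
    print()
    print(opc_backend_section("service2.myserver.com", "11.22.111.106"))
```

The output is intended only as a cross-check. The files written by the installation program remain the source for the actual configuration.
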

Example

The following example demonstrates an HAProxy configuration file for using the load balancer with two server nodes that have the On-Premises Connector enabled, service1.myserver.com and service2.myserver.com. To create this file, the required sections were copied from the generated configuration file for service2.myserver.com into the generated file for service1.myserver.com. Copied sections are indicated with comments.

global
        log 127.0.0.1 local0
        chroot /var/lib/haproxy
		
        daemon

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5s
        timeout client  15m
        timeout server  15m

##############################################################################
# Configuration for OPC with load balancer.
##############################################################################
frontend lb_opc_nodes
    bind *:80
     #Replace /common/hdpsmoke/shared/redist/ddcloud.pem with the location of the
     #load balancer's SSL certificate
     bind *:443 ssl crt /common/hdpsmoke/shared/redist/ddcloud.pem
	
     #In production, port 80 should be permanently redirected to 443 by uncommenting the
     #following line
     #redirect scheme https code 301 if !{ ssl_fc }
	
    mode http
    default_backend hdp_default_backend
	
     #Define rules for HDP Notification Servers
     acl is_hdp_notification2 path_end -i /connect/X_DataDirect_Notification_Server
     use_backend hdp_notification_pool if is_hdp_notification2
	
     acl is_hdp_notification hdr(X-DataDirect-OPC-Host) -i X_DataDirect_Notification_Server
     use_backend hdp_notification_pool if is_hdp_notification
	
     #Rules for on-premises connection to service1.myserver.com
     acl is_url_opa_service1_myserver_com_40501 path_end -i /connect/opa_service1_myserver_com_40501
     use_backend opa_service1_myserver_com_40501 if is_url_opa_service1_myserver_com_40501	

     acl is_hdr_opa_service1_myserver_com_40501 hdr(X-DataDirect-OPC-Host) -i opa_service1_myserver_com_40501
     use_backend opa_service1_myserver_com_40501 if is_hdr_opa_service1_myserver_com_40501
	
     #Rules for on-premises connection to service2.myserver.com. These rules were copied
     #from the service2.myserver.com configuration file.
     acl is_url_opa_service2_myserver_com_40501 path_end -i /connect/opa_service2_myserver_com_40501
     use_backend opa_service2_myserver_com_40501 if is_url_opa_service2_myserver_com_40501	

     acl is_hdr_opa_service2_myserver_com_40501 hdr(X-DataDirect-OPC-Host) -i opa_service2_myserver_com_40501
     use_backend opa_service2_myserver_com_40501 if is_hdr_opa_service2_myserver_com_40501

backend hdp_notification_pool
     mode http
     option http-tunnel
     balance roundrobin
     option httpchk HEAD /
     http-check expect status 200
	
     #HDP Notification Server Definitions
     server service1.myserver.com 11.22.111.105:11280 check
     #The following server argument was copied from the service2.myserver.com 
     #configuration file
     server service2.myserver.com 11.22.111.106:11280 check

	
backend opa_service1_myserver_com_40501
     mode http
     option http-tunnel
     option httpchk HEAD /
     http-check expect status 200
     server service1.myserver.com 11.22.111.105:40501 check
	
#The following section was copied from the service2.myserver.com configuration file.
backend opa_service2_myserver_com_40501
     mode http
     option http-tunnel
     option httpchk HEAD /
     http-check expect status 200
     server service2.myserver.com 11.22.111.106:40501 check

backend hdp_default_backend
     mode http
     balance roundrobin
     option httpchk HEAD /api/healthcheck
     http-check expect status 200
     cookie HDP_SESSION insert nocache
	
     #HDP Server Definitions
     server service1.myserver.com 11.22.111.105:8080 check cookie service1.myserver.com
     #The following server argument was copied from the service2.myserver.com 
     #configuration file
     server service2.myserver.com 11.22.111.106:8080 check cookie service2.myserver.com
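
When sections are copied from one generated file into another, a use_backend rule can end up in the merged file without its backend section. The following Python sketch, an illustration only, lists backend names that are referenced by use_backend or default_backend but never defined. The function name undefined_backends is hypothetical, and the parser is deliberately simple: it treats everything after # as a comment and does not handle quoting.

```python
# Illustrative sketch only: a simplified check for a merged HAProxy file.
# The function name is hypothetical and the parsing ignores quoting rules.

def undefined_backends(config_text):
    """Return the set of backend names that are referenced by
    use_backend or default_backend but have no matching section."""
    defined = set()
    referenced = set()
    for raw in config_text.splitlines():
        # Drop comments and surrounding whitespace (simplification).
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        words = line.split()
        if len(words) < 2:
            continue
        if words[0] in ("backend", "listen"):
            defined.add(words[1])
        elif words[0] in ("use_backend", "default_backend"):
            referenced.add(words[1])
    return referenced - defined


# A shortened merged file in which the service2 backend section
# was not copied over.
SAMPLE = """\
frontend lb_opc_nodes
    default_backend hdp_default_backend
    use_backend opa_service1_myserver_com_40501 if is_url_opa_service1_myserver_com_40501
    use_backend opa_service2_myserver_com_40501 if is_url_opa_service2_myserver_com_40501

backend hdp_default_backend
    mode http

backend opa_service1_myserver_com_40501
    mode http
"""

if __name__ == "__main__":
    for name in sorted(undefined_backends(SAMPLE)):
        print("missing backend section:", name)
```

A check like this only catches missing sections. HAProxy's own check mode (haproxy -c -f followed by the path to the merged file) validates the full syntax and is the more reliable test before the file is deployed.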