How to Set Up a High-Availability Load Balancer with ‘HAProxy’ to Control Web Server Traffic

HAProxy stands for High Availability Proxy. It is a free and open source application written in C. HAProxy is used as a TCP/HTTP load balancer and proxy. Its most common use is to distribute workload across multiple servers (web servers, database servers, etc.), improving the overall performance and reliability of the server environment.

This fast and efficient application is used by many well-known organizations, including Twitter, Reddit, GitHub and Amazon. It is available for Linux, BSD, Solaris and AIX.

Install HAProxy Load Balancer in Linux

In this tutorial, we will set up a high availability load balancer using HAProxy to control the traffic of HTTP-based applications (web servers) by distributing requests across multiple servers.

For this article, we’re using the most recent stable release of HAProxy, 1.5.10, released on December 31st, 2014. We’re using CentOS 6.5 for this setup, but the instructions below also work on other CentOS/RHEL/Fedora and Ubuntu/Debian distributions.

My Environment Setup

Here, our HAProxy load balancer server has the hostname websrv.tecmintlocal.com and the IP address 192.168.0.125.

HAProxy Server Setup
Operating System : CentOS 6.5
IP Address       : 192.168.0.125
Hostname         : websrv.tecmintlocal.com

Client Web Servers Setup

The other four machines are up and running with web servers such as Apache.

Web Server #1 :	CentOS 6.5 [IP: 192.168.0.121] - [hostname: web1srv.tecmintlocal.com]
Web Server #2 :	CentOS 6.5 [IP: 192.168.0.122] - [hostname: web2srv.tecmintlocal.com]
Web Server #3 :	CentOS 6.5 [IP: 192.168.0.123] - [hostname: web3srv.tecmintlocal.com]
Web Server #4 :	CentOS 6.5 [IP: 192.168.0.124] - [hostname: web4srv.tecmintlocal.com]

Step 1: Installing Apache on Client Machines

1. First, we have to install Apache on all four servers and host a site on each of them. To install Apache on all four servers, use the following command.

# yum install httpd		[On RedHat based Systems]
# apt-get install apache2	[On Debian based Systems]
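
To tell the four servers apart later when testing the load balancer, it can help to give each one a distinct test page. The sketch below is only an illustration: make_test_page is a hypothetical helper name, and /var/www/html is assumed to be the document root (the CentOS 6 default; Debian-based systems may use a different path).

```shell
#!/bin/sh
# Hypothetical helper: write a small test page that names the server it
# came from, so each backend can be told apart in a browser.
# $1 = document root (assumed default: /var/www/html, as on CentOS 6)
make_test_page() {
    docroot="${1:-/var/www/html}"
    # uname -n prints this machine's network node hostname
    printf '<h1>Served by %s</h1>\n' "$(uname -n)" > "$docroot/index.html"
}
```

On each web server, start Apache first (for example with service httpd start on CentOS 6, or service apache2 start on Debian-based systems), then call make_test_page as root.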

2. After installing the Apache web server on all four client machines, you can verify that Apache is running on any of them by accessing its IP address in a browser.

http://192.168.0.121
Check Apache Status
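
If you prefer the command line to a browser, the same check can be sketched with curl. This is a minimal example, assuming curl is installed on the machine you run it from; check_web_servers is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP status code each web server returns.
# Arguments: one or more IP addresses or hostnames.
check_web_servers() {
    for host in "$@"; do
        # -s silent, -o /dev/null discards the body, -w prints only the
        # status code, --max-time gives up after 5 seconds.
        # curl reports 000 when it gets no HTTP response at all.
        code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "http://$host/")
        echo "$host -> HTTP $code"
    done
}
```

For this setup it would be called as check_web_servers 192.168.0.121 192.168.0.122 192.168.0.123 192.168.0.124. Any HTTP status other than 000 means the server answered; as far as I know, the stock CentOS welcome page is served with a 403 status until you add your own index.html.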

Step 2: Installing HAProxy Server

3. On most modern Linux distributions, HAProxy can be easily installed from the default base repository using the default package manager, yum or apt-get.

For example, to install HAProxy on RHEL/CentOS/Fedora and Debian/Ubuntu, run the following command. Here I’ve included the OpenSSL development package (openssl-devel) too, because we’re going to set up HAProxy with SSL and non-SSL support.

# yum install haproxy openssl-devel	[On RedHat based Systems]
# apt-get install haproxy		[On Debian based Systems]

Note: On Debian Wheezy 7.0, we need to enable the backports repository by adding a new file, backports.list, under the “/etc/apt/sources.list.d/” directory with the following content.

# echo "deb http://cdn.debian.net/debian wheezy-backports main" >> /etc/apt/sources.list.d/backports.list

Next, update the repository database and install HAProxy.

# apt-get update
# apt-get install haproxy -t wheezy-backports
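
After installation, it is worth confirming which version the package manager actually installed. The sketch below is an assumption-laden example: haproxy_version is a hypothetical helper, and it assumes the first line printed by haproxy -v has the version number as its third field (as in "HA-Proxy version 1.5.10 2014/12/31").

```shell
#!/bin/sh
# Hypothetical helper: print the installed HAProxy version number,
# or a notice when the binary is not on PATH.
haproxy_version() {
    if ! command -v haproxy >/dev/null 2>&1; then
        echo "haproxy is not installed"
        return 1
    fi
    # Assumes the first line looks like "HA-Proxy version 1.5.10 2014/12/31"
    haproxy -v | awk 'NR == 1 { print $3 }'
}
```

Running haproxy_version on the load balancer should print a 1.5.x version number if the steps above were followed on a system whose repository carries that release.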

Step 3: Configure HAProxy Logs

4. Next, we need to enable the logging feature in HAProxy for future debugging. Open the main HAProxy configuration file ‘/etc/haproxy/haproxy.cfg’ with your choice of editor.

# vim /etc/haproxy/haproxy.cfg

Next, follow the distro-specific instructions to configure the logging feature in HAProxy.

On RHEL/CentOS/Fedora

Under #Global settings, enable the following line.

log         127.0.0.1 local2

On Ubuntu/Debian

Under #Global settings, replace the following lines,

log /dev/log        local0
log /dev/log        local1 notice 

With,

log         127.0.0.1 local2
Enable HAProxy Logging
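
The edit above can also be scripted. The following is a rough sketch, not the method used in this article: set_haproxy_log is a hypothetical helper, and it assumes the global section starts with a line containing only the word global and that every other section keyword starts at column 0, as in the stock configuration files.

```shell
#!/bin/sh
# Hypothetical helper: make "log 127.0.0.1 local2" the only log target in
# the global section of an HAProxy config. A backup is kept as <file>.bak.
set_haproxy_log() {
    cfg="$1"
    cp "$cfg" "$cfg.bak" || return 1
    awk '
        # Start of the global section: emit the new log line right after it
        /^global[ \t]*$/ { print; print "    log         127.0.0.1 local2"; in_global = 1; next }
        # Any other keyword at column 0 (defaults, frontend, ...) ends it
        /^[a-z]/ { in_global = 0 }
        # Inside global, drop existing log lines, commented out or not
        in_global && /^[ \t]*#?[ \t]*log[ \t]/ { next }
        { print }
    ' "$cfg.bak" > "$cfg"
}
```

As root, this would be run as set_haproxy_log /etc/haproxy/haproxy.cfg, followed by haproxy -c -f /etc/haproxy/haproxy.cfg to check that the file still parses.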

5. Next, we need to enable UDP syslog reception in the ‘/etc/rsyslog.conf’ configuration file so that HAProxy logs can be written to a separate file under the /var/log directory. Open your ‘rsyslog.conf’ file with your choice of editor.

# vim /etc/rsyslog.conf

Uncomment the ModLoad and UDPServerRun lines. Here, our server will listen on port 514 to collect the logs into syslog.

# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514
Configure HAProxy Logging
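
If you would rather not edit the file by hand, the two lines can be uncommented with sed. This is a minimal sketch: enable_udp_syslog is a hypothetical helper, and it assumes the file uses the legacy "$Directive" syntax with the lines commented out exactly as shown above. Newer rsyslog packages may ship a different syntax, in which case this does nothing.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the two UDP reception lines in an
# rsyslog.conf that uses the legacy "$Directive" syntax (as on CentOS 6).
# $1 = path to the config file (default: /etc/rsyslog.conf)
enable_udp_syslog() {
    conf="${1:-/etc/rsyslog.conf}"
    # -i.bak edits the file in place and keeps the original as <file>.bak
    sed -i.bak \
        -e 's/^#[ ]*\$ModLoad imudp/$ModLoad imudp/' \
        -e 's/^#[ ]*\$UDPServerRun 514/$UDPServerRun 514/' \
        "$conf"
}
```

Run as root, enable_udp_syslog with no argument edits /etc/rsyslog.conf and leaves the original in /etc/rsyslog.conf.bak.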

6. Next, we need to create a separate file, ‘haproxy.conf’, under the ‘/etc/rsyslog.d/’ directory to configure separate log files.

# vim /etc/rsyslog.d/haproxy.conf

Append the following line to the newly created file.

local2.*	/var/log/haproxy.log
HAProxy Logs

Finally, restart the rsyslog service to apply the changes.

# service rsyslog restart 
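
To confirm that the new rule works even before HAProxy sends any traffic, you can write a test message to the local2 facility and look for it in the log file. The sketch below assumes the rule from step 6 is in place; check_haproxy_log is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: send one test message to facility local2 and
# report whether it reached the HAProxy log file.
# $1 = log file to inspect (default: /var/log/haproxy.log)
check_haproxy_log() {
    logfile="${1:-/var/log/haproxy.log}"
    marker="haproxy-log-test-$$"
    logger -p local2.info "$marker"
    sleep 1   # give rsyslog a moment to write the line
    if grep -q "$marker" "$logfile" 2>/dev/null; then
        echo "OK: local2 messages reach $logfile"
    else
        echo "FAIL: no test message in $logfile"
        return 1
    fi
}
```

Note that logger normally writes to the local syslog socket rather than to UDP port 514, so this only verifies the local2 rule and the log file, not the UDP listener.
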
Babin Lonston
I've been working as a system administrator for the last 10 years, with 4 years of experience with Linux distributions, and I fell in love with text-based operating systems.

29 thoughts on “How to Setup High-Availability Load Balancer with ‘HAProxy’ to Control Web Server Traffic”

  1. Nice tutorial. I have a request for the next HAProxy tutorial: how to force a redirect when a visitor accesses an http URL so that they land on the https URL, how to configure HAProxy more securely, and how to tune it.

    Lastly, how to upgrade to the latest version of HAProxy on CentOS 7, which ships version 1.5. Thanks in advance.

    • @Vincent,

      You just need to have a valid DNS entry.

      The A, AAAA and PTR records for your hostname/IP should be created on the DNS server.

      Thanks & Regards,
      Babin Lonston

  2. The article is really good and easy to understand. I have a question: when testing whether the load balancer works, why do we access the load balancer's IP address, given that we created the index.html files on the four web servers?

    Shouldn't the test be that, when we access the web server by name, the log shows which of the four web servers is serving the request?

    • @Vikas,

      Thanks for finding this article useful. I am glad that you now understand the HAProxy workflow. Keep visiting Tecmint for more such quality articles.

  3. Hi Babin,

    It is a good article, but I have some questions about it.

    1- What is the difference between tcp and http mode in HAProxy? I am confused about this.

    2- You describe this setup as high availability with HAProxy, but the architecture does not seem highly available to me: you are using a single load balancer, so if it goes down the whole architecture fails, because all the web servers sit behind that one load balancer.

    I am not criticizing; the article is good, but I was expecting a little more from you.

    Thanks,
    Ashish

  4. Hi, nice tutorial. I just have one question:
    Where are you installing SSL, on the load balancer or on the web servers?
    And if on the load balancer, why do you need to install apache2?
    I would appreciate this information.

  5. Great article!!! I like it very much, but I have questions…

    Can I configure LBS in another section without redirecting http to https? And…

    I have different subdomains, with certificates on the web servers… Currently I use heartbeat + ipvsadmin for load balancing and failover.

    Can I configure HAProxy to use all the certificates of my websites?

    Example sites:

    example.es

    example.com

    example.net

    example.br

    etc…

    • I put this configuration in haproxy.cfg and it works very well over http and https… Could it be improved, for example by caching SSL connections for users?

      #---------------------------------------------------------------------
      # GLOBAL SETTINGS
      #---------------------------------------------------------------------
      global
          log 127.0.0.1 local2
          tune.ssl.default-dh-param 2048
          chroot /var/lib/haproxy
          user haproxy
          group haproxy
          daemon
          stats socket /var/run/haproxy.sock mode 600

      #---------------------------------------------------------------------
      defaults
          mode http
          log global
          option dontlognull
          option http-server-close
          option redispatch
          retries 3
          timeout http-request 20
          timeout queue 86400
          timeout connect 86400
          timeout client 86400
          timeout server 86400
          timeout http-keep-alive 30
          timeout check 20
          maxconn 50000

      frontend LB
          bind *:80
          mode tcp
          default_backend LB

      frontend LBS
          bind *:443
          mode tcp
          reqadd X-Forwarded-Proto:\ https
          default_backend LBS

      backend LB
          mode tcp
          balance roundrobin # Load balancing will work in round-robin process.
          server web1 192.168.0.203:80 check # backend server.
          server web2 192.168.0.204:80 check # backend server.
          # server web4-srv 192.168.0.124:80 check backup # backup fail-over Server, If three of the above fails this will be activated.

      backend LBS
          mode tcp
          stick-table type ip size 200k expire 30m
          stick on src
          default-server inter 1s
          balance roundrobin # Load balancing will work in round-robin process.
          #option ssl-hello-chk
          server web1 192.168.0.203:443 check # backend server.
          server web2 192.168.0.204:443 check # backend server.
          # server web4-srv 192.168.0.124:80 check backup # backup fail-over Server, If three of the above fails this will be activated.

  6. What if the load balancer itself becomes unavailable (power outage, connection outage, etc.)? Won't all of the backend servers become unavailable as well? What would you recommend for such a situation?

    • You should have a failover server, that is, an active and a passive load balancer with the same configuration. Just install Heartbeat so that once the active one goes down, the passive one takes over.

  7. I have been using HAProxy for several years for our intranet, and have just recently converted half to ssl, but not able to get the forwarding correct. Your tutorial here shows how to add ssl to the load balancer, but will the ssl need to be installed on the webservers as well? And will the certificate have to be the same on all servers?

    • No, we only have to install SSL on the LB. There are two types of SSL configuration in HAProxy: SSL termination and SSL pass-through.

  8. How does data replicate between the web servers? Our users upload pictures, which are stored in a specific location on the web server. If a user's connection lands on a different web server, those files are not found. How can this be managed?

    • @Suman, we are using 7 servers for our deployment, with NFS for file storage. I have no idea about your issue yet; I will look into it.

    • You can also set up an SSHFS connection and mount it as a network drive on your servers. When a file is placed on that drive, it is visible to all of the other servers that mount the same location.

    • @snehal HAProxy’s website shows that it can support over a million connections a second with sessions. “Load” is determined by what your site is hosting and how many connections are being served, and also by whether the site is dynamic or static. My use is a multi-master LAMP cluster. From an independent computer, I load a PHP script that writes to the databases as fast as it can execute. Each loop it goes through has its own connection, insert, and disconnect. At 10,000 loops, the timestamps on the records between the first and the last are measured in milliseconds and are evenly split between the masters. Now, when I access the site directly through the load-balanced virtual IP (which it is), the connection to the database is a one-to-one connection, meaning that it connects to only one server for the data, even through a load balancer. This is the same for static pages. The load balancer spreads the load across multiple machines so that the SPOF (Single Point of Failure) can be mitigated and the “load” is offset to other machines.
