HAProxy stands for High Availability Proxy. It is a free and open source application written in C. HAProxy is used as a TCP/HTTP load balancer and as a proxy solution. Its most common use is to distribute the workload across multiple servers (e.g., web servers or database servers), thus improving the overall performance and reliability of the server environment.
This fast and highly efficient application is used by many well-known organizations, including but not limited to Twitter, Reddit, GitHub and Amazon. It is available for the Linux, BSD, Solaris and AIX platforms.
In this tutorial, we will discuss the process of setting up a high availability load balancer using HAProxy to control the traffic of HTTP-based applications (web servers) by distributing requests across multiple servers.
For this article, we’re using the most recent stable release of HAProxy at the time of writing, version 1.5.10, released on December 31st, 2014. We’re using CentOS 6.5 for this setup, but the instructions given below also work on other CentOS/RHEL/Fedora and Ubuntu/Debian distributions.
My Environment Setup
Here, our HAProxy load balancer server has the hostname websrv.tecmintlocal.com and the IP address 192.168.0.125.
HAProxy Server Setup
Operating System : CentOS 6.5
IP Address       : 192.168.0.125
Hostname         : websrv.tecmintlocal.com
Client Web Servers Setup
The other four machines are up and running with web servers such as Apache.
Web Server #1 : CentOS 6.5 [IP: 192.168.0.121] - [hostname: web1srv.tecmintlocal.com]
Web Server #2 : CentOS 6.5 [IP: 192.168.0.122] - [hostname: web2srv.tecmintlocal.com]
Web Server #3 : CentOS 6.5 [IP: 192.168.0.123] - [hostname: web3srv.tecmintlocal.com]
Web Server #4 : CentOS 6.5 [IP: 192.168.0.124] - [hostname: web4srv.tecmintlocal.com]
Step 1: Installing Apache on Client Machines
1. First, we have to install Apache on all four servers and host the same site on each of them. To install Apache, run the following command on all four servers.
# yum install httpd           [On RedHat based Systems]
# apt-get install apache2     [On Debian based Systems]
2. After installing the Apache web server on all four client machines, you can verify that Apache is running on any one of them by accessing its IP address in a browser.
http://192.168.0.121
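If you would rather check all four web servers from a terminal than open each one in a browser, something like the following sketch could be used. It assumes curl is installed on the machine you run it from; the function names http_status and check_servers are made up for this illustration.

```shell
#!/bin/sh
# Print the HTTP status code returned by one web server.
# curl reports 000 when the server cannot be reached.
http_status() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "http://$1/"
}

# Print one "IP -> HTTP code" line for every address given as an argument.
check_servers() {
    for ip in "$@"; do
        echo "$ip -> HTTP $(http_status "$ip")"
    done
}

# Example call for the four web servers used in this article:
# check_servers 192.168.0.121 192.168.0.122 192.168.0.123 192.168.0.124
```

Any real HTTP status code means Apache answered. A fresh CentOS install without an index.html will typically answer 403 together with the Apache test page, which still shows that the service is up.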
Step 2: Installing HAProxy Server
3. On most modern Linux distributions, HAProxy can be easily installed from the default base repository using the default package manager, yum or apt-get.
For example, to install HAProxy on RHEL/CentOS/Fedora and Debian/Ubuntu, run the following command. Here I’ve included the openssl-devel package too, because we’re going to set up HAProxy with SSL and non-SSL support.
# yum install haproxy openssl-devel    [On RedHat based Systems]
# apt-get install haproxy              [On Debian based Systems]
Note: On Debian Wheezy 7.0, we need to enable the backports repository by adding a new file, backports.list, under the “/etc/apt/sources.list.d/” directory with the following content.
# echo "deb http://cdn.debian.net/debian wheezy-backports main" >> /etc/apt/sources.list.d/backports.list
Next, update the repository database and install HAProxy.
# apt-get update
# apt-get install haproxy -t wheezy-backports
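After installation, it can be useful to confirm which HAProxy version the package manager actually delivered, since repositories differ. The small helper below is only a sketch: the function name haproxy_version is made up, and it assumes the first line of `haproxy -v` has the form "HA-Proxy version 1.5.10 2014/12/31", as it does on the 1.5 series.

```shell
# Print the version number from the first line of `haproxy -v`.
# Assumes the third word on that line is the version string.
haproxy_version() {
    haproxy -v | awk 'NR == 1 { print $3 }'
}
```

Calling `haproxy_version` on the load balancer should then print just the version string.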
Step 3: Configure HAProxy Logs
4. Next, we need to enable the logging feature in HAProxy for future debugging. Open the main HAProxy configuration file ‘/etc/haproxy/haproxy.cfg’ with your choice of editor.
# vim /etc/haproxy/haproxy.cfg
Next, follow the distro-specific instructions to configure the logging feature in HAProxy.
On RHEL/CentOS/Fedora
Under #Global settings, enable the following line.
log 127.0.0.1 local2
On Ubuntu/Debian
Under #Global settings, replace the following lines,
log /dev/log local0
log /dev/log local1 notice
With,
log 127.0.0.1 local2
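Once the file is saved, a quick check that the logging line is really active (and not still commented out) can save some debugging later. The following is a minimal sketch; the function name has_local2_log is hypothetical, and the default path is the one used in this article.

```shell
# Succeed if the config contains an active (uncommented)
# "log 127.0.0.1 local2" line; fail otherwise.
has_local2_log() {
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    grep -Eq '^[[:space:]]*log[[:space:]]+127\.0\.0\.1[[:space:]]+local2([[:space:]]|$)' "$cfg"
}

# Example:
# has_local2_log && echo "logging line is enabled"
```

This only looks for the line itself. To check the syntax of the whole file, HAProxy's own check mode can be used: `haproxy -c -f /etc/haproxy/haproxy.cfg`.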
5. Next, we need to enable UDP syslog reception in the ‘/etc/rsyslog.conf’ configuration file, so that HAProxy logs can be written to a separate log file under the /var/log directory. Open your ‘rsyslog.conf’ file with your choice of editor.
# vim /etc/rsyslog.conf
Uncomment the ModLoad and UDPServerRun lines. Our server will then listen on UDP port 514 to collect the logs into syslog.
# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514
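If you prefer not to edit the file by hand, the same change can be scripted. The sketch below assumes the two directives appear in rsyslog.conf commented out with a leading "#", as in the stock file on this setup; the function name enable_udp_syslog is made up for illustration. It keeps a copy of the original file with a .bak suffix.

```shell
# Uncomment the two UDP reception directives in rsyslog.conf.
# The original file is first copied to <file>.bak as a backup.
enable_udp_syslog() {
    conf="${1:-/etc/rsyslog.conf}"
    cp "$conf" "$conf.bak" || return 1
    sed -e 's/^#[[:space:]]*\(\$ModLoad imudp\)/\1/' \
        -e 's/^#[[:space:]]*\(\$UDPServerRun 514\)/\1/' \
        "$conf.bak" > "$conf"
}

# Example (as root):
# enable_udp_syslog /etc/rsyslog.conf
```

Ordinary comment lines such as "# Provides UDP syslog reception" are left untouched, because only the two named directives are matched.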
6. Next, we need to create a separate file ‘haproxy.conf’ under the ‘/etc/rsyslog.d/’ directory to configure separate log files.
# vim /etc/rsyslog.d/haproxy.conf
Append the following line to the newly created file.
local2.* /var/log/haproxy.log
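As a non-interactive alternative to vim, the rule file could also be written from a script. This is a sketch only; the function name write_haproxy_rsyslog_rule is hypothetical. Note that it overwrites any existing haproxy.conf in the target directory instead of appending to it.

```shell
# Write the HAProxy rule file into the given rsyslog drop-in directory
# (default: /etc/rsyslog.d). Overwrites an existing haproxy.conf.
write_haproxy_rsyslog_rule() {
    dir="${1:-/etc/rsyslog.d}"
    printf '%s\n' 'local2.*    /var/log/haproxy.log' > "$dir/haproxy.conf"
}

# Example (as root):
# write_haproxy_rsyslog_rule
```

The rule sends every message on the local2 facility, at any priority, to /var/log/haproxy.log.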
Finally, restart the rsyslog service to apply the new changes.
# service rsyslog restart
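To confirm that the local2 rule works before HAProxy itself produces any traffic, a test message can be sent with the standard logger utility. The sketch below uses a made-up function name, test_haproxy_logging, and sends the message through the local syslog socket, so it exercises the local2 rule but not the UDP listener on port 514.

```shell
# Send a uniquely tagged message on the local2 facility and check
# that it shows up in the HAProxy log file.
test_haproxy_logging() {
    log_file="${1:-/var/log/haproxy.log}"
    marker="haproxy-log-test-$$"
    logger -p local2.info "$marker"
    sleep 1   # give rsyslog a moment to write the message
    grep -qF "$marker" "$log_file"
}

# Example (as root):
# test_haproxy_logging && echo "local2 messages reach /var/log/haproxy.log"
```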
Hi Babin Lonston
Instead of Apache, can we use Apache Tomcat on the client machines for web application load balancing?
What should the www “A” record for my website be: the HAProxy IP, or one of the 4 web server IPs? I am a little bit confused.
Nice tutorial. I have a request for the next tutorial about HAProxy: how to force a redirect to the https URL when a visitor tries to access the http URL, how to configure HAProxy to be more secure, and how to tune it.
Lastly, how do I upgrade to the latest version of HAProxy on CentOS 7, which ships version 1.5? Thanks in advance.
Nice document, can you share best practices for HAProxy on Linux.
How do I install the latest version of HAProxy? Is it possible?
How do I access the load balancer using a URL instead of an IP?
@Vincent,
You just need to have a valid DNS entry.
The A, AAAA and PTR records for your hostname/IP should be created on the DNS server.
Thanks & Regards,
Bobin Lonston
Can we add another HAProxy server to make it a distributed system?
This is not a high availability setup. The LB doesn’t have a failover.
What happens if the HAProxy server stops responding to ping at some point?
The article is really good and I learned quite a lot from it. I have a question: when testing whether the load balancer is working, why do we access the load balancer IP address, when the index.html files were created on the 4 web servers?
Shouldn’t the test be that, when we access the web server by name, we can see from the log which of the 4 web servers is serving the request?
This is an amazing article. It works for me and also helped me understand how HAProxy works.
Thanks a ton !!
@Vikas,
Thanks for finding this article useful. I am glad that you now understand the HAProxy workflow. Keep visiting Tecmint for more such quality articles.
Hi Babin ,
It is a good article, but I have some questions about it.
1- What is the difference between tcp and http mode in HAProxy? I am confused on this topic.
2- You describe this article as high availability with HAProxy, but in my view this architecture does not seem highly available, because you are using a single load balancer. If it goes down, the whole architecture fails, since the web servers are all reached through that single load balancer.
I am not criticizing; the article is good, but I was expecting a little more from you.
thanks ,
Ashish
Hi, nice tutorial. I just have one question:
Where are you installing SSL, on the load balancer or on the web servers?
And if it is on the load balancer, why do you need to install apache2?
Please, I need this information.
Great article! I like it very much, but I have some questions.
Can I configure LBS in another section without redirecting http to https?
Also, I have different subdomains and certificates on the web servers. Currently I use heartbeat + ipvsadm for load balancing and failover.
Can I configure HAProxy to use all the certificates for my websites?
Example sites:
example.es
example.com
example.net
example.br
etc…
I put this configuration in haproxy.cfg and it works very well over http and https. Could it be improved, for example by caching SSL connections for users?
----------------------
# GLOBAL SETTINGS
global
    log 127.0.0.1 local2
    tune.ssl.default-dh-param 2048
    chroot /var/lib/haproxy
    user haproxy
    group haproxy
    daemon
    stats socket /var/run/haproxy.sock mode 600
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option dontlognull
    option http-server-close
    option redispatch
    retries 3
    timeout http-request 20
    timeout queue 86400
    timeout connect 86400
    timeout client 86400
    timeout server 86400
    timeout http-keep-alive 30
    timeout check 20
    maxconn 50000
frontend LB
    bind *:80
    mode tcp
    default_backend LB
frontend LBS
    bind *:443
    mode tcp
    reqadd X-Forwarded-Proto:\ https
    default_backend LBS
backend LB
    mode tcp
    balance roundrobin # Load balancing will work in round-robin process.
    server web1 192.168.0.203:80 check # backend server.
    server web2 192.168.0.204:80 check # backend server.
    # server web4-srv 192.168.0.124:80 check backup # backup fail-over Server, If three of the above fails this will be activated.
backend LBS
    mode tcp
    stick-table type ip size 200k expire 30m
    stick on src
    default-server inter 1s
    balance roundrobin # Load balancing will work in round-robin process.
    #option ssl-hello-chk
    server web1 192.168.0.203:443 check # backend server.
    server web2 192.168.0.204:443 check # backend server.
    # server web4-srv 192.168.0.124:80 check backup # backup fail-over Server, If three of the above fails this will be activated.
Very nice!
Thx
What if the load balancer itself becomes unavailable (power outage, connection outage, etc.)? Won’t all of the backend servers become unavailable as well? What would you recommend for such a situation?
You should have a failover server, i.e. an active and a passive load balancer with the same configuration. Just install Heartbeat, so that once the active one is down the passive one will take over.
I have been using HAProxy for several years for our intranet, and have just recently converted half to ssl, but not able to get the forwarding correct. Your tutorial here shows how to add ssl to the load balancer, but will the ssl need to be installed on the webservers as well? And will the certificate have to be the same on all servers?
No, we only have to install SSL on the LB. There are two types of SSL configuration in HAProxy: SSL termination and SSL pass-through.
How does data replicate between the web servers? We have users who upload pictures, which are stored in a specific location on the web server. If a user’s connection lands on a different web server, those files are not found. How can this be managed?
@Suman, we are using 7 servers for our deployment, with NFS for file storage. I have no idea about your issue yet, but I will look into it.
You can also set up an SSHFS connection (which works a bit like Dropbox): mount the same remote directory as a virtual drive on each of your servers, and a file placed on that drive becomes available on all of the other servers that mount it.
Do we have performance stats for HAProxy? What are the requirements if a load test needs to be done using this load balancer?
@snehal, HAProxy’s website shows that it can support over a million connections a second with sessions. “Load” is determined by what your site is hosting and how many connections are being served. It also depends on whether the site is dynamic or static. My setup is a multi-master LAMP cluster. From an independent computer, I load a PHP script that writes to the databases as fast as it can execute. Each loop iteration has its own connect, insert, and disconnect. At 10,000 loops, the timestamps on the first and the last records differ by only milliseconds, and the records are evenly split between the masters. When I access the site directly through the load-balanced virtual IP, the connection to the database is one-to-one, meaning that it connects to only one server for the data, even through a load balancer. The same is true for static pages. The load balancer spreads the load across multiple machines so that the SPOF (Single Point of Failure) can be mitigated and the “load” is offloaded to other machines.