Monitoring Apache Webserver and optimization

Although most web servers share common challenges, this article assumes that Apache is the web server in use.

Scaling up a machine's hardware does not mean the web server will use the extra resources. Most web servers ship with default limits on how much of the machine they can use. For example, Apache is commonly configured with a default limit of 150 concurrent connections.

The configuration always has to be adjusted to match the available resources, and this is the main challenge in vertical scaling.

Monitoring Apache

To find the root cause of load, start with two Apache modules that are either already enabled or can be enabled:

  • Mod Info
  • Mod Status

Each module exposes its information through a handler:

  • <server-ip>/server-info
  • <server-ip>/server-status

Note: check the module configuration, because by default these handlers are only accessible from localhost. You can change the access settings to reach them from another address.

WARNING: these pages must NOT be publicly accessible, as that is a security risk.

Server Status

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from my-ip-address-no-port
</Location>

This page shows the current server status, server load, active requests, and execution times.

TIP: to refresh this page automatically, append the following to the URL:
?refresh=15
This reloads the page every 15 seconds.

Srv    Child Server number – generation
PID    OS process ID
Acc    Number of accesses this connection / this child / this slot
M      Mode of operation
CPU    CPU usage, number of seconds
SS     Seconds since beginning of most recent request
Req    Milliseconds required to process most recent request
Conn   Kilobytes transferred this connection
Child  Megabytes transferred this child
Slot   Total megabytes transferred this slot

Checking this page during load gives you an idea of Apache's state. In the example screenshot, the server still has free workers to process requests.

Once the number of active requests reaches the maximum allowed, further requests queue up and can time out, which clients behind a proxy or load balancer typically see as a 504 error.
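
To watch how close the server is to that limit without reading the HTML page, mod_status also offers a machine-readable view at /server-status?auto. The sketch below is one possible way to summarize it; the function name worker_summary is made up for this example, and it assumes the output contains the BusyWorkers and IdleWorkers lines.

```shell
#!/bin/sh
# Summarize worker usage from mod_status machine-readable output.
# Reads "Key: value" lines (as returned by /server-status?auto) on stdin.
# worker_summary is a hypothetical helper name, not part of Apache.
worker_summary() {
    awk -F': ' '
        $1 == "BusyWorkers" { busy = $2 }
        $1 == "IdleWorkers" { idle = $2 }
        END {
            total = busy + idle
            if (total == 0) { print "no worker data found"; exit 1 }
            # busy_pct is the share of currently spawned workers that are busy
            printf "busy=%d idle=%d busy_pct=%d\n", busy, idle, (busy * 100) / total
        }'
}

# Example (assumes the status handler is reachable from localhost):
# curl -s "http://localhost/server-status?auto" | worker_summary
```

A busy_pct that stays close to 100 during load suggests the worker limit, rather than the hardware, is what requests are waiting on.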

Server Info

<Location /server-info>
    SetHandler server-info
    Order deny,allow
    Deny from all
    Allow from my-ip-address-no-port
</Location>

This contains configuration information about the webserver.

Server Settings
Server Version: Apache/2.4.29 (Ubuntu)
Server Built: 2021-09-28T11:01:16
Server loaded APR Version: 1.6.3
Compiled with APR Version: 1.6.3
Server loaded APU Version: 1.6.1
Compiled with APU Version: 1.6.1
Module Magic Number: 20120211:68
Hostname/port: buc.melimu.com:8001
Timeouts: connection: 300    keep-alive: 5
MPM Name: prefork
MPM Information: Max Daemons: 1500 Threaded: no Forked: yes
Server Architecture: 64-bit
Server Root: /etc/apache2
Config File: /etc/apache2/apache2.conf
Server Built With:   -D APR_HAS_SENDFILE  -D APR_HAS_MMAP  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)  -D APR_USE_SYSVSEM_SERIALIZE  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT  -D APR_HAS_OTHER_CHILD  -D AP_HAVE_RELIABLE_PIPED_LOGS  -D HTTPD_ROOT="/etc/apache2"  -D SUEXEC_BIN="/usr/lib/apache2/suexec"  -D DEFAULT_PIDLOG="/var/run/apache2.pid"  -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"  -D DEFAULT_ERRORLOG="logs/error_log"  -D AP_TYPES_CONFIG_FILE="mime.types"  -D SERVER_CONFIG_FILE="apache2.conf"

The most important pieces of information are:

  • Timeouts
  • MPM name [prefork / event / worker]
  • MPM information

Apart from that, the page lists the loaded modules and their configuration.

Server Logs

The configuration shows where the logs are written. There will be at least two files:

  • error logs: web server related errors
  • access logs: all HTTP requests

To investigate past behavior, logs are the most important thing to analyze. You can search for 504 responses or look for any errors.
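
As a quick first pass over the access log, the following sketch counts responses per status code. It assumes the stock "common" or "combined" log format, where the status code is the 9th whitespace-separated field; a custom LogFormat will need a different field number. The function name status_counts is only illustrative.

```shell
#!/bin/sh
# Count responses per HTTP status code in an access log.
# Assumption: status code is field 9 (stock common/combined LogFormat).
# Reads the files given as arguments, or stdin when none are given.
status_counts() {
    awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' "$@" | sort -k1,1n
}

# Example (log path taken from the GoAccess command in this article):
# status_counts /var/log/httpd/access_log
```

A sudden rise in the 504 count around the time of the incident is a good pointer to where to look next in the error log.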

GoAccess is a handy utility for analyzing access logs and spotting patterns. It runs from the command line, and its report can also be exported to HTML.

The following command parses an Apache access log; the --log-format value must match the format your server writes:

goaccess -f /var/log/httpd/access_log --log-format="%h %l %u %^[ %d:%t %^]  %{ms}T \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""   --date-format=%d/%b/%Y --time-format=%T 

Once you have checked this information, there are two possible outcomes:

  • Everything is fine
  • The configuration needs to be optimized

Optimization

The hardware may be sufficient while Apache is not using it, or the configuration may simply need to be tuned to the hardware allocation.

The following steps can be planned:

  • Load only the required modules and disable the rest; this reduces the memory size of each process
  • Apply disk caching for static files
  • Lower the Timeout value to limit the overhead of long-running scripts, so that Apache does not suffer if a script gets stuck in an endless loop
  • Tune the MPM configuration according to the MPM module in use

Memory utilization

The following script helps you determine the average memory required per process on your server, so that you can calculate the total number of connections it can handle.

http://apache2buddy.pl/ is a Perl script that reports the areas of your server that can be optimized.

To run this script, Perl must be available on the server. You can run the command below directly, or download the script first and then run it:
https://raw.githubusercontent.com/richardforth/apache2buddy/master/apache2buddy.pl

curl -sL https://raw.githubusercontent.com/richardforth/apache2buddy/master/apache2buddy.pl | perl
  • The script must be run as root
  • If you get an error like “The process running on port 80 is not Apache.”, check the output of the command “/usr/sbin/apache2 -V”. The first line of the output should be similar to Server version: Apache/2.4.29

The script prints a report that also lists general improvements that can be made on the server.

If you do not want to run this script on the server, you can use the following commands to view Apache's memory utilization.

On Red Hat based systems:

ps -ylC httpd | awk '{x += $8;y += 1} END {print "Apache Memory Usage (MB): "x/1024; print "Average Process Size (MB): "x/((y-1)*1024)}'

On Debian based systems:

ps -ylC apache2 | awk '{x += $8;y += 1} END {print "Apache Memory Usage (MB): "x/1024; print "Average Process Size (MB): "x/((y-1)*1024)}'
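
With the average process size known, a common rule of thumb is to divide the memory you can dedicate to Apache by that average to get an upper bound on the number of processes. The sketch below shows that arithmetic only; the function name max_workers and the example figures are assumptions, and the memory to reserve for the OS and other services has to be decided per server.

```shell
#!/bin/sh
# Estimate how many Apache processes fit in the memory reserved for Apache.
# Usage: max_workers <available_mb> <avg_process_mb>
# Both arguments are whole megabytes; integer division rounds down.
# max_workers is a hypothetical helper name used for illustration.
max_workers() {
    available_mb=$1
    avg_process_mb=$2
    if [ "$avg_process_mb" -le 0 ]; then
        echo "average process size must be positive" >&2
        return 1
    fi
    echo $(( available_mb / avg_process_mb ))
}

# Example with assumed figures: 4096 MB reserved for Apache,
# 40 MB average process size
# max_workers 4096 40
```

With prefork, where each process serves one connection, the result maps directly to the number of concurrent connections the server can afford.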

Connection Configuration

To configure connections, you need to know which MPM module is in use and set the directives accordingly.

Prefork vs Worker vs Event

Prefork MPM launches multiple child processes. Each child process handles one connection at a time. Prefork is the traditional default MPM for Apache 2. It always keeps a minimum number of spare processes (MinSpareServers) running, so new requests do not have to wait for a new process to start.

Worker MPM creates multiple child processes, similar to prefork. Each child process runs many threads, and each thread handles one connection at a time. Worker MPM uses less memory than Prefork MPM.

Event MPM became stable in Apache 2.4. It allows more requests to be served simultaneously by passing some processing work off to supporting threads. With this MPM, Apache tries to fix the ‘keep alive problem’ faced by the other MPMs: when a client completes its first request, it can keep the connection open and send further requests over the same socket, which reduces connection overhead, but with the other MPMs each idle kept-alive connection ties up a worker.

Warning: as PHP is not thread-safe, the common suggestion is to install Apache with the “prefork” MPM when PHP runs as an Apache module.

Directives and their meaning

    # Maximum number of running Apache processes
    ServerLimit               60
    # Number of processes to start initially when the Apache daemon starts
    StartServers              5
    # Minimum number of idle threads available to handle request spikes
    MinSpareThreads           25
    # Maximum number of idle threads
    MaxSpareThreads           75
    # Number of threads created by each server process
    ThreadsPerChild           30
    # Connections a process handles during its lifetime; guards against memory leaks (0 = unlimited)
    MaxConnectionsPerChild  2000
  • When Apache starts, it launches 5 child processes, as set by the StartServers parameter.
  • Each process then starts 30 threads, as set by the ThreadsPerChild parameter, so the 5 processes can serve only 150 concurrent connections/clients, i.e. 30×5=150.
  • If more concurrent users arrive, another child process starts and can serve another 30 users. The number of child processes that can be started is controlled by the ServerLimit parameter. In the configuration above there can be 60 child processes in total, each handling 30 threads, for a total of 60×30=1800 concurrent users.

These directives must be set according to the MPM module and the hardware resources; otherwise they can flood the server.
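
The arithmetic in the list above can be checked for any set of values with a small helper. This is only a sketch: mpm_capacity is a made-up name, it covers threaded MPMs (worker/event), and it ignores MaxRequestWorkers, which also caps the total when it is set lower than ServerLimit × ThreadsPerChild.

```shell
#!/bin/sh
# Print the initial and maximum number of concurrent connections for a
# threaded MPM.
# Usage: mpm_capacity <StartServers> <ServerLimit> <ThreadsPerChild>
# Note: MaxRequestWorkers is not considered here (assumption of this sketch).
mpm_capacity() {
    start_servers=$1
    server_limit=$2
    threads_per_child=$3
    echo "initial=$(( start_servers * threads_per_child )) max=$(( server_limit * threads_per_child ))"
}

# Values from the configuration above:
# mpm_capacity 5 60 30
```

Comparing the max figure with the process count that fits in memory shows quickly whether a configuration promises more connections than the hardware can hold.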

Time Out Configuration

These directives also play a vital role in tuning the performance of the web server.

  • MaxKeepAliveRequests: the maximum number of requests allowed on one persistent connection
  • KeepAliveTimeout: how long the server waits for the next request on a persistent connection before closing it

Apart from these:

  • It is often a good idea to run Nginx in front of Apache as a reverse proxy. If you are using SSL, offloading it at the Nginx level brings a further improvement.
  • Use the Apache event MPM (rather than Prefork or Worker)
  • Unless you are doing development work on the server, set ExtendedStatus Off and disable mod_info as well as mod_status.
  • Set DirectoryIndex explicitly to avoid content negotiation. Here’s an example from a production server:
DirectoryIndex index.php index.html index.htm
  • Consider reducing the value of TimeOut to between 30 and 60 seconds.
  • For the Options directive, avoid Options MultiViews, as this performs a directory scan. To reduce disk I/O further, use
Options -Indexes +FollowSymLinks
  • Compression reduces response times by reducing the size of the HTTP response. Install and enable mod_deflate (refer to the documentation or man pages), then add the following to the virtual host config file within the <Directory> section for the root directory (or to the .htaccess file if AllowOverride permits it):
<IfModule mod_deflate.c>
  AddOutputFilterByType DEFLATE text/html text/plain text/xml text/x-js text/javascript text/css application/javascript
</IfModule>

Complete Guide to Debug LOAD Issue in LAMP Stack [Linux/Apache/Mysql/PHP]

Problem Statement

Consider an application built on PHP, Apache, and MySQL, with the stack hosted on Linux.

The application runs on a single web server: PHP and Apache are on one Ubuntu-based Linux server, and MySQL is on another server.

Suppose the application becomes unresponsive beyond a certain load. You now want to perform a root cause analysis (RCA) and remove the bottleneck. The following is a complete guide to debugging load issues in a LAMP stack.

LAMP Stack Component

To diagnose the load issue, we first need to identify the components of the stack where the problem may arise. We divide these into two parts: hardware and software.

Hardware

  1. Linux server hosting apache2
  2. Linux server hosting MySQL

Software

  1. Apache/Nginx webserver
  2. PHP processing [running as module/fpm/fastcgi]
  3. MySQL server

Root Cause Analysis [RCA]

Flow Of Analysis

When a site becomes unresponsive, the first thing to consider is a hardware limitation.

Monitor Hardware Component

Q1: Are the resources enough to handle this many requests?

To answer this question, we analyze the system state. The examples assume Linux [Ubuntu], but most *nix systems behave the same way.

Check the web server machine

  • If possible, set up monitoring of CPU utilization. You can use the htop/top commands to check processes and load
  • Set up monitoring of RAM utilization. The htop/top commands can also be used to check memory usage
  • If the machine becomes unresponsive, check the system logs [syslog], usually at /var/log/syslog
    • tail -f /var/log/syslog
  • Check stalled processes using
    • strace -p <pid> or pmap <pid>
    • ls /proc/<pid>
    • cat /proc/<pid>/status
    • cat /proc/cpuinfo
    • cat /proc/meminfo
    • cat /proc/zoneinfo
    • cat /proc/mounts

htop/top are only helpful if you watch them during the load, as they show the current state of the system. Otherwise, set up continuous monitoring with a dedicated tool.
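
Until a proper monitoring tool is in place, a small sampler run from cron or a loop can at least record the state during a load spike. The sketch below is one possible approach, not a replacement for the tools mentioned in this article; the function names and the log path in the usage comment are hypothetical, and it relies on the MemAvailable field that recent Linux kernels expose in /proc/meminfo.

```shell
#!/bin/sh
# Percentage of memory in use, computed from /proc/meminfo-style input.
# Usage: mem_used_pct [file]   (defaults to /proc/meminfo)
# Assumption: the MemAvailable line is present; without it the result is 100.
mem_used_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total <= 0) { print "MemTotal not found"; exit 1 }
            printf "%d\n", ((total - avail) * 100) / total
        }' "${1:-/proc/meminfo}"
}

# Print one timestamped sample: 1-minute load average and memory in use.
sample_load() {
    load1=$(cut -d' ' -f1 /proc/loadavg)
    echo "$(date '+%Y-%m-%d %H:%M:%S') load1=$load1 mem_used_pct=$(mem_used_pct)"
}

# Example loop (hypothetical log path), one sample per minute:
# while true; do sample_load >> /tmp/load-samples.log; sleep 60; done
```

The resulting log can then be lined up against the Apache access log timestamps to see whether the machine was actually saturated when requests slowed down.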

Various tools are available. Cloud providers offer resource monitoring for their instances, such as CloudWatch in AWS; you can set up an open source tool like Nagios, Zabbix, or OpenNMS; and there are handy utilities like glances, atop, etc. https://www.cyberciti.biz/tips/top-linux-monitoring-tools.html

There are two main factors to analyze: CPU and memory utilization.

Apart from these two, you can also check two more factors: network bandwidth and disk I/O. Most cloud systems provide a generous bandwidth allocation, which can be checked through network monitoring, and most systems are equipped with SSDs, so these are usually not major factors. They can still play a vital role if your application depends heavily on them, for example if it reads and writes Firebase data, which needs higher I/O, or delivers large media files.

Check the database server machine

If the database server is deployed on another machine, follow the same steps as in the previous section to determine whether the database server has enough resources.

If you are using the AWS RDS service for the database, you can use AWS CloudWatch or the RDS monitoring tools, with CPU metrics or connection metrics, to find the root cause.

Most cloud providers offer this type of system/instance/machine monitoring tool, which lets us check hardware dependencies and bottleneck areas.

Now there may be two situations,

  1. Either server resources are NOT enough,
  2. Or server resources are available even at peak load.

Server resources are NOT enough

ACTION

* Scale up / scale out the resources [Vertical Scaling / Horizontal Scaling]

* Validate that the requests are legitimate, i.e. that no bogus requests are flooding your server to keep it busy

If hardware is the bottleneck, we need to upgrade the resources through either vertical or horizontal scaling.
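
Before paying for more hardware, the legitimacy check from the action list can be started from the access log. The sketch below lists the clients with the most requests; it assumes the client address is the first field of each log line (as with %h in the stock LogFormat), and top_clients is only an illustrative name.

```shell
#!/bin/sh
# List the client addresses with the most requests in an access log.
# Usage: top_clients [N] < access_log     (N defaults to 10)
# Assumption: the client address is the first whitespace-separated field.
# Output: one "address count" pair per line, busiest client first.
top_clients() {
    limit=${1:-10}
    awk '{ print $1 }' | sort | uniq -c | sort -k1,1nr -k2,2 | head -n "$limit" | awk '{ print $2, $1 }'
}

# Example (log path is an assumption for a Debian-style install):
# top_clients 5 < /var/log/apache2/access.log
```

A handful of addresses responsible for most of the traffic is worth investigating before scaling, since blocking or rate limiting them may remove the load altogether.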

Server resources are enough

If we reach this point, where the servers are healthy and under their threshold limits, hardware is not the bottleneck and we should look for a software-level bottleneck.

Monitor Software Component

Q2: If the server resources are enough, what might be the root cause of the unresponsive web app?

In a typical PHP website, the software stack includes the web server, PHP processing, and the MySQL server.
