Monitoring and Optimizing the Apache Webserver


Most web servers share the same challenges, but this article assumes that Apache is the webserver in use.

Scaling up the hardware of a machine does not mean that the webserver will automatically use the extra resources. Most web servers ship with default limits on how much of the machine they may use. For example, Apache's stock configuration typically limits simultaneous connections to 150.

The configuration always has to be adjusted to the resources that are available; this is the main challenge in vertical scaling.

Monitoring Apache

To find the root cause of high load, start with the two Apache modules that are either already enabled or can be enabled:

  • mod_info
  • mod_status

Each module exposes its information through a handler:

  • <server-ip>/server-info
  • <server-ip>/server-status

Note: check the module configuration, because by default these handlers are only reachable from localhost. You can change the setting to access them from other addresses.

WARNING: these pages must NOT be publicly accessible, as this is a security risk.

Server Status

<Location /server-status>
    SetHandler server-status
    # Apache 2.4 access control; Apache 2.2 used "Order deny,allow", "Deny from all" and "Allow from ..." instead
    Require ip my-ip-address-no-port
</Location>

This page shows the current server status: server load, active requests, and their execution time.

TIP: to refresh this page automatically, append the following to the URL:
?refresh=15
The page will then reload every 15 seconds.

Srv     Child Server number – generation
PID     OS process ID
Acc     Number of accesses this connection / this child / this slot
M       Mode of operation
CPU     CPU usage, number of seconds
SS      Seconds since beginning of most recent request
Req     Milliseconds required to process most recent request
Conn    Kilobytes transferred this connection
Child   Megabytes transferred this child
Slot    Total megabytes transferred this slot

Checking this page while the server is under load gives you an idea of Apache's state. In the example screenshot, the server was free to process requests.

If the number of active requests equals the maximum number of allowed requests, further requests have to wait and can end in a 504 error.
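mod_status also serves a machine-readable variant of this page when ?auto is appended to the URL, which makes it easy to watch the busy and idle worker counts from a script. The following is a minimal sketch, assuming the output contains the usual BusyWorkers and IdleWorkers lines; the helper name worker_usage and the localhost URL are made up for illustration.

```shell
#!/bin/sh
# worker_usage: read mod_status "?auto" output on stdin and print
# "<busy> <idle>". Hypothetical helper, not part of Apache.
worker_usage() {
    awk -F': ' '
        $1 == "BusyWorkers" { busy = $2 }
        $1 == "IdleWorkers" { idle = $2 }
        END { print busy + 0, idle + 0 }
    '
}

# Typical use on the server itself (the URL is an assumption):
#   curl -s "http://localhost/server-status?auto" | worker_usage
```

When the idle count stays at zero under load, the server has most likely reached its configured worker limit.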

Server Info

<Location /server-info>
    SetHandler server-info
    # Apache 2.4 access control; Apache 2.2 used "Order deny,allow", "Deny from all" and "Allow from ..." instead
    Require ip my-ip-address-no-port
</Location>

This contains configuration information about the webserver.

Server Settings
Server Version: Apache/2.4.29 (Ubuntu)
Server Built: 2021-09-28T11:01:16
Server loaded APR Version: 1.6.3
Compiled with APR Version: 1.6.3
Server loaded APU Version: 1.6.1
Compiled with APU Version: 1.6.1
Module Magic Number: 20120211:68
Hostname/port: buc.melimu.com:8001
Timeouts: connection: 300    keep-alive: 5
MPM Name: prefork
MPM Information: Max Daemons: 1500 Threaded: no Forked: yes
Server Architecture: 64-bit
Server Root: /etc/apache2
Config File: /etc/apache2/apache2.conf
Server Built With:   -D APR_HAS_SENDFILE  -D APR_HAS_MMAP  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)  -D APR_USE_SYSVSEM_SERIALIZE  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT  -D APR_HAS_OTHER_CHILD  -D AP_HAVE_RELIABLE_PIPED_LOGS  -D HTTPD_ROOT="/etc/apache2"  -D SUEXEC_BIN="/usr/lib/apache2/suexec"  -D DEFAULT_PIDLOG="/var/run/apache2.pid"  -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"  -D DEFAULT_ERRORLOG="logs/error_log"  -D AP_TYPES_CONFIG_FILE="mime.types"  -D SERVER_CONFIG_FILE="apache2.conf"

The most important pieces of information are:

  • Timeouts
  • MPM name [prefork / event / worker]
  • MPM information

Apart from that, the page contains a list of modules and their configuration.

Server Logs

The configuration shows where the logs are written. There will be at least two files:

  • error log: errors related to the web server
  • access log: all HTTP requests

To investigate what happened in the past, the logs are the most important thing to analyze. You can search for requests that returned 504, or look for errors if there are any.
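Searching for a particular status code can be done with a short awk filter. This is a sketch only: the helper name count_status is hypothetical, and it assumes the status code is the 9th whitespace-separated field, which holds for the stock common and combined log formats. A custom LogFormat shifts the position, so the field number can be passed as a third argument.

```shell
#!/bin/sh
# count_status FILE STATUS [FIELD]
# Count the requests in an access log that ended with the given HTTP
# status. FIELD defaults to 9 (stock common/combined log format).
count_status() {
    file=$1
    status=$2
    field=${3:-9}
    awk -v s="$status" -v f="$field" \
        '$f == s { n++ } END { print n + 0 }' "$file"
}

# Example (log path as used elsewhere in this article):
#   count_status /var/log/httpd/access_log 504
```

If the log format adds an extra field before the request line, such as the response time, the status moves to field 10 and the call becomes count_status /var/log/httpd/access_log 504 10.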

GoAccess is a handy utility for analyzing access logs and spotting patterns. It can be used from the command line, and its report can also be exported to HTML.

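The following command can be used to parse an Apache access log. The --log-format string has to match the LogFormat configured in Apache; for the stock combined format, GoAccess also accepts --log-format=COMBINED.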

goaccess -f /var/log/httpd/access_log --log-format="%h %l %u %^[ %d:%t %^]  %{ms}T \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""   --date-format=%d/%b/%Y --time-format=%T 

Once you have checked this information, there are two possible outcomes:

  • Everything is fine
  • The configuration needs to be optimized

Optimization

The hardware may be capable enough while Apache is not using it, or the configuration may simply need to be tuned to the hardware that has been allocated.

The following steps can be planned:

  • Load only the required modules and disable the rest; this reduces the memory size of each process
  • Apply disk caching for static files
  • Lower the Timeout value to limit the overhead of long-running scripts, so that Apache does not suffer if a script ends up in an endless while loop
  • Modify the MPM configuration according to the MPM module in use

Memory utilization

apache2buddy (http://apache2buddy.pl/) is a Perl script that helps you figure out the average memory required per Apache process on your server, so that you can calculate the total number of connections the server can handle. It also points out areas for optimization.

To run this script, Perl must be available on the server. You can run the command below directly, or download the script first and then run it:
https://raw.githubusercontent.com/richardforth/apache2buddy/master/apache2buddy.pl

curl -sL https://raw.githubusercontent.com/richardforth/apache2buddy/master/apache2buddy.pl | perl
  • The script needs to be run as root
  • If there is an error like “The process running on port 80 is not Apache.”, check the output of the command “/usr/sbin/apache2 -V”. The first line of the output should look like Server version: Apache/2.4.29

The script prints a report that also includes general improvement points for the server.

If you do not want to run this script on the server, you can use the following commands to view Apache's memory utilization.

On Red Hat-based systems:

ps -ylC httpd | awk '{x += $8;y += 1} END {print "Apache Memory Usage (MB): "x/1024; print "Average Process Size (MB): "x/((y-1)*1024)}'

On Debian-based systems:

ps -ylC apache2 | awk '{x += $8;y += 1} END {print "Apache Memory Usage (MB): "x/1024; print "Average Process Size (MB): "x/((y-1)*1024)}'
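The average process size can then be turned into a rough upper bound for the number of Apache processes that fit in memory. The sketch below is only an estimate under stated assumptions: the helper name max_workers is hypothetical, and the amount of memory reserved for the operating system and other services is a figure you have to choose yourself.

```shell
#!/bin/sh
# max_workers TOTAL_MB RESERVED_MB AVG_PROCESS_MB
# Estimate how many Apache processes fit into memory:
#   (total RAM - RAM reserved for everything else) / average process size
# The result is rounded down; 0 is printed if the inputs make no sense.
max_workers() {
    awk -v total="$1" -v reserved="$2" -v avg="$3" 'BEGIN {
        if (avg <= 0 || total <= reserved) { print 0; exit }
        printf "%d\n", (total - reserved) / avg
    }'
}

# Example: 8 GB of RAM, 2 GB kept for the OS and database,
# 50 MB average Apache process size:
#   max_workers 8192 2048 50
```

Treat the result as a ceiling rather than a target, since process sizes grow under load.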

Connection Configuration

To configure connections, you need to know which MPM module is in use and set the directives accordingly.

Prefork vs Worker vs Event

Prefork MPM launches multiple child processes. Each child process handles one connection at a time. Prefork has long been the default MPM on many Apache installations. It always keeps a minimum number of spare processes running (MinSpareServers), so new requests do not need to wait for a new process to start.

Worker MPM generates multiple child processes, similar to prefork. Each child process runs many threads, and each thread handles one connection at a time. Worker MPM uses less memory than Prefork MPM.

Event MPM became stable in Apache 2.4. This MPM allows more requests to be served simultaneously by passing off some processing work to supporting threads. With this MPM Apache tries to fix the ‘keep alive problem’ faced by the other MPMs: when a client completes its first request, it can keep the connection open and send further requests over the same socket, which reduces connection overhead but keeps a worker occupied while the connection is idle.

Warning: as PHP running as an Apache module is generally not considered thread-safe, the common suggestion is to install Apache with the “prefork” MPM.

Directives and their meaning

    ServerLimit               60 # Declares the maximum number of running Apache processes
    StartServers              5 # The number of processes to start initially when starting the Apache daemon
    MinSpareThreads           25 # The minimum number of idle threads available to handle request spikes
    MaxSpareThreads           75 # The maximum number of idle threads
    ThreadsPerChild           30 # How many threads can be created per server process
    MaxConnectionsPerChild  2000 # Defines the number of connections that a process can handle during its lifetime. This can be used to prevent possible Apache memory leaks (if set to 0 the lifetime is infinite)

  • When Apache starts, it launches 5 child processes, as determined by the StartServers parameter.
  • Each process then starts 30 threads, as determined by the ThreadsPerChild parameter, so 5 processes can serve only 150 concurrent connections/clients, i.e. 30×5=150.
  • If more concurrent users arrive, another child process is started, which can serve another 30 users. How many child processes may be started is controlled by the ServerLimit parameter. With the configuration above there can be 60 child processes in total, each running 30 threads, which handles 60×30=1800 concurrent users in total (provided MaxRequestWorkers is set high enough to allow it).

These directives must be set according to the MPM module in use and the available hardware resources; otherwise the server can be overwhelmed.
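The capacity arithmetic above can be reproduced with a one-line helper, which is handy when trying out different values before editing the configuration. The function name thread_capacity is made up for this sketch; it only multiplies the two numbers and knows nothing about memory limits.

```shell
#!/bin/sh
# thread_capacity PROCESSES THREADS_PER_CHILD
# Number of connections a threaded MPM can serve at the same time
# with the given number of child processes.
thread_capacity() {
    echo $(( $1 * $2 ))
}

# With the values from the configuration above:
#   thread_capacity 5 30    # capacity right after startup (StartServers)
#   thread_capacity 60 30   # capacity at the ServerLimit
```

The two calls in the comments correspond to the 150 and 1800 figures worked out in the list above.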

Time Out Configuration

These directives also play a vital role in tuning the performance of the webserver.

  • MaxKeepAliveRequests
  • KeepAliveTimeout

Please visit following article to understand and tune these parameters

Apart from these

  • It is generally a good idea to run Nginx in front of Apache as a reverse proxy. This helps especially if you are using SSL, because offloading SSL at the Nginx level brings further improvement.
  • Use Apache event MPM (and not the default Prefork or Worker)
  • Unless you are doing development work on the server, set ExtendedStatus Off and disable mod_info as well as mod_status.
  • Set DirectoryIndex correctly so as to avoid content-negotiation. Here’s an example from a production server:
DirectoryIndex index.php index.html index.htm
  • Consider reducing the value of TimeOut to between 30 and 60 seconds.
  • For the Options directive, avoid Options Multiviews as this performs a directory scan. To reduce disk I/O further use
Options -Indexes FollowSymLinks
  • Compression reduces response times by reducing the size of the HTTP response. Install and enable mod_deflate (refer to the documentation or man pages), then add this code to the virtual host config file within the <Directory> section for the root directory (or within the .htaccess file if AllowOverride permits it):
<IfModule mod_deflate.c>
  AddOutputFilterByType DEFLATE text/html text/plain text/xml text/x-js text/javascript text/css application/javascript
</IfModule>
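
To check that compression is actually applied, look at the response headers of a request that advertises gzip support. The following is a minimal sketch, assuming curl is installed; the helper name is_compressed and the localhost URL are hypothetical.

```shell
#!/bin/sh
# is_compressed: read HTTP response headers on stdin and succeed
# (exit status 0) if they contain a gzip Content-Encoding header.
is_compressed() {
    grep -qi '^content-encoding:.*gzip'
}

# Example: fetch a page, discard the body, print only the headers:
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://localhost/ \
#       | is_compressed && echo "compression is active"
```

A full GET request is used here rather than a HEAD request, so the headers reflect what the server sends with a real response body.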
