
Set up Solr with Moodle to search inside file content

Following is a step-by-step guide to set up Solr with Moodle so you can search inside file content. This involves setting up Solr, setting up the PHP Solr extension, and configuring a Solr core.

Moodle has a global search option, which lets you search across courses, activities, and other areas. This is a unified search, which extracts data from the database based on the user's search query and context.

Context here means: search only inside enrolled courses, or across all courses.

We will divide this topic into the following parts.

What is the need for Solr?

Solr offers many benefits, such as:

  • fast searching
  • ranking
  • search statistics

but the biggest benefit is searching inside files.

Files are mostly uploaded to the system as PDF, DOC, TXT, PPT, etc., whether as an activity, an assignment submission, or otherwise.

These files are not very useful if we cannot search within their content; a 100-page PDF is irrelevant if it never comes up for a search query.

Here, Solr does its job by searching inside the file content, and doing it fast.

There are alternatives to Solr, such as Elasticsearch, but Solr is available and integrated directly in Moodle.

More about Solr: https://solr.apache.org/features.html

What is Solr?

Solr is an open-source search engine that provides a REST API to ingest content and query against that content, with output in JSON.

Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called “indexing”) via JSON, XML, CSV or binary over HTTP. You query it via HTTP GET and receive JSON, XML, CSV or binary results.

More about Solr: https://solr.apache.org/
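That ingest/query round trip can be sketched with curl. The host, port 8983, and core name "moodle" below are assumptions for illustration; adjust them to your setup.

```shell
# Hypothetical endpoint: Solr on localhost:8983 with a core named "moodle".
SOLR_URL="http://localhost:8983/solr/moodle"

# Indexing: documents go in over HTTP POST. Against a live Solr you would run:
#   curl -X POST "$SOLR_URL/update?commit=true" \
#        -H 'Content-Type: application/json' \
#        -d '[{"id":"1","title":"hello"}]'

# Querying: an HTTP GET that returns JSON by default.
QUERY="$SOLR_URL/select?q=title:hello"
echo "curl \"$QUERY\""
```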

How to set up Solr?

Solr can be set up on any Linux, Windows, or Mac server.

It runs independently and can live on the same server as Moodle or on any other server.

There are many tutorials for installing Solr on specific platforms.


Keep the following in mind:

  • During setup on Linux, you will create a solr user and should run Solr as that user only. To avoid permission issues, switch to that user (for example via sudo su - solr) before doing any Solr-related operations.
  • Once Solr is set up and its home page is accessible from the web, you need to create a collection. A collection is a logical area where ingested data lives and operations are performed:
su - solr -c "/opt/solr/bin/solr create -c moodle"
  • Apart from Solr, you will need the PHP Solr extension on the Moodle server. It is not available directly for installation; you can install it through PEAR or PECL, and depending on the version, you may need to build the extension from a zip.
  • The following commands can help on CentOS-based systems:
yum install php-pear php-devel curl-devel zlib-devel pcre-devel gcc libxml2-devel
pecl install solr
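Once the extension is installed, it must be loaded by PHP before Moodle can see it. A minimal check, assuming the standard extension name used by the PECL solr package:

```shell
# Check whether the Solr PHP extension is loaded and report the result.
if php -m 2>/dev/null | grep -qi '^solr$'; then
  STATUS="loaded"
else
  STATUS="missing"   # add extension=solr.so to php.ini and restart PHP/Apache
fi
echo "solr extension: $STATUS"
```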

How to set it up with Moodle

Now, you have

  • set up Solr
  • created a basic collection named moodle
  • set up the PHP Solr extension

and it comes to the integration part:

  • Log in to Moodle as a site administrator
  • Enable global search under Site administration > Advanced features
  • Go to Administration > Plugins > Search > Manage global search
  • Select Solr as the search engine
  • Configure Solr under Administration > Plugins > Search > Solr
  • Enable the file indexing option and set an upper file size; 0 means unlimited

If the dependencies are met, it will show a screen like this.

  • Once the first three checks are yes, click on Index data. This will ingest the data into Solr and show a screen like this.
  • You can enable the global search block and try it with a query.
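As an alternative to the Index data button, Moodle ships a CLI indexer. The path below is relative to the Moodle root and the option set may vary by Moodle version, so treat this as a sketch:

```shell
# Hypothetical: trigger indexing from the command line instead of the UI.
# Run from the Moodle root as the web-server user, e.g.:
#   sudo -u www-data php search/cli/indexer.php --force
CMD='php search/cli/indexer.php --force'
echo "$CMD"
```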

If you are not getting results, try querying Solr directly (for example from the Solr admin UI's query tab).
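For example, a direct select against the core, either in a browser or via curl. Host, port, and core name are assumptions; adjust to your setup:

```shell
# Hypothetical: search all fields of the "moodle" core for "assignment".
# A working index returns a JSON response with a non-zero numFound.
SELECT_URL='http://localhost:8983/solr/moodle/select?q=assignment&rows=5'
echo "curl \"$SELECT_URL\""
```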

Even then, the file content will not be searchable.

Making file content searchable

The file content is still not searchable if you created the core from the default configuration. To make it searchable, we need to apply some changes on the Solr side.

Add the following around line 70 of the core's solrconfig.xml:

<lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar" />

  <lib dir="${solr.install.dir:../../../..}/contrib/langid/lib/" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-langid-\d.*\.jar" />

  <lib dir="${solr.install.dir:../../../..}/contrib/velocity/lib" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-velocity-\d.*\.jar" />

Add the following around line 850 of solrconfig.xml:

  <requestHandler name="/update/extract"
                  class="solr.extraction.ExtractingRequestHandler">
    <lst name="defaults">
      <str name="lowernames">true</str>
      <str name="uprefix">ignored_</str>

      <!-- capture link hrefs but ignore div attributes -->
      <str name="captureAttr">true</str>
      <str name="fmap.a">links</str>
      <str name="fmap.div">ignored_</str>
    </lst>
  </requestHandler>
  • Restart the Solr service
  • Re-index the data from Moodle's Index Data page
  • Now you can query from Solr or from global search, and if all goes well, it will display the results
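To confirm the new handler works independently of Moodle, you can push a single file through it with curl. The core name and sample.pdf are placeholders; literal.id is the document id Solr will assign:

```shell
# Hypothetical smoke test for the /update/extract handler added above.
# Against a live Solr you would run:
#   curl "$EXTRACT_URL" -F 'myfile=@sample.pdf'
EXTRACT_URL='http://localhost:8983/solr/moodle/update/extract?literal.id=doc1&commit=true'
echo "curl \"$EXTRACT_URL\" -F 'myfile=@sample.pdf'"
```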

Moodle Search result

Solr Search Result

Monitoring Apache Logs using Zabbix

Apart from monitoring key server hardware and software variables like CPU, memory, disk, and processes, we also want to monitor Apache logs using Zabbix, so everything is watched from a single monitoring platform.

Monitoring Apache Logs

If we are talking about monitoring logs, it covers two aspects:

  • either we want to fetch specific log lines, such as 404 or 500 requests,
  • or we want to get their count

Add/Create an Item in Zabbix

For this, you can create custom items:

  • Log in to the Zabbix server frontend URL
  • Go to Configuration > Templates > Create template
  • After creating the template, click on the newly created template, and under it click on Items
  • Click on Create item
  • Most of the fields are self-explanatory: Name, Type, Update interval, History period, Application group [optional]
  • The Key field supports multiple methods; for now, we are using these two:
    • logrt: fetches log lines matching a regular expression; here we fetch only 404 requests, searching all files ending in access_log
    • logrt.count: returns the count of lines matching a regular expression
logrt["/var/log/apache2/.*access.log"," 404 ",,,skip]
  • .*access.log is a regex pattern matching the file name
  • the second parameter is also a regex: <space>404<space>, matching the status code in lines like
    • – – [22/Feb/2022:05:01:43 +0000] “GET /apple-touch-icon-precomposed.png HTTP/1.1” 404 230
  • You can add more items; each just needs a log file path, an aggregation method, and a regex
  • You will then get this data under the host's graphs along with other data
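The count variant mentioned above takes the same parameters as logrt. Here are both keys side by side, kept in shell variables only to show the exact strings you would paste into the Zabbix Key field (quoting the parameters keeps the spaces in the regex intact):

```shell
# The line-fetching key and its counting counterpart for 404 requests.
LINES_KEY='logrt["/var/log/apache2/.*access.log"," 404 ",,,skip]'
COUNT_KEY='logrt.count["/var/log/apache2/.*access.log"," 404 ",,,skip]'
echo "$LINES_KEY"
echo "$COUNT_KEY"
```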

Check if Zabbix-Agent can access log file

Make sure the Zabbix agent is able to read the log file. This can be verified by executing the following command:

su zabbix -s /bin/bash -c "tail /var/log/httpd/access_log"

If you get a permission denied error, grant the zabbix user permission to read this directory.

If you are still not getting data, cross-check zabbix_agentd.log on the host for any issues:

tail -f /var/log/zabbix/zabbix_agentd.log

Monitoring Web Scenario

Along with the Apache logs, you can also monitor any specific URL at a set interval, validating the response and raising an alarm. It is like UptimeRobot, where you add your URL and it pings the URL and triggers an email in case the URL is down.

  • Create a web scenario
  • Add basic details like name, agent [Zabbix/Chrome], and proxy if any
  • Add the URL, params, and response validation under steps