Email Search Indexes

This article describes Axigen's new email search indexes, available starting with Axigen X3 Update 2 (10.3.2).

Starting with version 10.3.2, Axigen implements its search indexes as a set of SQLite FTS (full-text search) databases. The databases are stored on the same filesystem that hosts the Axigen domain, but outside the domain storage managed from the CLI / WebAdmin. The root folder for a domain's search databases follows the convention <Axigen working directory>/domains/<domain_name>/email.idx.

The location of <Axigen working directory> depends on your operating system and configuration. By default, the search index paths are:

  • On Linux: /var/opt/axigen/domains/<domain_name>/email.idx

  • On Windows: C:\Program Files\Axigen Mail Server\domains\<domain_name>\email.idx
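As a small illustration of the naming convention above, the following Python sketch assembles the default index path for a domain. The function name index_root and the DEFAULT_WORKDIR table are hypothetical helpers, not part of Axigen; the two working directories are simply the defaults listed above and will differ on a customized installation.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Default working directories taken from the list above (assumption:
# an installation with a custom working directory needs its own entry).
DEFAULT_WORKDIR = {
    "linux": PurePosixPath("/var/opt/axigen"),
    "windows": PureWindowsPath(r"C:\Program Files\Axigen Mail Server"),
}


def index_root(domain_name, platform="linux"):
    """Return the default search index root folder for a domain."""
    return DEFAULT_WORKDIR[platform] / "domains" / domain_name / "email.idx"


print(index_root("example.com"))
print(index_root("example.com", platform="windows"))
```

For the domain example.com this yields /var/opt/axigen/domains/example.com/email.idx on Linux and the corresponding path under C:\Program Files\Axigen Mail Server on Windows.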

Database Location

Two CLI commands in the root context allow an administrator to:

  • find the location of the database that contains the search index, given an account / folder pair:

  • find the account / folder pair whose search index it contains, given the location of a database that contains a search index:

The Indexer Job

New search indexes are computed by a job that is triggered when Axigen loads the account whose indexes are to be computed.

The first time the job runs it will:

  • get a list of the user's folders

  • for each folder in the list

    • delete the canonic form for all the messages from the folder

    • delete the (old) search index for the folder

  • log a summary in the JOBLOG logger:

  • for each folder in the list

    • create the sqlite database

    • get a list of changes based on the empty synchronization key

    • for each change from the change set

      • process the change in the sqlite database

  • log a summary in the JOBLOG logger: 
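The first-run sequence listed above can be sketched in Python as follows. This is neither Axigen's implementation nor its log output: the in-memory folder model, the changes_since helper, the table layout, and the log message texts are all assumptions made for illustration, and the deletion of canonical forms is not modeled. Only the order of the steps and the JOBLOG logger name come from the text.

```python
import logging
import sqlite3

# Logger name mirrors the JOBLOG logger mentioned above; message texts are made up.
log = logging.getLogger("JOBLOG")


def changes_since(messages, sync_key):
    """Hypothetical change feed: an empty key reports every message as new."""
    if sync_key is None:
        return [("add", message_id, text) for message_id, text in messages]
    return []  # incremental synchronization is out of scope for this sketch


def first_run(folders, old_indexes):
    """Run the first-time sequence over an in-memory stand-in for an account.

    folders:     {folder_name: [(message_id, text), ...]}  (hypothetical model)
    old_indexes: set of folder names that still carry an old-style index
    """
    # Pass 1: drop the old per-folder search data, then log a summary.
    removed = 0
    for name in folders:
        if name in old_indexes:
            old_indexes.discard(name)
            removed += 1
    log.info("cleanup done: %d old indexes removed", removed)

    # Pass 2: build one SQLite database per folder from the full change set.
    databases = {}
    indexed = 0
    for name, messages in folders.items():
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE messages (message_id INTEGER PRIMARY KEY, body TEXT)")
        for kind, message_id, text in changes_since(messages, sync_key=None):
            if kind == "add":
                db.execute("INSERT INTO messages VALUES (?, ?)", (message_id, text))
                indexed += 1
        db.commit()
        databases[name] = db
    log.info("indexing done: %d folders, %d messages", len(databases), indexed)
    return databases
```

Passing an empty synchronization key is what makes the first run a full rebuild: every message in the folder is reported as a change and therefore processed.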

On subsequent runs, the job will log a summary in the JOBLOG logger:

What Gets Indexed

The new search index indexes the text content from headers, text/plain, and text/html MIME parts.

The text content is split into words, and the index supports only prefix-based searches: "adventure" is found when searching for "advent" but not when searching for "dvent". Before words are written to the index, they are lowercased and their diacritical marks are folded, so über, Über and uber all produce the same token.
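The folding and prefix-matching behavior described above can be approximated in a few lines of Python. This is a sketch of the observable behavior only, assuming Unicode decomposition is a fair stand-in for the folding step; the real work is done by the SQLite FTS tokenizer, and the function names fold, tokenize and prefix_match are hypothetical.

```python
import re
import unicodedata


def fold(word):
    """Lowercase a word and strip its combining diacritical marks."""
    decomposed = unicodedata.normalize("NFKD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Split text into words and fold each one into an index token."""
    return [fold(word) for word in re.findall(r"\w+", text)]


def prefix_match(tokens, query):
    """Return True if any token starts with the folded query."""
    folded_query = fold(query)
    return any(token.startswith(folded_query) for token in tokens)


tokens = tokenize("Über adventure")
print(tokens)
print(prefix_match(tokens, "advent"), prefix_match(tokens, "dvent"))
```

Here "advent" matches because it is a prefix of "adventure", while "dvent" does not match because it only occurs in the middle of a token.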

Text contained in attachments is not indexed, regardless of the attachment's content type.

What the Index Contains

The new search index uses a set of tables that define which search tokens belong to which emails.

As such, headers, text MIME parts, and account meta-information such as folder IDs, message counts, and synchronization tokens are stored in the index files on the filesystem, where they are readable (in terms of operating system permissions) by the axigen user and group.
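To make the token-to-email mapping described above concrete, here is a minimal sketch that uses an ordinary SQLite table. The table name postings and its columns are hypothetical; the actual Axigen databases use SQLite FTS tables whose layout is not documented here, so treat this only as an illustration of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE postings ("
    " token TEXT NOT NULL,"
    " message_id INTEGER NOT NULL,"
    " PRIMARY KEY (token, message_id))"
)


def add_message(message_id, tokens):
    """Record which tokens occur in a message (duplicates are ignored)."""
    conn.executemany(
        "INSERT OR IGNORE INTO postings (token, message_id) VALUES (?, ?)",
        [(token, message_id) for token in tokens],
    )


def search_prefix(prefix):
    """Return the IDs of messages containing a token that starts with prefix."""
    rows = conn.execute(
        "SELECT DISTINCT message_id FROM postings"
        " WHERE substr(token, 1, length(?)) = ?"
        " ORDER BY message_id",
        (prefix, prefix),
    )
    return [row[0] for row in rows]


add_message(1, ["weekend", "adventure"])
add_message(2, ["advent", "calendar"])
print(search_prefix("advent"))
```

Because the table stores the tokens themselves, anyone who can read the database file can recover words from the indexed emails, which is why the file permissions mentioned above matter.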

What to Expect When the Index Is Built for the First Time

When the new search implementation runs for the first time, an indexer job is started each time the server loads an Axigen account.

Axigen runs at most ten search indexing jobs concurrently, so the new search indexes are built ten accounts at a time. Depending on the number of accounts the server hosts and on how much storage those accounts use, it may take a significant amount of time before the new indexes are completely ready for use.
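The effect of the ten-job limit can be illustrated with a bounded worker pool. This sketch assumes a thread pool is a reasonable stand-in for Axigen's job scheduler, which is not described in this article; run_index_jobs and fake_index_account are hypothetical names, and the short sleep merely simulates indexing work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_JOBS = 10  # limit stated in the text above


def run_index_jobs(accounts, index_account):
    """Run index_account for every account, at most ten at a time."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_JOBS) as pool:
        return list(pool.map(index_account, accounts))


# Instrumented stand-in for a real indexing job, used to observe concurrency.
_lock = threading.Lock()
_running = 0
peak = 0


def fake_index_account(account):
    global _running, peak
    with _lock:
        _running += 1
        peak = max(peak, _running)
    time.sleep(0.01)  # pretend to index the account
    with _lock:
        _running -= 1
    return account


results = run_index_jobs([f"user{i}" for i in range(35)], fake_index_account)
print(len(results), "accounts indexed; peak concurrency:", peak)
```

However many accounts are queued, the observed peak never exceeds ten, which is why total indexing time grows with the number and size of accounts.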

Axigen X3 Update 2 (10.3.2) can search only in the new indexes, so search requests return empty results while the new indexes are being built. As soon as an account's folder has been indexed for the first time, searches start working for that folder.

Disk Space Requirements

The new search implementation uses disk space more efficiently than the old one. As a result, an Axigen server's total disk usage is expected to be lower on 10.3.2.x than on 10.3.1.x.

When 10.3.2.x starts for the first time and the indexes are being built, disk usage temporarily increases because the old indexes are not deleted synchronously; they are added to a delete queue that is emptied gradually over time.

You should plan for an additional 20-30% of disk space when running 10.3.2.x for the first time. For example, if your Axigen disk usage is 100 GB, you should have an additional 20-30 GB available before starting the upgrade to 10.3.2.
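The 20-30% guideline above translates into a simple calculation. The helper below is a hypothetical convenience, not an Axigen tool; it only applies the percentages quoted in the text to a measured disk usage.

```python
def extra_space_gb(current_usage_gb, low_pct=20, high_pct=30):
    """Return the (minimum, maximum) extra free space to plan for, in GB."""
    return (current_usage_gb * low_pct / 100, current_usage_gb * high_pct / 100)


low, high = extra_space_gb(100)
print(f"Plan for {low:.0f}-{high:.0f} GB of additional free space")
```

For a server that currently uses 100 GB, this reproduces the 20-30 GB figure given above.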

CLI Commands

The CLI commands that manage search indexes have changed as part of the new implementation. The new commands are described below.

Status of Indexes for a Given Account

In the CLI account context (the context that is opened by the update account command), a number of CLI commands show information about the search indexes:

CLI command

Description

The command displays:

  1. the number of folders that can be indexed (folders of type email)

  2. the number of folders that contain data (at least one message)

  3. the total number of messages from indexable folders

  4. the number of indexed messages

  5. the size taken by the index

  6. the current status of the search index

When using the "folder" argument, the command can be used to display the status for an individual folder.

When using the "allfolders" argument, the command displays search index status for all the indexed folders.

Status of Indexing Jobs at Server Level

In the processing context (accessible by using config server followed by config processing):

CLI command

Description

Shows the status of the running mail search index jobs.