Unusual domain storage size at file-system level

@indreias

Please find the CLI logs attached. Unfortunately I made a mistake and logged only informational messages, not protocol communication. Maybe they will still help, but if needed I can rerun the compact tomorrow.

CLI.txt (144.5 KB)

Operating system: Linux (CentOS 6.10) / x64
Current Server version: 10.2.2.96

Regards,
Jay

Hello Jay,

Unfortunately the file does not contain any CLI log lines (only JOBLOG ones). The CLI log lines contain CLI:<session_id>, and in order to capture all the details you have to set the CLI service log level to Protocol Communication before connecting to the CLI to execute the compact operation.

Sorry for this.

BR,
Ioan

PS: just to let you know that the latest 10.2.2 patch is 10.2.3.8 (details available here)

@indreias

Please find the correct CLI protocol logs attached for your reference.

CLI.txt (2.6 MB)

As for the update: yes, I understand there is a newer patch with some fixes, but we have kept it on hold because we expected the new X4 to come out in the last quarter of last year, at which point we would move this server to new machines.

I hope this log helps to troubleshoot my issue. I cannot run this server properly; it is under stress and constant monitoring because my free disk space is below 2%.

Regards,
Jay

Hello Jay,

I’ve checked the CLI logs and found nothing strange there.

Could you please also share the output from the following CLI command:

<domain#>  show disposablemetadatainformation

On the other hand, what I fail to understand is why you have configured more than 1 TB for your messages (10 message storages x 128 files x 1 GB) if you do not have enough disk space available.
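
To make the arithmetic behind that figure explicit, here is a small Python sketch. The three inputs are the ones quoted above; the variable names and the use of binary units (1 TB = 1024 GB) are assumptions made only for illustration.

```python
# Configured message-storage capacity, using the figures quoted in the post.
MESSAGE_STORAGES = 10      # number of message storages
FILES_PER_STORAGE = 128    # storage files per message storage
MAX_FILE_SIZE_GB = 1       # maximum size of one storage file, in GB

total_gb = MESSAGE_STORAGES * FILES_PER_STORAGE * MAX_FILE_SIZE_GB
total_tb = total_gb / 1024  # assumption: binary units, 1 TB = 1024 GB

print(f"Configured capacity: {total_gb} GB ({total_tb:.2f} TB)")
```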

Best regards,
Ioan

@indreias

Please see below details:

<domain#> show disposablemetadatainformation

  • Current size: 8848454 Kb
  • Maximum available size: 301989888 Kb
  • Percent from available size: 2 %
  • Maximum storage capacity: 1207959552 Kb
  • Percent from storage capacity: 0 %
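
For readers who prefer GB, the values above can be converted with a short Python sketch. It assumes that the CLI's "Kb" means units of 1024 bytes and that the percentages are truncated rather than rounded; both are guesses based only on the numbers shown, not on Axigen documentation.

```python
# Figures copied from the CLI output above.
current_kb = 8848454
max_available_kb = 301989888
max_capacity_kb = 1207959552

KB_PER_GB = 1024 * 1024  # assumption: "Kb" means 1024-byte units


def to_gb(kb: int) -> float:
    """Convert a size in KB to GB (binary units)."""
    return kb / KB_PER_GB


print(f"Current size:             {to_gb(current_kb):8.2f} GB")
print(f"Maximum available size:   {to_gb(max_available_kb):8.2f} GB")
print(f"Maximum storage capacity: {to_gb(max_capacity_kb):8.2f} GB")

# Integer division truncates, which reproduces the "2 %" and "0 %" shown above.
pct_available = current_kb * 100 // max_available_kb
pct_capacity = current_kb * 100 // max_capacity_kb
print(f"Percent of available size:   {pct_available} %")
print(f"Percent of storage capacity: {pct_capacity} %")
```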

As for why there is more than 1 TB of containers:
This is a result of my troubleshooting. I was trying to see what was going on and to force Axigen to expand and create new containers, hoping that a compact would move some of the hsf files into another container.

Originally I had only 7 containers. I created 8 and 9 later, when I ran into this issue.

I cannot find the reason why Axigen behaved like this either.

Regards,
Jay

@indreias

Hi there

Do you have any idea / suggestion how to fix this issue?

I am still waiting for some help.

Regards,
Jay

Hello everyone,

Any suggestion will be appreciated. All help will do.

Regards,
Jay

Hi,

A few years ago (2018) I had this problem too. None of the recommendations helped (e.g. the commands FINDINVALIDMSG purge, SHOW DisposableMetadataInformation, COMPACT All forced, etc.).
I described this problem in an earlier version of this forum, but I can’t find it on the Axigen Community Forum now.
What finally helped me was performing a restore procedure. Its result was to reduce the storage space to the total size of all accounts, groups, public folders, etc. in this domain, which is what I expected.
At the beginning, the total size of all message storages was 84.3 GB.
The FTP backup size was 18.7 GB.
The total storage space after recovery was 18.6 GB.
The total number of messages, accounts, filters, groups, etc. was exactly the same as before the recovery procedure.

The RESTORE procedure I used was this:

  1. Stop/block all traffic (sending, receiving, etc.) in domain8.
  2. Make a backup of domain8 (FTP backup).
  3. Delete domain8 (remove all its space on the file system).
  4. Create domain8 (option Add a new domain).
  5. Perform a recovery from the backup made in step 2; details in (Axigen Mail Server - How to restore a domain using FileZilla).
  6. Check the new Internal IDs for all accounts.
  7. Recreate the user filters by renaming the files /var/opt/axigen/filters/DomainStorageId-Internal ID.wmfilter
  8. Start all traffic (sending, receiving, etc.) in domain8.

Our Axigen Mail Server version was 8.0.2 (linux/x86).

Best regards,
bcteam

@bcteam

Thanks. Yes, this will work, but for me it is only a workaround, not a solution. This needs to be addressed by Axigen. And if you encountered a similar issue before, then this is not a one-off.

I need a permanent solution in case this happens again. I have much bigger Axigen deployments, and we cannot schedule downtime for scenarios like this.

Regards,
Jay

Hello @Jay ,

Reviewing this thread, I didn’t find any mention of whether you have already executed the CLI SCAN ALL command (details here). My advice is to start a screen session and, after entering the corresponding domain context, execute SCAN ALL softPurge from it.

If you do not recover any space, please share the CLI logs corresponding to that command, if possible.

Just as a temporary solution: are you able to add a new disk and move some message locations there (like messages[7-9]), so that you are no longer pressed for space?

HTH,
Ioan

@indreias

I will follow your advice. I will add a temporary disk until we find the cause of this issue.

Thanks again for all the assistance.

I will execute the command and keep you posted.

Regards,
Jay

@indreias

There is good news and there is bad news.

SCAN ALL found leaks that occupy 500+ GB.

I tried to run SCAN ALL PURGE, but unfortunately it did not remove the leaks. Please see the attached CLI logs for reference.

CLI_SCAN_ALL_PURGE.txt (9.0 KB)

Regards,
Jay

@indreias

Update:

Good news: I have now reclaimed all my disk space.

Executing SCAN ALL PURGE did not do anything on its own. I then ran COMPACT ALL FORCED afterwards, and that did the job.

Before closing this topic:

How did this happen, and how can I prevent it in the future?

Regards,
Jay

Hello @Jay

Good to know that you managed to solve your problem. In order to see the current situation you first have to run SCAN ALL clearCache and then a new SCAN ALL.

There is nothing I can suggest to prevent this in the future, as these are not usual situations (maybe configure a weekly cron job based on the SCAN ALL CLEARCACHE + SCAN ALL CLI commands that alerts you when some space is consumed by leaks, so you can apply the purge command).

What you do have to make sure of is that you do not configure storages whose total capacity can exceed your entire disk space.
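
A minimal Python sketch of that last check follows. It only compares a configured capacity with the size of the file system behind a path, using the standard-library `shutil.disk_usage`. The function names, the default path `/` and the 10 x 128 x 1 GB example figures are illustrative assumptions; it does not read anything from Axigen itself.

```python
import shutil

GIB = 1024 ** 3


def configured_capacity_bytes(storages: int, files_per_storage: int,
                              max_file_size_bytes: int) -> int:
    """Upper bound on the space the configured storages may grow to."""
    return storages * files_per_storage * max_file_size_bytes


def fits_on_disk(configured_bytes: int, path: str = "/") -> bool:
    """True if the configured capacity does not exceed the file system size."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return configured_bytes <= usage.total


if __name__ == "__main__":
    # Example figures from earlier in the thread: 10 storages x 128 files x 1 GB.
    configured = configured_capacity_bytes(10, 128, 1 * GIB)
    disk_total = shutil.disk_usage("/").total
    print(f"Configured: {configured / GIB:.0f} GB, disk: {disk_total / GIB:.0f} GB")
    if not fits_on_disk(configured):
        print("WARNING: configured storages can exceed the disk size")
```

In practice the path would be the mount point that holds the domain's message storages rather than `/`.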

BR,
Ioan

@indreias

Thanks, Ioan, for all the assistance.

Yes, I have now readjusted all my containers to match my actual disk space.

Regards,
Jay

Hello @Jay,

I really hope you made those changes carefully, as readjusting the storages to lower values is usually not supported.

If you have double-checked that the maxFileSize value configured for each storage is bigger than the size of every storage file in that storage unit after the compact was executed, then you may have dodged the bullet.
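
The check described above could be sketched in Python roughly as follows. This is only an illustration: it assumes the storage files of one storage unit live under a single directory and that their on-disk size is the relevant figure. The function names and any path passed in are hypothetical.

```python
import os


def largest_file_size(storage_dir: str) -> int:
    """Return the size in bytes of the largest file under storage_dir."""
    largest = 0
    for root, _dirs, files in os.walk(storage_dir):
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            largest = max(largest, size)
    return largest


def max_file_size_is_safe(storage_dir: str, proposed_max_bytes: int) -> bool:
    """True if the proposed maxFileSize is bigger than every existing file."""
    return largest_file_size(storage_dir) < proposed_max_bytes


# Example call (hypothetical path, 512 MB proposed limit):
# max_file_size_is_safe("/path/to/one/storage/unit", 512 * 1024 * 1024)
```

The comparison is strict, following the wording above that the configured value should be bigger than the size of any existing storage file.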

Anyway you have earned the new bravery badge that I have just created.

BR,
Ioan

@indreias

Yes, I did. Before changing the sizes of my containers I double-checked each one and made sure the new value would not be smaller than the current container size.

Thanks again for the feedback.

Regards,
Jay

@indreias

One last question, which is very critical for me.

How can I be confident that I did not lose any critical data / emails when I cleared the 500+ GB of leaks?

Right now I am relying entirely on Axigen’s algorithm to identify what is good data and what is a leak.

The reason I am asking, as you can see at the start of my thread:

My capacity in WebAdmin, based on per-account size, is: 214 GB
My file system size is: 665 - 700 GB
Leaks detected: 500+ GB

After I finished removing the leaks, my file system usage is only: 135 GB
What happened to the difference between the size shown in WebAdmin (214 GB) and the size after removing the leaks (135 GB)?

That is a difference of about 79 GB.

Hoping to get clarification on this matter.

Regards,
Jay

Hello @Jay

The sum of the accounts’ mailbox sizes is not necessarily equal to the disk space used, as Axigen internally stores a single copy of a message sent to multiple recipients (more info here).
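
A toy Python model shows why the two figures can differ. It is not Axigen's actual storage format; the message sizes and mailbox names are made up, and the only thing modelled is that a message delivered to several recipients is stored once but counted in every recipient's mailbox size.

```python
# Each stored message: (size in KB, mailboxes that reference it).
messages = [
    (5000, ["alice", "bob", "carol"]),  # one copy on disk, three recipients
    (200, ["alice"]),
    (1200, ["bob", "carol"]),
]

# Disk usage: every message is stored exactly once.
disk_kb = sum(size for size, _recipients in messages)

# Mailbox size: every recipient is charged the full message size.
mailbox_kb = {}
for size, recipients in messages:
    for mailbox in recipients:
        mailbox_kb[mailbox] = mailbox_kb.get(mailbox, 0) + size

sum_of_mailboxes_kb = sum(mailbox_kb.values())

print(f"Disk usage:        {disk_kb} KB")
print(f"Sum of mailboxes:  {sum_of_mailboxes_kb} KB")
```

In this model the sum of the mailbox sizes is larger than the disk usage, which is the same direction as the 214 GB versus 135 GB figures discussed above.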

From the SCAN ALL KB article referenced in a previous reply we know that leaks are messages found in the storages which are not referenced from any mailbox, so there is definitely no mechanism (other than repair accounts ...) to scan them and try to retrieve any useful data from there.

As always, you have to back up your data in case something goes wrong. Here I’m referring to backup via FUSE or FTP, as that way you have access to each internal Axigen object (messages, contacts, etc.), which can be recovered if needed (for example, restoring a particular folder and its content that one of your users permanently deleted by mistake).

Note: if you restore from a previous FUSE or FTP backup, the message deduplication cannot be recreated, so after a full domain restore (not recommended but possible) the disk space will be more or less in sync with the sum of the mailbox sizes.

HTH,
Ioan

@indreias

Thanks, I appreciate the detailed answer. And thank you for all the assistance.

Regards,
Jay