@kzurell kzurell commented Jul 21, 2025

This generates weekly web usage statistics and emails the report to the administrator. It installs and configures GoAccess, which is available as a standard Ubuntu package.

I wanted web stats for my MIAB static sites, but without using a separate web property or service. Not looking for minute-by-minute analytics, only a casual, general sense of usage. Another contributor proposed adding AWStats to MIAB some time ago, which would mean a separate process and web interface.

This patch installs GoAccess, configures it for 7 days of database retention, runs it on logrotate, and generates the HTML stats report on Mondays along with the other reports. It emails the report to the site administrator as an attachment: a self-contained HTML page with lots of graphs and numbers.

Breaking changes: this patch alters the Nginx access log format to include the virtual host name (and the port, which is not used). I have updated the Munin jail, which watches access.log, and @JoshData's setup tracking script (not fully tested). It changes the stock Nginx log_format in what I understand to be the correct way (turn access logging off globally, then define it with the custom format in each server block); I would like confirmation.
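For anyone updating a tool that reads access.log, here is a minimal sketch of parsing one line in the new layout. It assumes the line is the stock combined format prefixed with `host:port `; the exact log_format string in the patch may differ, and the regex, the function name `parse_vcombined`, and the sample line are illustrative only.

```python
import re

# Assumed layout (not copied from the patch): the stock "combined" format
# prefixed with "<virtual host>:<port> ".
VCOMBINED_RE = re.compile(
    r'^(?P<vhost>[^\s:]+):(?P<port>\d+) '
    r'(?P<remote_addr>\S+) \S+ (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"$'
)


def parse_vcombined(line):
    """Return a dict of named fields for one log line, or None if it does not match."""
    match = VCOMBINED_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


# Hypothetical sample line, made up for illustration.
SAMPLE = ('box.example.com:443 203.0.113.7 - - [21/Jul/2025:10:15:32 +0000] '
          '"GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"')

fields = parse_vcombined(SAMPLE)
```

Lines written in the old format (no `host:port` prefix) return None here, so a parser built this way would need a fallback while old rotated logs are still around. Bracketed IPv6 host literals are not handled by this sketch.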

Left to do: the GoAccess configuration is largely stock, including the dashboard layout. It can do geographic analysis, which is very interesting but requires extra configuration and files, which may or may not be straightforward. The /mail/ etc. URLs appear in the stats, which might not be desired. I am not sure what happens if GoAccess runs from logrotate and daily_tasks at the same time. GoAccess can't be made completely --quiet, so I need to verify this won't cause problems.
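One possible way to rule out the overlapping-run question is to serialize the two callers with an advisory file lock. The sketch below is not part of the patch; the helper name `exclusive_lock` and any lock path are assumptions for illustration.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        # Blocks until any other holder of the lock releases it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A Python caller would wrap the GoAccess invocation in `with exclusive_lock(path):`. A shell hook could take the same lock with the util-linux `flock` command on the same file, so the logrotate run and the weekly run would wait for each other instead of touching the database together.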

  • Why GoAccess?
    It can run completely in batch mode. The reports are presentable (and self-contained, with no remote dependencies according to the browser console). Development is ongoing. There are other tools, and the same approach could easily work for a different batch processor.
  • Why not use GoAccess' live HTML statistics page?
    GoAccess can serve an HTML page and use WebSockets to update the dashboard. This means a separate web presence, which could be done but adds complexity.
    If the administrator wants live access they can ssh in, run goaccess and use the ncurses interface, which is also very slick, probably enough to diagnose in-the-moment issues.
  • Why not use separate virtual host access logs?
    Could do, but that's a bigger discussion.
  • Why not make the report HTML mail instead of an attachment?
    The static report that GoAccess generates is largely script and JSON, and RoundCube strips out the scripts and meta tags.

It is far enough along to fail usefully. It works in Vagrant for setup and normal running. I did my best at idiomatic Python and Bash, but, y'know.... I would like to see if there's any appetite before I go any further. --Thanks, K.

@lemanschik

@JoshData LGTM+1

Review of the Mail-in-a-Box Pull Request: Web Statistics Using GoAccess

Summary of the Pull Request

This pull request introduces a significant new feature to Mail-in-a-Box: the generation and weekly delivery of web server statistics to the administrator via email. This is achieved by integrating GoAccess, a real-time web log analyzer.

The most important changes include:

  • Installation and Configuration of GoAccess: GoAccess is added to the system dependencies and configured to maintain a persistent database for web statistics.
  • New Nginx Log Format: The Nginx log format is changed to a custom format named vcombined, which corresponds to GoAccess's predefined VCOMBINED format. This format includes the virtual host, which allows GoAccess to collect separate statistics for each domain hosted on the Mail-in-a-Box instance.
  • Weekly Report: A new cron job is created that generates an HTML report every Monday with the web statistics from the previous week.
  • Email Dispatch with Attachments: A new Python script (email_administrator_attachment.py) is added to send the generated HTML report as an email attachment to the administrator.
  • Adaptation of Existing Tools: Existing tools that parse Nginx logs, such as fail2ban and a script for analyzing bootstrap accesses, are updated to be compatible with the new log format.

Detailed Analysis of the Changes

  • setup/web.sh: This script handles the installation of GoAccess and its configuration. It sets up a persistent database to store data across log rotations and configures a logrotate pre-rotate hook. This ensures that all log entries are processed before the log files are archived.
  • conf/nginx-top.conf and conf/nginx.conf: The Nginx configuration is modified to use a new log format called vcombined. This is crucial for GoAccess to differentiate traffic for various domains.
  • management/daily_tasks.sh: The weekly execution of GoAccess is implemented here. The script generates an HTML report and pipes it to the new email script.
  • management/email_administrator_attachment.py: This new script is a clean implementation for sending emails with attachments from the command line. It reads the report from standard input and sends it as a MIME-encoded email.
  • conf/fail2ban/filter.d/miab-munin.conf and tools/parse-nginx-log-bootstrap-accesses.py: The adjustments in these files are necessary to ensure compatibility with the new Nginx log format. This demonstrates that the author has considered the far-reaching effects of the log format change.
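To illustrate the attachment step described above, here is a minimal sketch of building such a message with the standard library. It is not the contents of email_administrator_attachment.py; the function name `build_report_email`, the default filename, and the body text are assumptions.

```python
from email.message import EmailMessage


def build_report_email(sender, recipient, subject, html_report,
                       filename="web-stats.html"):
    """Build a plain-text email carrying the HTML report as an attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Short plain-text body; the report itself travels as an attachment so
    # webmail clients do not strip its scripts.
    msg.set_content("The weekly web statistics report is attached.")
    # Passing a str with subtype="html" produces a text/html part encoded as
    # UTF-8 with Content-Disposition: attachment.
    msg.add_attachment(html_report, subtype="html", filename=filename)
    return msg
```

A command-line wrapper along these lines would read the report with `sys.stdin.read()`, call `build_report_email(...)`, and hand the result to `smtplib.SMTP("localhost").send_message(msg)`; the local SMTP host is likewise an assumption.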

Evaluation of the Pull Request

Overall, this is a high-quality and well-thought-out pull request.

Strengths:

  • Useful Functionality: Web statistics are a valuable addition for any server administrator.
  • Robust Implementation: The use of a persistent database for GoAccess and the integration with logrotate ensure reliable and complete statistics.
  • Clean Code: The changes are well-structured and integrate seamlessly into the existing architecture of Mail-in-a-Box. The new Python script for sending emails is modular and reusable.
  • Completeness: The author has not only implemented the new feature but has also made the necessary adjustments to other parts of the system affected by the changed log format.

Potential Areas for Improvement:

  • Error Handling: In management/daily_tasks.sh, the error handling could be improved. If goaccess fails, no notification is currently sent to the administrator. It might be beneficial to send an email notification in case of an error.
  • Resource Usage: GoAccess can be resource-intensive when analyzing large log files. The weekly execution should be unproblematic for most Mail-in-a-Box users, but it is a point to keep in mind.
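The error-handling suggestion could look roughly like the following sketch, which runs a report command and returns a failure description instead of silently producing nothing. daily_tasks.sh is a shell script, so this Python helper is only an illustration of the idea; `run_report` and its timeout value are hypothetical.

```python
import subprocess


def run_report(command, timeout=600):
    """Run command and return (ok, detail).

    detail is the command's stdout on success, otherwise a description of
    what went wrong that can be mailed to the administrator.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Binary missing, not executable, or the run exceeded the timeout.
        return False, f"could not run {command[0]}: {exc}"
    if result.returncode != 0:
        return False, (
            f"{command[0]} exited with status {result.returncode}:\n"
            f"{result.stderr}"
        )
    return True, result.stdout
```

The caller would mail the report when `ok` is true and mail `detail` as a plain notification otherwise. The timeout also gives a crude bound on the resource-usage concern for very large logs.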

Conclusion

This pull request is an excellent addition to Mail-in-a-Box. It adds a feature requested by many users in a thoughtful and stable manner. The implementation is clean and complete. This pull request is recommended for merging.
