This project tracks and analyzes edit activity across Wikimedia projects. It consists of scripts for fetching, storing, and analyzing edit data, as well as a web interface for visualization.
- Purpose: Fetches monthly edit counts for all Wikimedia projects from the Wikimedia API and stores them in the edit_counts table in the database.
- How it works:
- Downloads edit count data for each project.
- Ensures the edit_counts table exists.
- Inserts or updates edit counts for each project and month.
- Intended use: Run regularly (e.g., as a cron job) to keep the edit counts up to date.
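The "ensure the table exists, then insert or update" step can be sketched as follows. This is a minimal illustration, not the script's actual code: the column names are hypothetical, and it uses Python's built-in sqlite3 as a stand-in so it runs without a MySQL server. In MySQL the equivalent upsert is written with INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

# Hypothetical schema: the real column names in edit_counts may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS edit_counts (
    project    TEXT    NOT NULL,
    month      TEXT    NOT NULL,
    edit_count INTEGER NOT NULL,
    PRIMARY KEY (project, month)
)
"""

# SQLite upsert syntax; MySQL would use INSERT ... ON DUPLICATE KEY UPDATE.
UPSERT = """
INSERT INTO edit_counts (project, month, edit_count)
VALUES (?, ?, ?)
ON CONFLICT(project, month) DO UPDATE SET edit_count = excluded.edit_count
"""

def store_counts(conn, rows):
    """Create the table if needed, then insert or update each (project, month) row."""
    conn.execute(SCHEMA)
    conn.executemany(UPSERT, rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
store_counts(conn, [("en.wikipedia.org", "2024-01-01", 100)])
# Storing the same project and month again updates the row instead of duplicating it.
store_counts(conn, [("en.wikipedia.org", "2024-01-01", 120)])
rows = conn.execute("SELECT project, month, edit_count FROM edit_counts").fetchall()
```

Keying the table on (project, month) is what makes repeated cron runs safe: a rerun overwrites the stored count rather than adding a second row.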
- Purpose: Similar to fetch_and_store_cron.py; may be used for manual runs or testing.
- How it works:
- Fetches edit counts from the Wikimedia API.
- Stores results in the edit_counts table.
- Intended use: Manual or ad-hoc data fetching.
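For an ad-hoc fetch, the request and response handling might look like the sketch below. It assumes the scripts use the edits aggregate endpoint of the Wikimedia Analytics REST API and that the response nests monthly results under items[0]["results"]; the README does not say which endpoint is used, so treat both as assumptions. The function names and the sample payload are illustrative only, and no network call is made.

```python
# Assumed endpoint: Wikimedia Analytics REST API, edits aggregate.
BASE = "https://wikimedia.org/api/rest_v1/metrics/edits/aggregate"

def build_url(project, start, end):
    """Build a monthly edit-count URL; start and end are YYYYMMDD strings."""
    return f"{BASE}/{project}/all-editor-types/all-page-types/monthly/{start}/{end}"

def parse_counts(payload):
    """Turn a decoded JSON response into (YYYY-MM, edits) pairs."""
    results = payload["items"][0]["results"]
    return [(r["timestamp"][:7], r["edits"]) for r in results]

# Hand-written payload in the assumed response shape (not real data).
sample = {
    "items": [{
        "project": "en.wikipedia.org",
        "granularity": "monthly",
        "results": [
            {"timestamp": "2024-01-01T00:00:00.000Z", "edits": 10},
            {"timestamp": "2024-02-01T00:00:00.000Z", "edits": 12},
        ],
    }]
}

url = build_url("en.wikipedia.org", "20240101", "20240301")
counts = parse_counts(sample)
```

Keeping URL construction and parsing separate from the HTTP call makes a manual run easy to test without touching the API.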
- Purpose: Detects peaks (unusual spikes) in edit activity for each project and stores these as alerts in the community_alerts table.
- How it works:
- Reads all edit data from edit_counts.
- Runs a peak detection algorithm for each project.
- Stores detected peaks in the community_alerts table.
- Intended use: Run after edit data is up to date, to analyze and record significant activity spikes.
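The README does not specify which peak detection algorithm the script uses. As one plausible illustration, the sketch below flags months whose edit count exceeds the series mean by more than a chosen number of standard deviations. The function name, the threshold, and the sample series are all assumptions made for this example.

```python
from statistics import mean, pstdev

def detect_peaks(counts, threshold=2.0):
    """Return (month, count) pairs whose count exceeds mean + threshold * stddev."""
    values = [count for _, count in counts]
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # a flat series has no peaks
    return [(month, count) for month, count in counts
            if count > mu + threshold * sigma]

# Illustrative monthly counts for one project (not real data).
values = [100, 110, 95, 105, 98, 102, 400, 97, 103, 99, 101, 90]
series = [(f"2024-{i:02d}", v) for i, v in enumerate(values, start=1)]
peaks = detect_peaks(series)
```

A global mean and standard deviation is the simplest baseline. A rolling window would adapt better to projects whose activity grows or shrinks over time, at the cost of an extra window-size parameter.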
- edit_counts: Stores raw monthly edit counts for each project.
- community_alerts: Stores detected peaks/alerts for each project.
- Python 3.7+
- MySQL server
- Virtual environment (recommended)
Ubuntu/Debian:
sudo apt update
sudo apt install mysql-server
sudo systemctl start mysql
sudo systemctl enable mysql
macOS:
# Using Homebrew
brew install mysql
brew services start mysql
Windows:
- Download MySQL installer from mysql.com
- Run installer and follow setup wizard
- Start MySQL service
Note: For production setups, consider running mysql_secure_installation to enhance security. For local development, this is optional.
- Clone and set up environment:
  git clone <repository-url>
  cd community-activity-alerts
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  pip install -r requirements.txt
- Set up MySQL database:
  Connect to MySQL:
  # Connect as root (default) or your MySQL user
  mysql -u root -p
  Create database and user:
  -- Create the database
  CREATE DATABASE community_alerts;
  -- Create a user for the application
  CREATE USER 'wikim'@'localhost' IDENTIFIED BY 'wikimedia';
  -- Grant privileges to the user
  GRANT ALL PRIVILEGES ON community_alerts.* TO 'wikim'@'localhost';
  -- Apply changes
  FLUSH PRIVILEGES;
  -- Exit MySQL
  EXIT;
- Configure database connection:
  Edit the database connection settings in your Python files:
  - In app.py: Update the get_db_connection() function
  - In other scripts: Update the database configuration variables
  # Example configuration (adjust as needed)
  import pymysql

  conn = pymysql.connect(
      host="localhost",             # Your MySQL host
      user="wikim",                 # Your MySQL username
      password="wikimedia",         # Your MySQL password
      database="community_alerts",  # Your database name
      charset="utf8mb4",
  )
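Because the same settings are edited in several files, one option is to read them from environment variables in a single helper. The sketch below is only a suggestion: the variable names (DB_HOST, DB_USER, DB_PASSWORD, DB_NAME) and the db_settings function are hypothetical and not part of the project. The defaults mirror the example values above.

```python
import os

def db_settings(env=os.environ):
    """Collect connection settings, falling back to the README's example values."""
    return {
        "host": env.get("DB_HOST", "localhost"),
        "user": env.get("DB_USER", "wikim"),
        "password": env.get("DB_PASSWORD", "wikimedia"),
        "database": env.get("DB_NAME", "community_alerts"),
        "charset": "utf8mb4",
    }

# Passing a plain dict instead of os.environ makes the helper easy to try out.
settings = db_settings({"DB_HOST": "db.example.org"})

# The resulting dictionary can be unpacked into the connect call:
# conn = pymysql.connect(**settings)
```

This keeps credentials out of the source files and lets the cron job and the web interface share one configuration.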
- Create database tables:
  The tables will be created automatically when you run the scripts for the first time. The application creates:
  - edit_counts: Stores raw monthly edit counts
  - community_alerts: Stores detected activity peaks
- Collect data:
  python fetch_and_store_cron.py
  This fetches 3 years of edit data for all Wikimedia projects (may take 30+ minutes).
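To make the three-year window concrete, the sketch below lists the 36 months that precede a given date. Whether the script includes the current, still incomplete month is not stated in this README, so excluding it here is an assumption, and month_window is a hypothetical helper name.

```python
from datetime import date

def month_window(today, years=3):
    """Return first-of-month dates for the `years` years before today's month."""
    months = []
    year, month = today.year - years, today.month
    # Walk forward month by month, stopping before the current month.
    while (year, month) < (today.year, today.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

window = month_window(date(2024, 6, 15))
# Covers June 2021 through May 2024.
```

With one request per project over this range, the long runtime noted above comes mainly from the number of projects rather than the number of months.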
- Generate alerts:
  python "Community alerts .py"
  This analyzes the data and detects activity peaks.
- Start the web interface:
  python app.py
  Visit http://localhost:5000 to explore the data.
- MySQL connection issues: Verify MySQL is running and credentials are correct
- Permission errors: Ensure the MySQL user has proper privileges on the database
- Port conflicts: MySQL listens on port 3306 by default and Flask on port 5000; make sure no other service is already using either port
- OS-specific MySQL setup: Refer to official MySQL documentation for your operating system
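A quick way to check the first and third items is to test whether something is listening on the relevant ports. The helper below is a small diagnostic sketch using only the standard library; port_open is a hypothetical name, not part of the project.

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 3306 and 5000 are the default ports mentioned above.
for name, port in (("MySQL", 3306), ("Flask", 5000)):
    state = "in use" if port_open("localhost", port) else "free"
    print(f"{name} port {port}: {state}")
```

If port 3306 reports "free", MySQL is probably not running. If port 5000 reports "in use" before app.py has been started, another process holds it and Flask will fail to bind.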
The web interface allows you to:
- Select different Wikimedia language communities and projects
- Set custom date ranges with an interactive slider
- View detected activity peaks in both table and chart format
- Click on chart peaks to add labels and annotations