oshp/oshp-stats

OWASP Secure Headers Project statistics

📊 Statistics about HTTP response security headers usage mentioned by the OWASP Secure Headers Project (OSHP).

💾 This project gathers data about the usage of HTTP response security headers into a SQLite database, so that statistics can be generated in a later step.

💡 See this issue for details.

Data source

Tip

💡 The MAJESTIC list was used instead of the CISCO Top 1 million sites CSV file because it contains fewer malware domains.

# Download the MAJESTIC Top 1 million sites CSV file
$ wget http://downloads.majestic.com/majestic_million.csv
# Transform the downloaded file into an input source that uses the same format
# as the CISCO Top 1 million sites CSV file (skip the header, keep rank and domain)
$ awk -F "," 'NR>1 {print $1 "," $3}' majestic_million.csv > data/input.csv
$ rm majestic_million.csv
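
For readers who prefer to stay in Python, the same transformation can be sketched as follows. This is a minimal illustration and not part of the project's scripts: the function names `to_rank_domain` and `convert` are hypothetical, and the only assumption about the MAJESTIC file is the one the awk command already makes (a header row, the ranking in column 1 and the domain in column 3).

```python
import csv
from typing import Iterable, Iterator, List, Tuple


def to_rank_domain(rows: Iterable[List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (ranking, domain) pairs from parsed CSV rows, skipping the header row.

    Mirrors the awk expression: NR>1 {print $1 "," $3}
    """
    for index, row in enumerate(rows):
        if index == 0:
            continue  # header row, same as NR>1 in awk
        yield row[0], row[2]  # awk fields $1 and $3 are indexes 0 and 2


def convert(src_path: str, dst_path: str) -> None:
    """Read the downloaded CSV file and write the ranking,domain input file."""
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, lineterminator="\n")
        writer.writerows(to_rank_domain(csv.reader(src)))
```

Calling `convert("majestic_million.csv", "data/input.csv")` would then produce the same kind of file as the awk command above.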

Scripts

Note

📦 They are all stored in the scripts folder and are written in Python 3.x.

Important

⚠️ The generate_stats_md_file script is no longer used: it was replaced by a workflow on the main OSHP site.

💻 Visual Studio Code is used to develop the scripts. A Visual Studio Code workspace file with recommended extensions is provided for the project.

📑 Files:

  • gather_data: Script that gathers information about HTTP security headers usage into a SQLite database, based on the "MAJESTIC Top 1 million sites CSV file" data source.
  • generate_stats_md_file: Script that uses the gathered data to generate or update the markdown stats file, with mermaid pie charts showing different statistics about HTTP security headers usage (⚠️ not used anymore).

Data

Note

📦 They are all stored in the data folder.

📑 Files:

  • input.csv: MAJESTIC Top 1 million sites list, formatted as one ranking,domain entry per line.
  • data.db: SQLite database with information about HTTP security headers usage.
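
To give an idea of how header usage could be stored and queried with SQLite, here is a minimal sketch. The actual schema of data.db is not described in this README, so the `header_usage` table, its columns, and both function names are hypothetical and chosen only for illustration.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical schema: the real layout of data.db may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS header_usage (
    domain TEXT NOT NULL,
    header_name TEXT NOT NULL,
    header_value TEXT
)
"""


def store_headers(conn: sqlite3.Connection, domain: str, headers: Dict[str, str]) -> None:
    """Store the response headers observed for one domain."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO header_usage (domain, header_name, header_value) VALUES (?, ?, ?)",
        # Header names are case-insensitive, so normalize them to lowercase.
        [(domain, name.lower(), value) for name, value in headers.items()],
    )
    conn.commit()


def count_domains_per_header(conn: sqlite3.Connection) -> List[Tuple[str, int]]:
    """Return (header_name, number of distinct domains using it), sorted by name."""
    return conn.execute(
        "SELECT header_name, COUNT(DISTINCT domain) "
        "FROM header_usage GROUP BY header_name ORDER BY header_name"
    ).fetchall()
```

With a connection opened via `sqlite3.connect("data/data.db")` (or `":memory:"` for experiments), `count_domains_per_header` gives the kind of per-header totals that a usage pie chart could be built from.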

Data and statistics update

Note

💡 Only the first 150000 entries of the CSV data source are used, so that processing fits within the time limit allowed for GitHub Actions workflows on the free tier.
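
One way to apply such a cap without loading the whole file is `itertools.islice`. The sketch below is illustrative only: `first_entries` and `MAX_ENTRIES` are hypothetical names, not taken from the gather_data script, and the input is assumed to follow the ranking,domain format described above.

```python
import csv
import io
from itertools import islice
from typing import Iterator, TextIO, Tuple

MAX_ENTRIES = 150_000  # cap mentioned in the note above


def first_entries(stream: TextIO, limit: int = MAX_ENTRIES) -> Iterator[Tuple[int, str]]:
    """Yield at most `limit` (ranking, domain) pairs from a ranking,domain CSV stream."""
    # islice stops reading after `limit` rows, so the rest of the file is never parsed.
    for row in islice(csv.reader(stream), limit):
        yield int(row[0]), row[1]


# Small in-memory demonstration with a limit of 2 instead of 150000.
sample = io.StringIO("1,google.com\n2,facebook.com\n3,youtube.com\n")
print(list(first_entries(sample, limit=2)))
# [(1, 'google.com'), (2, 'facebook.com')]
```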

💻 The update is scheduled in the following way:

  • On the first day of every month, the database is updated via this workflow.
  • On the fifth day of every month, the statistics are updated via this workflow.
