@@ -11,21 +11,42 @@ This repository contains the code and data for our [paper](https://insert.link.w

 ```
 llm-stylometry/
-├── llm_stylometry/         # Python package with analysis tools
-│   ├── core/               # Core experiment and configuration
-│   ├── data/               # Data loading and tokenization
-│   ├── models/             # Model utilities
-│   ├── analysis/           # Statistical analysis
-│   └── visualization/      # Plotting and visualization
-├── code/                   # Original analysis scripts
-├── data/                   # Datasets and results
-│   ├── raw/                # Original texts from Project Gutenberg
-│   ├── cleaned/            # Preprocessed texts by author
-│   └── model_results.pkl   # Consolidated model training results
-├── models/                 # Model configurations and logs
-└── paper/                  # LaTeX paper and figures
-    ├── main.tex            # Paper source
-    └── figs/               # Paper figures
+├── .github/                # GitHub Actions CI/CD
+│   └── workflows/          # Test automation workflows
+├── llm_stylometry/         # Python package with analysis tools
+│   ├── analysis/           # Statistical analysis utilities
+│   ├── core/               # Core experiment and configuration
+│   ├── data/               # Data loading and tokenization
+│   ├── models/             # Model utilities
+│   ├── utils/              # Helper utilities
+│   ├── visualization/      # Plotting and visualization
+│   └── cli_utils.py        # CLI helper functions
+├── code/                   # Original analysis scripts
+│   ├── main.py             # Model training script
+│   ├── clean.py            # Data preprocessing
+│   └── ...                 # Various analysis scripts
+├── data/                   # Datasets and results
+│   ├── raw/                # Original texts from Project Gutenberg
+│   ├── cleaned/            # Preprocessed texts by author
+│   ├── model_results.pkl   # Consolidated model training results
+│   └── model_results.csv   # Model results in CSV format
+├── models/                 # Trained models (80 total)
+│   └── {author}_tokenizer=gpt2_seed={0-9}/
+├── paper/                  # LaTeX paper and figures
+│   ├── main.tex            # Paper source
+│   ├── main.pdf            # Compiled paper
+│   └── figs/               # Paper figures
+├── tests/                  # Test suite
+│   ├── data/               # Test data and fixtures
+│   ├── test_*.py           # Test modules
+│   └── check_outputs.py    # Output validation script
+├── generate_figures.py     # Main CLI entry point
+├── run_llm_stylometry.sh   # Shell wrapper for easy setup
+├── LICENSE                 # MIT License
+├── README.md               # This file
+├── requirements-dev.txt    # Development dependencies
+├── pyproject.toml          # Package configuration
+└── pytest.ini              # Test configuration
 ```
 
 ## Installation