This is a script that scrapes jobs.ch and itjobs.ch, based on keywords related to IT, to summarize, visualize, and filter interesting jobs.
Last Data Downloaded: 26 February, 2025
- A line plot showing the daily count of publications since 1.1.2025.
For more analysis, see analysis.md
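The daily publication count behind the line plot could be computed roughly as follows. This is a minimal sketch, not the project's actual code: the function name `daily_publication_counts` and the field name `publication_date` (an ISO 8601 string) are assumptions about the data layout.

```python
from collections import Counter
from datetime import date, datetime


def daily_publication_counts(jobs, since=date(2025, 1, 1)):
    """Count jobs per publication day, ignoring jobs published before `since`."""
    counts = Counter()
    for job in jobs:
        # "publication_date" is an assumed field name holding an ISO 8601 string.
        published = datetime.fromisoformat(job["publication_date"]).date()
        if published >= since:
            counts[published] += 1
    # Return the days in chronological order, ready for plotting.
    return dict(sorted(counts.items()))
```

The returned mapping can be passed to any plotting library, with the keys as the x-axis and the values as the y-axis.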
Create a Conda environment called itjobs from the environment.yml file and install the required dependencies.
conda env create -f environment.yml
conda activate itjobs
If you want to download the latest data from jobs.ch and/or itjobs.ch, run the scraper.
Alternatively, you can skip this step and use the already downloaded data in data/jobs.json.
python src/scraper.py
To clean and extract important information from the raw file jobs.json, run the preprocessing script, which will create the jobs_preprocessed.json file in the data folder.
Alternatively, you can skip this step and use the already preprocessed data in data/jobs_preprocessed.json.
python src/preprocessing.py
Warning: if you change preprocessing.py, you should delete jobs_preprocessed.json and run it again. However, you will lose your ratings on the jobs.
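One way to avoid losing ratings is to back them up before deleting the file and copy them back after rerunning the preprocessing. The sketch below is not part of the repository. It assumes that jobs_preprocessed.json holds a JSON list of job objects, each with a unique `id` field and an optional `rating` field. Both field names, the helper names, and the backup file are hypothetical.

```python
import json
from pathlib import Path


def save_ratings(processed_path, backup_path):
    """Write a {job id: rating} mapping for every rated job to backup_path."""
    jobs = json.loads(Path(processed_path).read_text(encoding="utf-8"))
    # "id" and "rating" are assumed field names; a rating of 0 is kept.
    ratings = {
        str(job["id"]): job["rating"]
        for job in jobs
        if job.get("rating") is not None
    }
    Path(backup_path).write_text(json.dumps(ratings, indent=2), encoding="utf-8")
    return ratings


def restore_ratings(processed_path, backup_path):
    """Copy backed-up ratings onto the jobs in a regenerated file."""
    ratings = json.loads(Path(backup_path).read_text(encoding="utf-8"))
    jobs = json.loads(Path(processed_path).read_text(encoding="utf-8"))
    restored = 0
    for job in jobs:
        key = str(job["id"])
        if key in ratings:
            job["rating"] = ratings[key]
            restored += 1
    Path(processed_path).write_text(json.dumps(jobs, indent=2), encoding="utf-8")
    return restored
```

Call `save_ratings` before deleting the file and `restore_ratings` after the preprocessing script has recreated it. Jobs that no longer appear in the new file are skipped.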
You can review the downloaded jobs by running the review script. It shows each downloaded job that has not been rated yet, and you can rate it from 0 to 9.
python src/review.py
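The review flow described above (skip rated jobs, accept only ratings from 0 to 9) can be sketched as follows. This is an illustration under assumptions, not the contents of review.py: the `title` and `rating` field names and all function names are hypothetical.

```python
def parse_rating(text):
    """Return an int in 0..9 for valid input, otherwise None."""
    text = text.strip()
    if len(text) == 1 and text in "0123456789":
        return int(text)
    return None


def unrated(jobs):
    """Select the jobs that have no rating yet ("rating" is an assumed field)."""
    return [job for job in jobs if job.get("rating") is None]


def review(jobs, ask=input):
    """Prompt for a rating on every unrated job, repeating until the input is valid."""
    for job in unrated(jobs):
        rating = None
        while rating is None:
            rating = parse_rating(ask(f"{job.get('title', '(no title)')} - rating 0-9: "))
        job["rating"] = rating
    return jobs
```

Passing the prompt function as the `ask` parameter keeps the loop testable without a terminal; by default it uses the built-in `input`.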