Daily Bulletin Scrapers

This repository contains Python scripts that download and parse “company news” items from the daily bulletins published by various Turkish brokerage firms. Each script retrieves the PDF bulletin, extracts the relevant text using pdfminer, matches ticker symbols against a reference CSV of BIST-traded stocks, and appends the parsed news items to a master CSV file.

oyak_yatirim.py: Downloads and parses daily bulletins from Oyak Yatırım Menkul Değerler A.Ş.
piramit_yatirim.py: Downloads and parses daily bulletins from Piramit Menkul Kıymetler A.Ş., using Selenium to navigate to their site.
tacirler_yatirim.py: Downloads and parses daily bulletins from Tacirler Yatırım Menkul Kıymetler A.Ş., also using Selenium.
vakif_yatirim.py: Downloads and parses daily bulletins from Vakıf Yatırım Menkul Değerler A.Ş., using Selenium.
ziraat_yatirim.py: Downloads and parses daily bulletins from Ziraat Yatırım Menkul Değerler A.Ş.

Each script extracts the relevant “company news” section from the PDF, identifies the ticker symbols, and appends results into the shared CSV file.

How It Works

Script Execution: Each script defines a function (e.g., oyak_yatirim()) which you can import or run directly. When run:
- It attempts to download the day’s PDF from the brokerage’s website.
- It uses pdfminer to convert the PDF to text.
- It extracts the “company news” subsections with regular expressions.
- It splits out candidate ticker symbols and uses fuzzy matching (via fuzzywuzzy) to map each one to a valid BIST stock code.
- It appends the results, including the date, time, broker name, and bulletin URL, to the master CSV.
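The extraction and append steps above can be sketched roughly as follows. This is a minimal illustration, not the code used in the scripts: the section header `Şirket Haberleri`, the `TICKER: text` item layout, the column order, and the names `extract_news` and `append_news` are all assumptions, and an in-memory buffer stands in for the master CSV.

```python
import csv
import io
import re
from datetime import datetime

# Assumed layout: a "Şirket Haberleri" header followed by "TICKER: text" lines,
# terminated by a blank line. Real bulletins differ from broker to broker.
SECTION_RE = re.compile(r"Şirket Haberleri\n(.*?)(?:\n\s*\n|\Z)", re.DOTALL)
ITEM_RE = re.compile(r"^([A-Z]{4,5}):\s*(.+)$", re.MULTILINE)


def extract_news(text):
    """Return (ticker, news) pairs found in the company news section."""
    section = SECTION_RE.search(text)
    if section is None:
        return []
    return ITEM_RE.findall(section.group(1))


def append_news(out, items, broker, url, now):
    """Write one CSV row per news item (hypothetical column order)."""
    writer = csv.writer(out)
    for ticker, news in items:
        writer.writerow(
            [now.strftime("%Y-%m-%d"), now.strftime("%H:%M"), broker, ticker, news, url]
        )


sample = (
    "Piyasa Yorumu\n"
    "Endeks yatay seyretti.\n"
    "\n"
    "Şirket Haberleri\n"
    "THYAO: Yeni uçak siparişi açıklandı.\n"
    "ASELS: Yeni sözleşme imzalandı.\n"
    "\n"
    "Ekonomik Takvim\n"
)

items = extract_news(sample)
buffer = io.StringIO()  # stands in for the master CSV opened in append mode
append_news(
    buffer,
    items,
    "Oyak Yatırım",
    "https://example.com/bulletin.pdf",  # placeholder URL
    datetime(2024, 1, 2, 9, 30),
)
```

With a real file, the buffer would be replaced by something like `open(path, "a", newline="", encoding="utf-8")`, where `path` is wherever the master CSV lives.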
Data Sources: A reference CSV file containing valid BIST tickers. Each script fuzzy-matches the extracted symbols against this list so that tickers are identified correctly.
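To illustrate the idea of matching against the reference list, here is a small sketch. It uses the standard library's `difflib` as a stand-in for fuzzywuzzy so that it runs without extra packages; the `ticker` column name, the cutoff value, and the function names are assumptions rather than details taken from the scripts.

```python
import csv
import difflib
import io


def load_tickers(csv_file):
    """Read ticker codes from a reference CSV with an assumed 'ticker' column."""
    return [row["ticker"].strip().upper() for row in csv.DictReader(csv_file)]


def match_ticker(candidate, tickers, cutoff=0.75):
    """Return the closest valid ticker, or None if nothing is similar enough."""
    candidate = candidate.strip().upper()
    if candidate in tickers:
        return candidate
    close = difflib.get_close_matches(candidate, tickers, n=1, cutoff=cutoff)
    return close[0] if close else None


# In-memory stand-in for the reference CSV of BIST tickers.
reference = io.StringIO("ticker\nTHYAO\nASELS\nGARAN\n")
tickers = load_tickers(reference)

exact = match_ticker(" garan ", tickers)   # normalised, then matched exactly
repaired = match_ticker("THYA0", tickers)  # digit 0 misread for the letter O
unknown = match_ticker("XXXXX", tickers)   # nothing close enough
```

fuzzywuzzy's `process.extractOne` plays a similar role, scoring on a 0 to 100 scale instead of 0 to 1. Whichever library is used, the cutoff is a trade-off: too low and unrelated strings get mapped to real tickers, too high and minor PDF extraction errors go unrepaired.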
Error Handling: If the PDF layout or URL changes on a brokerage site, the script may fail to download or parse the bulletin, raising, for example, a PDFSyntaxError from pdfminer or a NoSuchElementException in the Selenium-based scripts.
If no “company news” is found, each script logs that no news is available for the day.
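One way to keep a single failing broker from stopping a daily run is to wrap each scraper call, as in the sketch below. It assumes each scraper is a callable that returns a list of parsed items, which may not match the actual functions; `run_safely` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("bulletin_scrapers")  # hypothetical logger name


def run_safely(name, scraper, expected_errors=(Exception,)):
    """Run one scraper; return its item count, or None if it failed."""
    try:
        items = scraper()
    except expected_errors as exc:
        # A changed URL or PDF layout ends up here instead of aborting the run.
        logger.error(
            "%s: bulletin could not be processed (%s: %s)",
            name,
            type(exc).__name__,
            exc,
        )
        return None
    if not items:
        logger.info("%s: no company news available today", name)
        return 0
    return len(items)


def broken_scraper():
    """Simulates a scraper whose source page has changed."""
    raise ValueError("layout changed")


empty_result = run_safely("empty", lambda: [])
ok_result = run_safely("ok", lambda: [("THYAO", "example item")])
failed_result = run_safely("broken", broken_scraper)
```

In practice `expected_errors` could be narrowed to the exceptions mentioned above, such as pdfminer's `PDFSyntaxError` and Selenium's `NoSuchElementException`, so that unrelated bugs still surface.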

Requirements

Python 3.7+
pip packages
- requests
- pdfminer
- fuzzywuzzy
- pandas
- numpy
- selenium
ChromeDriver (for the Selenium-based scripts)

Feel free to open issues or submit pull requests if you discover bugs or want to add new features.

merveogretmek/daily-bulletin-scrapers
