Add Bulk Download Feature #4

Merged 6 commits on Aug 19, 2025

Michal-Babins
Collaborator

Add a class for efficient, robust bulk downloads from NCBI, and a script to run it

- Implemented BulkNCBIGenomeDownloader for large-scale genome downloads, with checkpointing, parallel downloads, and error handling.
- Created GtdbSpeciesCorrector to correct species classifications from GTDB to match NCBI taxonomy.
- Enhanced NCBIDatasetClient to support custom headers in API requests.
- Added command-line scripts for both bulk genome downloading and GTDB species correction.
- Updated setup.py to include psutil as a dependency.
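
For reviewers unfamiliar with the pattern, here is a minimal sketch of how checkpointing, parallel downloads, and per-item error handling can fit together. It is not the code in this PR: `download_with_checkpoint`, the `fetch` callable, and the JSON checkpoint layout are hypothetical names and choices made for illustration, and the actual BulkNCBIGenomeDownloader may be structured quite differently.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable


def download_with_checkpoint(
    accessions: Iterable[str],
    fetch: Callable[[str], None],  # hypothetical: downloads one accession
    checkpoint_path: Path,         # hypothetical: JSON list of finished accessions
    max_workers: int = 4,
) -> dict:
    """Run `fetch` for each accession in parallel, skipping finished ones."""
    # Load accessions completed by a previous run, if any.
    done = set()
    if checkpoint_path.exists():
        done = set(json.loads(checkpoint_path.read_text()))

    # De-duplicate while preserving order, then drop what is already done.
    pending = [acc for acc in dict.fromkeys(accessions) if acc not in done]
    failed = {}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, acc): acc for acc in pending}
        for future in as_completed(futures):
            acc = futures[future]
            try:
                future.result()
            except Exception as exc:
                # Record the failure and keep going with the other downloads.
                failed[acc] = str(exc)
                continue
            done.add(acc)
            # Persist progress after every success so a crash loses little work.
            # Only this thread writes the file, so no lock is needed.
            checkpoint_path.write_text(json.dumps(sorted(done)))

    return {"completed": sorted(done), "failed": failed}
```

Calling the function a second time with the same checkpoint file retries only the accessions that are not yet recorded as completed, which is the behaviour that makes an interrupted bulk run resumable.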
@Michal-Babins requested a review from xonq August 13, 2025 15:43
@Michal-Babins changed the title from "Add Bulk Downoad Feature" to "Add Bulk Download Feature" Aug 18, 2025
@Michal-Babins merged commit 9953560 into main Aug 19, 2025