
Optimize pmtiles generation even more #1383

@jcpitre

Description

This issue describes different ways we could optimize pmtiles generation:

Optimization for process memory:

  • Read stop_times.txt via HTTP and use chunking
    • Right now we copy the whole file to the in-memory file system and then read it into the process, even though we only need a subset of the information in this file. We could skip the copy by reading the file in chunks via HTTP, which would save the in-memory disk space it currently uses (see the first sketch after this list).
  • Download files on demand and delete them right after they are used
    • Currently we download all files in one step and then process them. The in-memory file system uses process memory, pushing the memory requirements higher.
    • When possible, we should download a file, process it, then delete it before downloading the next one (see the second sketch after this list).
    • Example: the mdb-2014 feed has over 8GB of file data and cannot be processed right now.
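
A minimal sketch of the chunked HTTP idea for stop_times.txt, assuming the file is reachable at a URL and that the columns we keep are representative; the URL and column names below are placeholders, not the real configuration.

```python
# Stream stop_times.txt over HTTP instead of copying the whole file to the
# in-memory file system first. URL and column subset are assumptions.
import csv

import requests

STOP_TIMES_URL = "https://example.com/extracted/stop_times.txt"  # hypothetical location
NEEDED_COLUMNS = ("trip_id", "stop_id", "stop_sequence")  # assumed subset


def stream_stop_times(url: str = STOP_TIMES_URL):
    """Yield one dict per row, keeping only the columns we need."""
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        # iter_lines() pulls data from the socket in chunks, so the whole
        # file never has to sit in memory or on the in-memory file system.
        lines = (line.decode("utf-8") for line in response.iter_lines() if line)
        reader = csv.DictReader(lines)
        for row in reader:
            yield {column: row[column] for column in NEEDED_COLUMNS}


if __name__ == "__main__":
    for i, row in enumerate(stream_stop_times()):
        if i >= 3:
            break
        print(row)
```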
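
And a minimal sketch of the download / process / delete pattern, assuming files are fetched over HTTP one at a time into a temporary directory; the base URL, file list, and process_file() hook are hypothetical placeholders for illustration.

```python
# Download one file, process it, then delete it before fetching the next,
# so only a single file occupies space at any time. Names are assumptions.
import os
import tempfile

import requests

FEED_BASE_URL = "https://example.com/extracted"  # hypothetical base URL
FILES_TO_PROCESS = ["stops.txt", "trips.txt", "shapes.txt"]  # assumed order


def process_file(path: str) -> None:
    """Placeholder for the real per-file processing step."""
    print(f"processing {path} ({os.path.getsize(path)} bytes)")


def process_feed_files() -> None:
    with tempfile.TemporaryDirectory() as workdir:
        for name in FILES_TO_PROCESS:
            local_path = os.path.join(workdir, name)
            # Download a single file in chunks...
            with requests.get(f"{FEED_BASE_URL}/{name}", stream=True, timeout=60) as r:
                r.raise_for_status()
                with open(local_path, "wb") as out:
                    for chunk in r.iter_content(chunk_size=1 << 20):
                        out.write(chunk)
            # ...process it, then delete it before downloading the next one.
            process_file(local_path)
            os.remove(local_path)


if __name__ == "__main__":
    process_feed_files()
```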

Optimization for time:

  • We are currently constrained by the 30-minute timeout that Cloud Tasks impose when using an HTTP target; big feeds can exceed this limit.
  • TBD

Feel free to add to these lists. We can create separate issues as we work on individual items.

Metadata

Labels

enhancement (New feature or request)
