45 changes: 43 additions & 2 deletions README.md
@@ -208,11 +208,50 @@ segments_df = get_gtfs_segments("path_to_gtfs_zip_file", parallel = True)

Alternatively, filter to a specific agency by passing `agency_id` as a string, or to multiple agencies by passing a list such as `["SFMTA"]`.

```
```python
segments_df = get_gtfs_segments("path_to_gtfs_zip_file",agency_id = "SFMTA")
segments_df
```

### Analyze Specific Dates

By default, `get_gtfs_segments` analyzes the busiest day in the GTFS schedule. To analyze one or more specific dates instead, use the `date` parameter:

#### **Single Date Analysis**
```python
# Analyze a specific day using string format
segments_df = get_gtfs_segments("path_to_gtfs_zip_file", date="20240317") # YYYYMMDD format

# Using datetime.date object
from datetime import date
segments_df = get_gtfs_segments("path_to_gtfs_zip_file", date=date(2024, 3, 17))

# Combined with other parameters
segments_df = get_gtfs_segments(
    "path_to_gtfs_zip_file",
    agency_id="SFMTA",
    date="20240317",
    max_spacing=1000,
    parallel=True
)
```

#### **Multiple Date Analysis**
```python
# Multiple dates using strings
segments_df = get_gtfs_segments(
    "path_to_gtfs_zip_file",
    date=["20220315", "20220316", "20220317"]  # Tue, Wed, Thu
)

# Mixed date types (strings and date objects)
from datetime import date
segments_df = get_gtfs_segments(
    "path_to_gtfs_zip_file",
    date=["20220315", date(2022, 3, 16)]
)
```

<div align='center'><a>
<img src="https://raw.githubusercontent.com/UTEL-UIUC/gtfs_segments/main/images/data.jpg" alt="data" width=600>
</a></div>
@@ -223,7 +262,7 @@ Table generated by gtfs-segments using data from San Francisco’s Muni system.
3. `stop_id2`: The identifier of the segment's ending stop.
4. `route_id`: The same route ID listed in the agency's routes.txt file.
5. `direction_id`: The route's direction identifier.
6. `traversals`: The number of times the indicated route traverses the segment during the "measurement interval." The "measurement interval" chosen is the busiest day in the GTFS schedule: the day which has the most bus services running.
6. `traversals`: The number of times the indicated route traverses the segment during the "measurement interval." The "measurement interval" can be a single date, multiple dates (aggregated), or the busiest day in the GTFS schedule (default behavior).
7. `distance`: The length of the bus segment in meters.
8. `geometry`: The segment's LINESTRING (a format for encoding geographic paths) written in WGS84 (EPSG:4326) coordinates, that is, unprojected longitude-latitude pairs, as used in GTFS.
9. `traversal_time`: The time (in seconds) that it takes for the bus to traverse the segment.
@@ -454,3 +493,5 @@ Project Link: [https://github.com/UTEL-UIUC/gtfs_segments](https://github.com/UT
[issues-url]: https://github.com/UTEL-UIUC/gtfs_segments/issues
[license-shield]: https://img.shields.io/github/license/UTEL-UIUC/gtfs_segments.svg?style=for-the-badge
[license-url]: https://github.com/UTEL-UIUC/gtfs_segments/blob/master/LICENSE


57 changes: 54 additions & 3 deletions docs/usage.md
@@ -46,10 +46,58 @@ from gtfs_segments import get_gtfs_segments
segments_df = get_gtfs_segments("path_to_gtfs_zip_file")
```
Alternatively, filter to a specific agency by passing `agency_id` as a string, or to multiple agencies by passing a list such as `["SFMTA"]`.
```
```python
segments_df = get_gtfs_segments("path_to_gtfs_zip_file",agency_id = "SFMTA")
segments_df
```

### Analyze Specific Dates

By default, `get_gtfs_segments` analyzes the busiest day in the GTFS schedule. The `date` parameter accepts a single date or a list of dates, and reports an error when a requested date is not valid for the feed:

#### **Single Date Analysis**
```python
# Analyze a specific Sunday using string format
segments_df = get_gtfs_segments("path_to_gtfs_zip_file", date="20240317") # YYYYMMDD format

# Using datetime.date object
from datetime import date
segments_df = get_gtfs_segments("path_to_gtfs_zip_file", date=date(2024, 3, 17))

# Combined with other parameters
segments_df = get_gtfs_segments(
    "path_to_gtfs_zip_file",
    agency_id="SFMTA",
    date="20240317",
    max_spacing=1000,
    parallel=True
)
```

#### **Multiple Date Analysis**
```python
# Multiple dates using strings
segments_df = get_gtfs_segments(
    "path_to_gtfs_zip_file",
    date=["20220315", "20220316", "20220317"]  # Tue, Wed, Thu
)

# Mixed date types (strings and date objects)
from datetime import date
segments_df = get_gtfs_segments(
    "path_to_gtfs_zip_file",
    date=["20220315", date(2022, 3, 16)]
)
```

#### **Performance & Technical Notes**

- **Service Aggregation**: When multiple dates are provided, services from all dates are combined and aggregated.
- **Memory Usage**: Larger date ranges use more memory during processing.
- **Processing Time**: Multiple dates may increase processing time; `parallel=True` can offset some of that cost.
- **Backward Compatibility**: All existing functionality remains unchanged; new parameters are optional.
- **Date Validation**: Dates are validated against the GTFS calendar. An invalid date produces an error message that lists the available date ranges.

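The accepted input shapes and the date validation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `normalize_dates`, `check_within_range`, and the feed start and end dates are hypothetical names and values, and the actual checks inside `get_gtfs_segments` may differ.

```python
from datetime import date, datetime
from typing import Iterable, List, Union

DateLike = Union[str, date]


def normalize_dates(value: Union[DateLike, Iterable[DateLike]]) -> List[date]:
    """Hypothetical helper: coerce one date or a list of dates to date objects."""
    items = [value] if isinstance(value, (str, date)) else list(value)
    result = []
    for item in items:
        if isinstance(item, datetime):
            # datetime is a subclass of date; keep only the calendar day
            result.append(item.date())
        elif isinstance(item, date):
            result.append(item)
        else:
            # Accept strictly eight digits in YYYYMMDD order
            if len(item) != 8 or not item.isdigit():
                raise ValueError(f"Expected YYYYMMDD, got {item!r}")
            # strptime raises ValueError for impossible dates such as 20240230
            result.append(datetime.strptime(item, "%Y%m%d").date())
    return result


def check_within_range(dates: List[date], first_day: date, last_day: date) -> List[date]:
    """Hypothetical helper: raise ValueError naming the available range."""
    outside = [d for d in dates if not first_day <= d <= last_day]
    if outside:
        bad = ", ".join(d.strftime("%Y%m%d") for d in outside)
        raise ValueError(
            f"Dates not covered by the feed: {bad}. "
            f"Available range: {first_day:%Y%m%d} to {last_day:%Y%m%d}."
        )
    return dates


# Assumed feed coverage, for illustration only
requested = normalize_dates(["20220315", date(2022, 3, 16)])
check_within_range(requested, date(2022, 1, 1), date(2022, 6, 30))
```

Normalizing to `datetime.date` up front means the rest of a pipeline only has to handle one type, whichever form the caller passed.
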
<div align='center'><a>
<img src="https://raw.githubusercontent.com/UTEL-UIUC/gtfs_segments/main/images/data.jpg" alt="data" width=600>
</a></div>
@@ -60,7 +108,7 @@ Table generated by gtfs-segments using data from the San Francisco’s Muni syst
3. `stop_id2`: The identifier of the segment's ending stop.
4. `route_id`: The same route ID listed in the agency's routes.txt file.
5. `direction_id`: The route's direction identifier.
6. `traversals`: The number of times the indicated route traverses the segment during the "measurement interval." The "measurement interval" chosen is the busiest day in the GTFS schedule: the day which has the most bus services running.
6. `traversals`: The number of times the indicated route traverses the segment during the "measurement interval." The "measurement interval" can be a single date, multiple dates (aggregated), or the busiest day in the GTFS schedule (default behavior).
7. `distance`: The length of the segment in meters.
8. `geometry`: The segment's LINESTRING (a format for encoding geographic paths). All geometries are re-projected to WGS84 (EPSG:4326), that is, unprojected longitude-latitude coordinates, to maintain consistency.
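
How `traversals` is aggregated over multiple dates is not spelled out in this list. As a sketch only, assuming aggregation means summing each segment's traversals across the selected dates (the library may combine them differently), the idea looks like this. The keys, counts, and the `sum_traversals` helper are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# A segment is identified here by (stop_id1, stop_id2, route_id, direction_id)
SegmentKey = Tuple[str, str, str, int]

# Hypothetical per-date traversal counts
day_1: Dict[SegmentKey, int] = {("A", "B", "14", 0): 10, ("B", "C", "14", 0): 8}
day_2: Dict[SegmentKey, int] = {("A", "B", "14", 0): 12}


def sum_traversals(per_day: Iterable[Dict[SegmentKey, int]]) -> Dict[SegmentKey, int]:
    """Assumed aggregation: add up each segment's traversals over all dates."""
    total: Counter = Counter()
    for counts in per_day:
        # Counter.update adds counts rather than replacing them
        total.update(counts)
    return dict(total)


combined = sum_traversals([day_1, day_2])
```

Under this assumption, a segment served on only one of the dates keeps that single day's count, so multi-date totals are not directly comparable to busiest-day totals.
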

@@ -103,7 +151,7 @@ summary_stats(segments_df,max_spacing = 3000,export = True,file_path = "summary.
## Get Route Summary Stats
```python
from gtfs_segments import get_route_stats,get_bus_feed
_,feed = get_bus_feed('path_to_gtfs.zip')
feed = get_bus_feed('path_to_gtfs.zip')
get_route_stats(feed)
```
Here each row contains the following columns:
@@ -134,3 +182,6 @@ export_segments(segments_df,'filename', output_format ='csv',geometry = False)
```

<p align="right">(<a href="#top">back to top</a>)</p>



73 changes: 39 additions & 34 deletions gtfs_segments/__init__.py
@@ -1,34 +1,39 @@
"""
The gtfs_segments package main init file.
"""
import importlib.metadata
from .geom_utils import view_heatmap, view_spacings, view_spacings_interactive
from .gtfs_segments import get_gtfs_segments, pipeline_gtfs, process_feed
from .mobility import (
    download_latest_data,
    fetch_gtfs_source,
    summary_stats_mobility,
)
from .partridge_func import get_bus_feed
from .route_stats import get_route_stats
from .utils import export_segments, plot_hist, process, summary_stats

__version__ = importlib.metadata.version("gtfs_segments")
__all__ = [
    "__version__",
    "get_gtfs_segments",
    "pipeline_gtfs",
    "process_feed",
    "export_segments",
    "plot_hist",
    "fetch_gtfs_source",
    "summary_stats",
    "process",
    "view_spacings",
    "view_spacings_interactive",
    "view_heatmap",
    "summary_stats_mobility",
    "download_latest_data",
    "get_route_stats",
    "get_bus_feed",
]
"""
The gtfs_segments package main init file.
"""
import importlib.metadata
from .geom_utils import view_heatmap, view_spacings, view_spacings_interactive
from .gtfs_segments import get_gtfs_segments, pipeline_gtfs, process_feed
from .mobility import (
    download_latest_data,
    fetch_gtfs_source,
    summary_stats_mobility,
)
from .partridge_func import get_bus_feed
from .route_stats import get_route_stats
from .utils import export_segments, plot_hist, process, summary_stats

try:
    __version__ = importlib.metadata.version("gtfs_segments")
except importlib.metadata.PackageNotFoundError:
    # Fallback version for development/testing when package is not installed
    __version__ = "dev"

__all__ = [
    "__version__",
    "get_gtfs_segments",
    "pipeline_gtfs",
    "process_feed",
    "export_segments",
    "plot_hist",
    "fetch_gtfs_source",
    "summary_stats",
    "process",
    "view_spacings",
    "view_spacings_interactive",
    "view_heatmap",
    "summary_stats_mobility",
    "download_latest_data",
    "get_route_stats",
    "get_bus_feed",
]