@AlessandroAnnini

Add Folder Export Feature with URL Support

Summary

This PR adds a new folders command that enables exporting all pages within a Confluence folder and its subfolders to Markdown. The feature includes:

Core Functionality

  • Recursive folder export: Automatically downloads all pages from a folder and all nested subfolders at any depth
  • Flexible input methods: Supports both folder IDs and folder URLs (e.g., https://company.atlassian.net/wiki/spaces/MYSPACE/folders/123456)
  • Confluence REST API v2 integration: Uses /api/v2/folders/ endpoints for folder operations
  • Cursor-based pagination: Handles large folders with many children efficiently
  • Error resilience: Continues processing even if some subfolders are inaccessible
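
The recursive export and error resilience described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `collect_page_ids` and the `get_children` callable are hypothetical names, and the assumption that each child is a dict with `type` and `id` keys is made for the example only.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def collect_page_ids(
    folder_id: str,
    get_children: Callable[[str], list[dict]],
) -> list[str]:
    """Collect page IDs from a folder and all nested subfolders.

    `get_children` is a hypothetical callable returning the direct children
    of a folder as dicts with "type" and "id" keys (an assumed shape).
    """
    try:
        children = get_children(folder_id)
    except Exception as exc:  # real code would catch a narrower HTTP error
        # An inaccessible folder is logged and skipped, not fatal.
        logger.warning("Skipping folder %s: %s", folder_id, exc)
        return []

    page_ids: list[str] = []
    for child in children:
        if child.get("type") == "page":
            page_ids.append(child["id"])
        elif child.get("type") == "folder":
            # Recurse into the subfolder; depth is unbounded.
            page_ids.extend(collect_page_ids(child["id"], get_children))
    return page_ids
```

Passing the fetch function in as a parameter keeps the traversal testable without a live Confluence instance.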

Usage

Export folder by ID:

confluence-markdown-exporter folders 3491123 ./output/

Export folder by URL (URLs are accepted the same way as in the pages command):

confluence-markdown-exporter folders https://company.atlassian.net/wiki/spaces/MYSPACE/folders/3491123 ./output/

Export multiple folders (mixing IDs and URLs):

confluence-markdown-exporter folders 3491123 https://company.atlassian.net/wiki/spaces/MYSPACE/folders/3491456 ./output/

Implementation Details

New Folder class (confluence_markdown_exporter/confluence.py):

  • from_id(folder_id): Fetch and create Folder from ID
  • from_url(folder_url): Parse folder URL and create Folder from extracted ID (supports multiple URL patterns)
  • export(): Export all pages with warning for empty folders
  • pages property: Recursively collects all page IDs from folder hierarchy
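
As an illustration of the URL handling in from_url(), the sketch below extracts a folder ID with a single regular expression. The function name `extract_folder_id` and the pattern are assumptions for this example; the PR's actual parsing logic may differ.

```python
import re

# Hypothetical pattern: a numeric ID following a "/folders/" path segment.
# It covers both /spaces/SPACE/folders/ID and /spaces/SPACE/pages/folders/ID.
_FOLDER_ID_RE = re.compile(r"/folders/(\d+)")


def extract_folder_id(folder_url: str) -> str:
    """Return the folder ID embedded in a Confluence folder URL.

    Raises:
        ValueError: If no folder ID can be found in the URL.
    """
    match = _FOLDER_ID_RE.search(folder_url)
    if match is None:
        raise ValueError(f"Could not extract folder ID from URL: {folder_url}")
    return match.group(1)
```

For example, `extract_folder_id("https://company.atlassian.net/wiki/spaces/MYSPACE/folders/123456")` yields the string `"123456"`, while a URL without a `/folders/<digits>` segment raises `ValueError`.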

API helper functions:

  • get_folder_by_id(): Fetches folder metadata using REST API v2
  • get_folder_children(): Fetches all children (pages and subfolders) with pagination
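
A cursor-based pagination loop of the kind get_folder_children() relies on might look like the sketch below. It assumes each response is a dict with a `results` list and an optional `_links.next` URL carrying a `cursor` query parameter; `collect_children` and `fetch_page` are hypothetical names, and the HTTP call itself is left to the caller.

```python
from typing import Callable, Optional
from urllib.parse import parse_qs, urlparse


def collect_children(fetch_page: Callable[[Optional[str]], dict]) -> list[dict]:
    """Gather all children across pages of a cursor-paginated listing.

    `fetch_page(cursor)` is a hypothetical callable that performs one request
    and returns the decoded JSON body; `cursor` is None for the first page.
    """
    children: list[dict] = []
    cursor: Optional[str] = None
    while True:
        payload = fetch_page(cursor)
        children.extend(payload.get("results", []))

        # Assumed response shape: the next page is advertised in _links.next.
        next_link = payload.get("_links", {}).get("next")
        if not next_link:
            break
        cursor_values = parse_qs(urlparse(next_link).query).get("cursor")
        if not cursor_values:
            break  # no cursor to follow, so stop rather than loop forever
        cursor = cursor_values[0]
    return children
```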

CLI integration (confluence_markdown_exporter/main.py):

  • New folders command following the same pattern as pages and spaces commands
  • Automatic detection of IDs vs URLs (checks for http:// or https:// prefix)
  • Optional --output-path parameter
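
The ID-versus-URL detection amounts to a prefix check. A minimal sketch, with `classify_folder_inputs` as a hypothetical helper name that is not taken from the PR:

```python
def classify_folder_inputs(values: list[str]) -> list[tuple[str, str]]:
    """Label each CLI argument as a folder URL or a folder ID.

    Anything starting with http:// or https:// is treated as a URL;
    everything else is assumed to be a folder ID.
    """
    labelled: list[tuple[str, str]] = []
    for value in values:
        kind = "url" if value.startswith(("http://", "https://")) else "id"
        labelled.append((kind, value))
    return labelled
```

A caller could then route "url" entries through Folder.from_url() and "id" entries through Folder.from_id().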

Documentation:

  • Updated README.md with folder export section and examples
  • Clear explanation that export is recursive
  • Instructions on finding folder IDs in Confluence URLs

Test Plan

Unit Tests

  • 89 unit tests passing (added 16 new tests)
  • TestGetFolderById: Tests for folder fetching and 404 handling
  • TestGetFolderChildren: Tests for pagination, empty folders, and HTTP errors
  • TestFolderClass: Tests for Folder class methods including:
    • from_json() and from_id() creation methods
    • from_url() with multiple URL patterns (/spaces/SPACE/folders/ID, /spaces/SPACE/pages/folders/ID, generic patterns)
    • Invalid URL handling (raises ValueError)
    • pages property with recursive traversal
    • export() method with and without pages
    • Empty folder warning logging

Code Quality

  • ✅ All linting checks passed (ruff)
  • ✅ Google-style docstrings on all new functions
  • ✅ Type hints throughout
  • ✅ 100 character line length maintained
  • ✅ Follows existing code patterns and conventions

Manual Testing

The feature has been tested with:

  • Single folders with pages
  • Nested folder structures (multiple levels deep)
  • Empty folders (logs warning as expected)
  • Large folders requiring pagination
  • Mixed ID and URL inputs
  • Invalid folder URLs (proper error handling)

Test Infrastructure

  • Fixed tests/conftest.py to mock the module-level Confluence instance during test collection
  • Ensures tests don't require actual Confluence authentication

Changes

Modified files:

  • README.md: Added folder export documentation and examples
  • confluence_markdown_exporter/confluence.py: Added Folder class and API helpers (~160 lines)
  • confluence_markdown_exporter/main.py: Added folders command (~20 lines)
  • tests/conftest.py: Fixed test infrastructure for module-level imports
  • tests/unit/test_confluence.py: Added comprehensive test suite (~250 lines)
  • tests/unit/test_main.py: Updated command list test

Backward Compatibility

✅ This feature is fully backward compatible:

  • No changes to existing commands or APIs
  • No breaking changes to configuration
  • Existing exports continue to work unchanged
  • New command follows established patterns

- Add Folder class to handle folder exports with recursive traversal
- Add folders CLI command to export pages from one or more folders
- Implement get_folder_by_id and get_folder_children API helpers
- Add cursor-based pagination for folder children
- Add comprehensive unit tests for Folder class and API functions
- Update README with folder export documentation and examples
- Handle empty folders with warning messages
- Support multiple folder IDs in single command
- Add Folder.from_url() method to parse folder URLs and extract IDs
- Support multiple URL patterns: /spaces/SPACE/folders/ID, /spaces/SPACE/pages/folders/ID
- Update folders command to accept both IDs and URLs with automatic detection
- Add comprehensive unit tests for URL parsing (4 new tests covering different patterns)
- Update README.md with folder URL usage examples
- Fix test infrastructure in conftest.py to handle module-level Confluence instance

All 89 unit tests passing. Folder export now supports the same flexible input
methods as page exports, allowing users to easily export folders by copying
URLs from their browser.

@AlessandroAnnini (Author)

Fixed!
