Skip to content

Conversation

@bennyz
Copy link
Member

@bennyz bennyz commented Nov 12, 2025

Add the --bmap argument to take in a bmap file.

For example:
./fls from-url -k --debug --bmap autosd.bmap https://download.autosd.sig.cento s.org/AutoSD-9/nightly/TI/auto-osbuild-am69sk-autosd9-qa-regular-aarch64-2154005 649.73882b72.raw.xz /dev/mmcblk1

This can arguably improve write performance

Summary by CodeRabbit

  • New Features
    • Added optional BMAP (Block Map) file support for block device writes, enabling efficient mapping of sparse image data to physical blocks.
    • Users can now provide a BMAP file path when writing flash images to block devices.
    • Enhanced block device detection and direct I/O handling across supported platforms.

@coderabbitai
Copy link

coderabbitai bot commented Nov 12, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR adds BMAP (Block Map) support for block device writing. It introduces a new BMAP XML parser, creates a BmapBlockWriter that intelligently maps decompressed data to device blocks, and integrates BMAP metadata through the CLI, options, and block writer abstraction layers.

Changes

Cohort / File(s) Summary
BMAP Parser
src/fls/bmap_parser.rs
New BMAP XML parser module with BmapRange and BmapFile types; implements parse() to extract image metadata and block ranges from XML, and is_range_mapped() to check byte-range overlap. Includes unit test for sample input.
Module Integration
src/fls/mod.rs
Adds module declaration for bmap_parser.
Block Writer Extensions
src/fls/block_writer.rs
Introduces BmapBlockWriter for BMAP-aware block device writing with range mapping and progress tracking; adds WriterEnum to unify BlockWriter and BmapBlockWriter; extends AsyncBlockWriter::new() to accept optional BMAP and select appropriate writer variant; updates block device detection and file opening logic.
BMAP File Loading
src/fls/from_url.rs
Integrates BMAP file loading and parsing; passes parsed BMAP to AsyncBlockWriter if options.bmap_file is provided.
Options & CLI
src/fls/options.rs, src/main.rs
Adds bmap_file: Option<String> field to BlockFlashOptions; adds bmap: Option<String> to FromUrl command; propagates BMAP path through argument parsing and options construction.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI/main.rs
    participant FromUrl as from_url.rs
    participant Options as BlockFlashOptions
    participant Writer as AsyncBlockWriter
    participant BmapWriter as BmapBlockWriter
    participant Parser as BmapParser
    participant Device as Block Device

    CLI->>FromUrl: bmap_file path (optional)
    FromUrl->>Parser: Load & parse BMAP XML
    Parser-->>FromUrl: BmapFile metadata
    FromUrl->>Options: Create options with bmap_file
    FromUrl->>Writer: new(bmap: Option<BmapFile>)
    
    alt BMAP provided
        Writer->>BmapWriter: Select BmapBlockWriter variant
        Note over BmapWriter: Maintains BMAP ranges<br/>& decompressed offset
    else No BMAP
        Writer->>Writer: Select BlockWriter variant
    end
    
    FromUrl->>Writer: write(decompressed_data)
    
    alt BMAP variant active
        Writer->>BmapWriter: write(data)
        BmapWriter->>BmapWriter: Iterate data ranges
        BmapWriter->>BmapWriter: Map ranges via BMAP
        BmapWriter->>Device: pwrite(mapped_offset, data)
        BmapWriter-->>Writer: track progress
    else BlockWriter variant active
        Writer->>Writer: write(data) sequential
        Writer->>Device: pwrite(sequential_offset)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • block_writer.rs: Review BMAP mapping logic in write(), block device detection, and WriterEnum delegation carefully for correctness and error propagation.
  • bmap_parser.rs: Verify XML parsing robustness, range overlap calculation accuracy, and edge case handling (empty ranges, malformed tags).
  • Integration points: Confirm BMAP data flows correctly from CLI → Options → AsyncBlockWriter, and that fallback to non-BMAP writer works as intended.
  • Test coverage: Check that test cases cover both BMAP and non-BMAP paths in block_writer; validate BMAP parser handles missing/invalid XML gracefully.

Poem

🐰 A rabbit hops through mapped blocks with glee,
BMAP paths sparkle, efficient and free,
No sparse waste on disks of our dreams,
Just mapped ranges flowing like streams! ✨🗺️

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'introduce bmap support' clearly and directly describes the main change: adding BMAP (Block Map) support throughout the codebase with new parser, writer, and CLI integration.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Tip

📝 Customizable high-level summaries are now available!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide custom instructions to shape the summary (bullet lists, tables, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example:

"Create a concise high-level summary as a bullet-point list. Then include a Markdown table showing lines added and removed by each contributing author."


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Add the --bmap argument to take in a bmap file.

For example:
./fls from-url -k --debug --bmap autosd.bmap https://download.autosd.sig.cento
s.org/AutoSD-9/nightly/TI/auto-osbuild-am69sk-autosd9-qa-regular-aarch64-2154005
649.73882b72.raw.xz /dev/mmcblk1

This can arguably improve write performance

Signed-off-by: Benny Zlotnik <[email protected]>
Assisted-by: claude-sonnet-4.5
Signed-off-by: Benny Zlotnik <[email protected]>
@bennyz bennyz requested a review from mangelajo November 12, 2025 16:57
@bennyz
Copy link
Member Author

bennyz commented Nov 12, 2025

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 12, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/fls/block_writer.rs (1)

110-231: Significant code duplication with BmapBlockWriter::new().

The BlockWriter::new() method shares nearly identical file-opening logic with BmapBlockWriter::new() (Lines 336-448). Consider extracting the common file-opening logic into a shared helper function.

Example refactor:

fn open_device_file(
    device: &str,
    o_direct: bool,
    debug: bool,
    is_block_dev: bool,
    log_prefix: &str,
) -> io::Result<(std::fs::File, bool)> {
    // Common file opening logic for both Linux and macOS
    // ...
}

Then use it in both constructors:

let (file, use_direct_io) = open_device_file(device, o_direct, debug, is_block_dev, "BlockWriter")?;

This would reduce ~120 lines of duplicated code and make maintenance easier.

🧹 Nitpick comments (8)
src/fls/options.rs (1)

13-13: Consider using PathBuf for consistency.

The bmap_file field uses Option<String>, while cacert uses Option<PathBuf> at Line 5. For consistency and to leverage path-handling utilities, consider using PathBuf for file paths.

Apply this diff:

-    pub bmap_file: Option<String>,
+    pub bmap_file: Option<PathBuf>,

And update the Default implementation:

-            bmap_file: None,
+            bmap_file: None,

Note: You'll also need to update src/main.rs to convert the CLI string to PathBuf, and src/fls/from_url.rs to handle PathBuf instead of String.

src/fls/from_url.rs (2)

33-42: Consider using async file I/O.

The BMAP file is read synchronously using std::fs::read_to_string, which blocks the async runtime. Since this occurs early in the function and BMAP files are typically small, the impact is minimal. However, for consistency with async best practices, consider using tokio::fs::read_to_string instead.

Apply this diff:

-        let bmap_content = std::fs::read_to_string(bmap_path)
+        let bmap_content = tokio::fs::read_to_string(bmap_path).await
             .map_err(|e| format!("Failed to read BMAP file {}: {}", bmap_path, e))?;

517-517: Remove unnecessary semicolon.

The semicolon after the closing brace is unnecessary and inconsistent with Rust style conventions.

Apply this diff:

-    };
+    }
src/fls/bmap_parser.rs (4)

23-55: Consider using a proper XML parser library.

The current implementation manually parses XML using string operations, which is fragile and doesn't handle edge cases like CDATA sections, entity references, or malformed XML. Consider using a lightweight XML parser like quick-xml or roxmltree for more robust parsing.

This would improve:

  • Reliability: Handles malformed XML gracefully
  • Security: Protects against XML injection and XXE attacks
  • Maintainability: Clearer intent, less custom parsing code

Example with quick-xml:

use quick_xml::Reader;
use quick_xml::events::Event;

pub fn parse(xml_content: &str) -> io::Result<Self> {
    let mut reader = Reader::from_str(xml_content);
    reader.trim_text(true);
    
    // Parse using event-based API
    // ...
}

73-113: Manual XML comment stripping is fragile.

The custom comment-stripping logic is complex and may not handle nested or malformed comments correctly. A proper XML parser would handle this automatically and more reliably.

This reinforces the recommendation to use a proper XML parsing library mentioned earlier.


153-197: Manual range parsing has potential issues.

The manual XML parsing for <Range> elements doesn't handle:

  • Malformed XML (unclosed tags, nested elements)
  • Attributes in different orders
  • Whitespace variations in attribute values

Consider using a proper XML parser as mentioned earlier.


219-242: Add more comprehensive test coverage.

The single test case only validates basic parsing. Consider adding tests for:

  • Empty BMAP files
  • Missing required fields
  • Invalid block ranges (end < start)
  • Malformed XML
  • Comment handling
  • Large files with many ranges

Example additional test:

#[test]
fn test_parse_invalid_bmap() {
    let xml = r#"<bmap><ImageSize>invalid</ImageSize></bmap>"#;
    assert!(BmapFile::parse(xml).is_err());
}

#[test]
fn test_parse_missing_field() {
    let xml = r#"<bmap><ImageSize>1000</ImageSize></bmap>"#;
    assert!(BmapFile::parse(xml).is_err());
}
src/fls/block_writer.rs (1)

451-539: Confusing loop structure - always processes entire buffer in one iteration.

The while loop at Lines 457-536 is misleading. Since chunk_size = data.len() - data_offset (Line 458) and data_offset += chunk_size (Line 524), the loop always exits after processing the entire data buffer once. Consider simplifying this to make the intent clearer.

The loop structure suggests incremental processing, but it actually processes all data in one pass. Consider either:

  1. Remove the loop if single-pass processing is intended:
pub(crate) fn write(&mut self, data: &[u8]) -> io::Result<()> {
    use std::os::unix::io::AsRawFd;
    
    let fd = self.file.as_raw_fd();
    let chunk_start = self.decompressed_offset;
    let chunk_end = chunk_start + data.len() as u64;
    
    // Find all mapped ranges that overlap with this chunk
    for range in &self.bmap.ranges {
        // ... rest of logic
    }
    
    self.decompressed_offset += data.len() as u64;
    // ... progress and sync logic
    
    Ok(())
}
  1. Or make it truly incremental if you want to process in smaller chunks:
const WRITE_CHUNK_SIZE: usize = 1024 * 1024; // 1MB chunks

while data_offset < data.len() {
    let chunk_size = (data.len() - data_offset).min(WRITE_CHUNK_SIZE);
    // ... process chunk
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3eb55e and e710be6.

📒 Files selected for processing (6)
  • src/fls/block_writer.rs (12 hunks)
  • src/fls/bmap_parser.rs (1 hunks)
  • src/fls/from_url.rs (4 hunks)
  • src/fls/mod.rs (1 hunks)
  • src/fls/options.rs (2 hunks)
  • src/main.rs (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build (aarch64-unknown-linux-musl)
  • GitHub Check: Build (aarch64-unknown-linux-gnu)
  • GitHub Check: Build (x86_64-apple-darwin)
🔇 Additional comments (11)
src/fls/mod.rs (1)

3-3: LGTM!

The internal module declaration correctly integrates the BMAP parser into the codebase without exposing it publicly.

src/main.rs (2)

47-49: LGTM!

The CLI argument for BMAP is well-documented and correctly integrated into the command structure.


81-81: LGTM!

The BMAP option is correctly propagated from CLI parsing to BlockFlashOptions construction.

Also applies to: 133-133

src/fls/bmap_parser.rs (1)

116-150: LGTM with good error handling!

The tag parsing includes helpful error messages and whitespace handling. The debug output at Lines 134-136 aids troubleshooting.

src/fls/block_writer.rs (7)

87-97: LGTM!

The BmapBlockWriter struct is well-designed with appropriate fields for tracking BMAP-based writes and progress.


249-249: LGTM!

Updated comment accurately describes sequential write semantics.


297-302: LGTM with improved error handling!

The updated error handling includes offset context in error messages, which aids debugging.


464-520: Verify overlap calculation handles all edge cases.

The overlap detection and offset calculation logic (Lines 464-520) is complex. Ensure it correctly handles:

  • Writes spanning multiple BMAP ranges
  • Partial block overlaps at range boundaries
  • Decompressed data not aligned to block boundaries

Add tests to verify the BMAP write logic:

#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_bmap_writer_overlap_logic() {
        // Create a mock BMAP with known ranges
        // Write test data that spans multiple ranges
        // Verify correct portions are written to correct offsets
    }
}

Consider also running integration tests with real BMAP files to ensure correctness.


484-507: LGTM with proper pwrite error handling!

The pwrite usage correctly handles:

  • Negative return values (errors)
  • Partial writes (though unexpected for block devices)
  • Error propagation with context

The unsafe block is appropriately scoped and necessary for the pwrite syscall.


556-560: LGTM!

The WriterEnum provides a clean abstraction for unifying BlockWriter and BmapBlockWriter handling.


596-640: LGTM with clean enum-based dispatch!

The writer selection and dispatch logic cleanly handles both writer types using the WriterEnum. The error handling and progress tracking are correctly maintained for both paths.

Signed-off-by: Benny Zlotnik <[email protected]>
Copy link
Member

@mangelajo mangelajo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, IMHO we can merge it and keep iterating.

I think we need to look into the UX of this:

  1. accept bmap from an URL too
  2. auto-attempt to get a bmap for an image (i.e. if you try to flash https://xxx/os-img.xz ... look for https://xxx/os-img.bmap / os-img.bmap.xz

or other options :)

We could also combine bmap + raw server side into an simg and support simg ,many options.

Comment on lines 631 to 638
WriterEnum::Bmap(mut w) => {
w.flush()?;
Ok(w.bytes_written())
}
WriterEnum::Normal(mut w) => {
w.flush()?;
Ok(w.bytes_written())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably make the common parts a trait later, so there is no need for matching as long as they both implement the trait

Comment on lines +464 to +481
for range in &self.bmap.ranges {
let range_start_byte = range.start_block * self.bmap.block_size;
let range_end_byte = (range.end_block + 1) * self.bmap.block_size;

// Check if this chunk overlaps with this range
if chunk_start < range_end_byte && chunk_end > range_start_byte {
// Calculate the overlapping portion
let overlap_start = chunk_start.max(range_start_byte);
let overlap_end = chunk_end.min(range_end_byte);

if overlap_start < overlap_end {
// Calculate offsets within the chunk
let chunk_overlap_start = (overlap_start - chunk_start) as usize;
let chunk_overlap_end = (overlap_end - chunk_start) as usize;
let overlap_data = &chunk[chunk_overlap_start..chunk_overlap_end];

// Calculate device offset for this range
let device_offset = overlap_start;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all this overlap testing could be separated into some function for unit testing.

// Check if this is a block device
let is_block_dev = is_block_device(device).unwrap_or(false);
#[cfg(target_os = "linux")]
let (file, use_direct_io) = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably refactor all the open block to be in a function, as it's the same from regular block write (or almost)

@bennyz
Copy link
Member Author

bennyz commented Nov 17, 2025

Looks good, IMHO we can merge it and keep iterating.

I think we need to look into the UX of this:

  1. accept bmap from an URL too
  2. auto-attempt to get a bmap for an image (i.e. if you try to flash xxx/os-img.xz ... look for xxx/os-img.bmap / os-img.bmap.xz

or other options :)

We could also combine bmap + raw server side into an simg and support simg ,many options.

ack, addressed inline comments, we can do a follow up for the CLI UX

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants