Doing a full erase as part of InternalFS->format #848

Open
@obra

Description

Operating System

MacOS

IDE version

arduino-cli 1.1.1

Board

Any

BSP version

github latest

Sketch

Not sketch related

What happened ?

I've been trying to chase down an InternalFS corruption bug that is 100% my fault. (I'm not looking for help with that.)

Because the corruption is inside the ctz structure for one of my files, mounting the FS works fine, but reading that file using the normal lfs methods triggers an assert.

I've been working on an 'fsck' implementation that my code can run at boot to verify superblocks and directory blocks and file allocations without tripping over the lfs assertions. The problem I run into is when I hit corruption in the directory structure or superblocks.

When I found bad corruption, the obvious fix seemed to be formatting InternalFS by calling ->format on it, but after the format I saw some weird data issues that I couldn't easily characterize.

As I started tracing code, I noticed how InternalFS dealt with corruption bad enough to stop filesystem mount:

  // failed to mount, erase all sector then format and mount again
  if ( !Adafruit_LittleFS::begin() )
  {
    // Erase all sectors of internal flash region for Filesystem.
    for ( uint32_t addr = LFS_FLASH_ADDR; addr < LFS_FLASH_ADDR + LFS_FLASH_TOTAL_SIZE; addr += FLASH_NRF52_PAGE_SIZE )
    {
      VERIFY( flash_nrf5x_erase(addr) );
    }

    // lfs format
    this->format();

    // mount again if still failed, give up
    if ( !Adafruit_LittleFS::begin() ) return false;
  }

A "regular" format looks like it erases only the superblocks and directory blocks, as it writes them, using InternalFS' _internal_flash_erase, which calls flash_nrf5x_write8(addr + i, 0xFF). It doesn't clear out the whole filesystem's storage the way the "so bad we couldn't mount" code in begin does:

    for ( uint32_t addr = LFS_FLASH_ADDR; addr < LFS_FLASH_ADDR + LFS_FLASH_TOTAL_SIZE; addr += FLASH_NRF52_PAGE_SIZE )
    {
      VERIFY( flash_nrf5x_erase(addr) );
    }
    this->format();

Would it make sense for InternalFS->format to always perform that "full" erase? If so, I'd be happy to put together a PR.

And if it's not necessary because format is actually doing the right thing and cleaning up all the blocks in another way that I'm not seeing, would it make sense for begin to drop that extra erase cycle?

If this is something that's "sometimes required, but not always", would you be open to a PR that exposes the 'deep' erase code in a way that a sketch can call directly? (Basically lifting it out of InternalFS->begin into its own method.)
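To make that last option concrete, here's a rough host-side sketch of what the lifted-out "deep erase" could look like. Everything in it is hypothetical: erase_region, FakeFlash, and demo_full_erase are names I made up for illustration, and the base address and page count are placeholders rather than the BSP's real values. The loop has the same shape as the one in begin; the fake flash is only there so the sketch can run off-target and confirm that no stale bytes survive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: erase every page in [base, base + size).
// erase_page is any callable that takes a page address and returns
// true on success.
template <typename EraseFn>
bool erase_region(uint32_t base, uint32_t size, uint32_t page_size, EraseFn erase_page)
{
  for (uint32_t addr = base; addr < base + size; addr += page_size)
  {
    if (!erase_page(addr)) return false;  // give up on the first failed page
  }
  return true;
}

// Host-side stand-in for the filesystem's flash region (not a BSP type).
// It starts out full of 0x00 to play the role of stale data.
struct FakeFlash
{
  uint32_t base;
  uint32_t page_size;
  std::vector<uint8_t> bytes;
  unsigned erase_calls;

  FakeFlash(uint32_t base_, uint32_t page_size_, uint32_t pages)
    : base(base_), page_size(page_size_),
      bytes(static_cast<size_t>(page_size_) * pages, 0x00),
      erase_calls(0) {}

  // Erase one whole page: every byte in it goes back to 0xFF.
  bool erase(uint32_t addr)
  {
    assert(addr >= base && (addr - base) % page_size == 0);  // page-aligned only
    size_t off = addr - base;
    if (off + page_size > bytes.size()) return false;
    for (size_t i = 0; i < page_size; ++i) bytes[off + i] = 0xFF;
    ++erase_calls;
    return true;
  }

  // Number of bytes that are not in the erased (0xFF) state.
  size_t count_non_blank() const
  {
    size_t n = 0;
    for (size_t i = 0; i < bytes.size(); ++i)
    {
      if (bytes[i] != 0xFF) ++n;
    }
    return n;
  }
};

// Erase a 7-page fake region, then check that every page was visited
// and that nothing stale is left behind.
bool demo_full_erase()
{
  FakeFlash flash(0x10000, 4096, 7);  // placeholder address and size
  bool ok = erase_region(flash.base, 7u * 4096u, flash.page_size,
                         [&flash](uint32_t addr) { return flash.erase(addr); });
  return ok && flash.erase_calls == 7 && flash.count_non_blank() == 0;
}
```

On a device, I'd expect the call to look something like erase_region(LFS_FLASH_ADDR, LFS_FLASH_TOTAL_SIZE, FLASH_NRF52_PAGE_SIZE, flash_nrf5x_erase), assuming flash_nrf5x_erase takes a page address and returns a truthy value on success, as its use inside VERIFY suggests. A count_non_blank-style scan after a regular format might also be a cheap way to see whether untouched blocks still hold old data.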

Thanks!

How to reproduce ?

(I don't have a repro for my weird corruption issues. They were happening during factory programming, and I haven't found a reliable way to cause them in the lab without power glitching.)

Debug Log

No response

Screenshots

No response
