
Conversation

h3lix1
Contributor

@h3lix1 h3lix1 commented Sep 23, 2025

WARNING: This was vibe coded using gpt5-codex-high. It seems to work, and on initial review it isn't bad. (Trust me, this is better than most of my other C++ work.)

I have tested this over the last day. I found some bugs with the counters, but those have been squashed.

Note, "online" counter updates after it receives time from NTP/GPS, and will initially show a higher number on boot.

NodeDB sorting currently uses a bubble sort. It normally completes within 3-4 ms but occasionally jumps to 11 ms. This seems acceptable, but I'm open to advice here.

Node Hot/Cold Split

ESP32‑S3 builds now keep the 196 B meshtastic_NodeInfoLite payload in PSRAM using a custom allocator that calls heap_caps_malloc(MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT) (src/mesh/NodeDB.h:20, src/mesh/NodeDB.cpp:73).
DRAM carries only the latency-critical fields in a NodeHotEntry cache (~20 B per node: num, last_heard, snr, channel/flags) alongside dirtiness bits for sync-on-demand (src/mesh/NodeDB.h:33, src/mesh/NodeDB.cpp:78).
Sorting, routing, favorite flips, online counts, and next-hop decisions run entirely out of that hot cache, so the usual packet/route/UI fast paths stay in internal RAM.
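To make the split concrete, here is a minimal sketch of the two pieces described above, assuming ESP-IDF's heap_caps_malloc; the names (PsramAllocator, NodeHotEntry, hotNodes, coldNodes) and the exact field layout are illustrative, not the definitions in src/mesh/NodeDB.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#include "esp_heap_caps.h" // ESP-IDF: heap_caps_malloc / heap_caps_free

// Allocator that places container storage in external PSRAM (8-bit capable SPIRAM).
// A production version would also handle allocation failure instead of returning nullptr.
template <typename T> struct PsramAllocator {
    using value_type = T;
    PsramAllocator() = default;
    template <typename U> PsramAllocator(const PsramAllocator<U> &) {}

    T *allocate(std::size_t n)
    {
        return static_cast<T *>(heap_caps_malloc(n * sizeof(T), MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT));
    }
    void deallocate(T *p, std::size_t) { heap_caps_free(p); }
};
template <typename T, typename U> bool operator==(const PsramAllocator<T> &, const PsramAllocator<U> &) { return true; }
template <typename T, typename U> bool operator!=(const PsramAllocator<T> &, const PsramAllocator<U> &) { return false; }

// DRAM-resident "hot" entry: only the latency-critical fields plus a dirty bit,
// so the full PSRAM record can be synced on demand.
struct NodeHotEntry {
    uint32_t num;        // node number
    uint32_t last_heard; // epoch seconds
    float snr;
    uint8_t channel;
    uint8_t flags;       // favorite / dirty / etc.
};

// Cold payloads (the full ~196 B meshtastic_NodeInfoLite) would live in PSRAM,
// while the hot cache stays in internal RAM, e.g.:
//   std::vector<meshtastic_NodeInfoLite, PsramAllocator<meshtastic_NodeInfoLite>> coldNodes;
//   std::vector<NodeHotEntry> hotNodes;
```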

Memory Footprint per Node (bytes)

                 DRAM   PSRAM   Total
Existing build    196       0     196
New split          20     196     216
Net change       -176    +196     +20

Capacity & Secondary Effects:

  • MAX_NUM_NODES is capped at 5000 nodes as long as PSRAM size is > 2 MB; otherwise the old flash-based limits apply (src/mesh/mesh-pb-constants.h:54). See the sketch after this list.

  • The packet history ring still targets max(MAX_NUM_NODES*2, …) entries (src/mesh/PacketHistory.cpp:11), so doubling the node ceiling means the history structure grows accordingly—keep an eye on overall PSRAM consumption if future caps rise again.
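A rough sketch of that gating: in the firmware MAX_NUM_NODES is resolved at compile time in src/mesh/mesh-pb-constants.h, so the runtime helper and the kPsramNodeCap/kFlashNodeCap constants below are purely illustrative.

```cpp
#include <cstddef>
#include "esp_heap_caps.h" // ESP-IDF: heap_caps_get_total_size

constexpr std::size_t kPsramNodeCap = 5000; // cap when a large PSRAM part is present
constexpr std::size_t kFlashNodeCap = 100;  // stand-in for the old flash-based limit

// Pick the node ceiling: only use the large cap when more than ~2 MB of PSRAM is mapped.
// heap_caps_get_total_size(MALLOC_CAP_SPIRAM) stands in for ESP.getPsramSize() here.
std::size_t maxNumNodes()
{
    std::size_t psramBytes = heap_caps_get_total_size(MALLOC_CAP_SPIRAM);
    return (psramBytes > 2 * 1024 * 1024) ? kPsramNodeCap : kFlashNodeCap;
}
```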

Serialization & Cold Access:

  • NodeDB save/load moves through PSRAM: hot nodes are copied into a temporary vector before protobuf encoding, then cleared back out after disk writes (src/mesh/NodeDB.cpp:1322, src/mesh/NodeDB.cpp:1414); see the sketch after this list.
  • GUI detail panes, phone syncs, and other “profile” views touch PSRAM when they dereference cold fields. These flows happen far less frequently, so the added latency is acceptable.
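Continuing the hypothetical names from the earlier sketch (hotNodes, coldNodes, NodeHotEntry), the save path described above might look roughly like this; the real flow around src/mesh/NodeDB.cpp:1322 will differ in detail, and the DIRTY bit and the meshtastic_NodeInfoLite field names used here are assumptions.

```cpp
// Assumed flow only (continuing the earlier sketch). meshtastic_NodeInfoLite
// comes from the generated protobuf headers, whose include is omitted here.
constexpr uint8_t DIRTY = 0x01; // hypothetical dirty bit in NodeHotEntry::flags

std::vector<meshtastic_NodeInfoLite, PsramAllocator<meshtastic_NodeInfoLite>> coldNodes; // PSRAM
std::vector<NodeHotEntry> hotNodes;                                                      // DRAM

void saveNodeDbSketch()
{
    // 1) Merge any dirty hot fields back into the PSRAM cold records.
    for (std::size_t i = 0; i < hotNodes.size(); ++i) {
        if (hotNodes[i].flags & DIRTY) {
            coldNodes[i].last_heard = hotNodes[i].last_heard;
            coldNodes[i].snr = hotNodes[i].snr;
            hotNodes[i].flags &= static_cast<uint8_t>(~DIRTY);
        }
    }

    // 2) Stage a temporary DRAM copy for protobuf encoding.
    std::vector<meshtastic_NodeInfoLite> scratch(coldNodes.begin(), coldNodes.end());

    // 3) ... pb_encode(...) the scratch entries and write them to flash ...

    // 4) Release the temporary DRAM copy right after the write.
    scratch.clear();
    scratch.shrink_to_fit();
}
```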

Runtime Behavior:

  • Fast paths (unchanged timing): neighbor sorting, routing decisions, getMeshNodeChannel, set_favorite, online counts, packet next-hop updates (src/mesh/NodeDB.cpp:1750, src/mesh/NodeDB.cpp:1939, src/mesh/NodeDB.cpp:2156).
  • Potentially slower paths: full NodeInfo dumps to the phone, detail panels that copy cold payloads, database saves—each now copies between DRAM and PSRAM but only on demand.

Large Mesh Readiness:

  • Telemetry and UI counters are widened to 16 bit, keeping online/total counts accurate past 255 nodes (src/NodeStatus.h:16, src/mesh/ProtobufModule.h:16).
  • InkHUD map passes now iterate with size_t, so they handle the full PSRAM-backed node list without truncation (src/graphics/niche/InkHUD/Applets/Bases/Map/MapApplet.cpp:156).
  • The result is a Station G2-class ESP32‑S3 node that can track ~5000 peers, with at most ~100 KB of hot metadata in DRAM (5000 × 20 B) and ~1 MB of cold payload in PSRAM (5000 × 196 B ≈ 980 KB).

Also recently added:

PSRAM-aware message/node expansion

  • Gate the “moar messages / moar nodes” knobs behind has_psram() so we only scale up when ≥2 MB of PSRAM is available (see the sketch after this list).
  • Move the MeshPacket pool into a PSRAM-backed allocator on ESP32-S3; if allocation fails we fall back to heap so radios keep working.
  • Bump the BLE message queue to 200 entries (356 B each → ~84.7 KB in PSRAM plus ~0.8 KB of DRAM for the pointer ring) while keeping a runtime limit that collapses back to 32 messages if PSRAM isn’t present.
  • Reuse the same helper for NodeDB sizing, so nodes stay capped on low-memory boards without more ESP.getPsramSize() calls.
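A minimal sketch of the gating and fallback behavior described in this list, assuming ESP-IDF heap APIs; has_psram(), allocPacketPool(), and toPhoneQueueDepth() are illustrative names rather than the exact functions in the PR.

```cpp
#include <cstddef>
#include <cstdlib>
#include "esp_heap_caps.h" // ESP-IDF heap APIs

// Only scale up when at least ~2 MB of PSRAM is actually mapped.
static bool has_psram()
{
    return heap_caps_get_total_size(MALLOC_CAP_SPIRAM) >= 2 * 1024 * 1024;
}

// Prefer PSRAM for the MeshPacket pool; fall back to the internal heap so the
// radio keeps working even if the PSRAM allocation fails.
static void *allocPacketPool(std::size_t bytes)
{
    void *p = heap_caps_malloc(bytes, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
    return p ? p : std::malloc(bytes);
}

// Queue 200 to-phone messages when PSRAM is present, otherwise collapse back to 32.
static std::size_t toPhoneQueueDepth()
{
    return has_psram() ? 200 : 32;
}
```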

🤝 Attestations

  • I have tested that my proposed changes behave as described.
  • I have tested that my proposed changes do not cause any obvious regressions on the following devices:
    • Heltec (Lora32) V4
    • LilyGo T-Deck
    • LilyGo T-Beam
    • RAK WisBlock 4631
    • Seeed Studio T-1000E tracker card (Does not add/subtract functionality)
    • Other (please specify below)
      Station G2
      LILYGO Pager

@thebentern thebentern requested a review from Copilot September 23, 2025 20:34
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces a hot/cold memory split architecture for ESP32-S3 devices to support tracking up to 800 nodes by moving NodeInfoLite payloads to PSRAM while keeping critical routing data in DRAM.

  • Implements custom PSRAM allocator for ESP32-S3 that stores full NodeInfoLite objects (196B each) in external memory
  • Creates NodeHotEntry cache in DRAM containing only essential fields (20B per node) for fast access during routing operations
  • Widens counter types from uint8_t to uint16_t to handle node counts beyond 255

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Summary per file:
  • src/modules/AdminModule.cpp: Updates favorite node operations to use the new NodeDB API instead of direct field access
  • src/mesh/mesh-pb-constants.h: Changes the MAX_NUM_NODES calculation to prioritize PSRAM size over flash size for ESP32-S3
  • src/mesh/ProtobufModule.h: Widens the numOnlineNodes counter from uint8_t to uint16_t
  • src/mesh/NodeDB.h: Adds the PSRAM allocator, NodeHotEntry structure, and hot/cold cache management methods
  • src/mesh/NodeDB.cpp: Implements the complete hot/cold split logic with cache synchronization and PSRAM-aware operations
  • src/graphics/niche/InkHUD/Applets/Bases/Map/MapApplet.cpp: Changes loop variables from uint8_t to size_t for handling larger node counts
  • src/NodeStatus.h: Widens all node counter types from uint8_t to uint16_t


@NomDeTom
Contributor

NomDeTom commented Sep 24, 2025

Is this extensible to the extra qspi flash on the xiao NRF52?

https://wiki.seeedstudio.com/xiao-ble-qspi-flash-usage/

@h3lix1
Contributor Author

h3lix1 commented Sep 25, 2025

@NomDeTom I'm not sure we want to slow the nrf52 down any more than it is already.

@NomDeTom
Contributor

> I'm not sure we want to slow the nrf52 down any more than it is already.

I was just thinking that things like nodeDB rolling attacks could be resisted more easily by increasing the size. I'm not sure this would slow it down particularly.

@h3lix1 h3lix1 marked this pull request as draft September 26, 2025 02:00
@h3lix1
Contributor Author

h3lix1 commented Sep 27, 2025

I currently lack the skill required to make this work for nrf52. Placing in draft for now until someone more talented than I am can make this work.

@h3lix1 h3lix1 changed the title from "Add moar nodedb nodes on esp32 s3 to 800" to "Add moar nodedb nodes on esp32 w/ psram to 800, stored messages to 200" Sep 28, 2025
@h3lix1 h3lix1 closed this Sep 28, 2025
@h3lix1 h3lix1 deleted the moar_nodes_esp32_s3 branch September 28, 2025 05:55
@h3lix1 h3lix1 restored the moar_nodes_esp32_s3 branch September 28, 2025 06:00
@h3lix1 h3lix1 reopened this Sep 28, 2025
@h3lix1 h3lix1 marked this pull request as ready for review September 28, 2025 06:08
@h3lix1
Contributor Author

h3lix1 commented Sep 28, 2025

I currently have a $50 bounty out for anybody better than me who can do this for nrf52 nodes with flash. In the meantime, can we get this in at least for the ESP32s out there?

@garthvh
Member

garthvh commented Sep 28, 2025

Can't rush this in; 200 nodes is already problematic, and entirely vibe-coded solutions are generally buggy. Get some people testing builds for this.

@h3lix1
Contributor Author

h3lix1 commented Oct 3, 2025

More testing is complete on this MR, along with changes since the first revision:

  • Stores 100 messages (up from 32) for BOARD_MAX_RX_TOPHONE, held in PSRAM
  • Stores 3000 nodes in NodeDB. This might be problematic for Bluetooth LE nodes
  • WiFi is very fast and is able to load 700 nodes in less than a second

Compared to the development branch, even with 3000 nodes, this saves about 16% of heap memory. All memory is allocated ahead of time.

Phase    Field        Non-PSRAM (bytes / MB)    PSRAM (bytes / MB)        Difference (bytes / MB)    % diff (PSRAM vs Non)
Running  Free heap    176,700 / 0.17 MB         204,648 / 0.20 MB         +27,948 / +0.03 MB         +15.82%
Running  Free PSRAM   2,035,403 / 1.94 MB       1,278,319 / 1.22 MB       −757,084 / −0.72 MB        −37.20%
  • Additional log message during bootup
    INFO | ??:??:?? 2 NodeDB PSRAM backing at 0x3de00800 (DRAM) capacity 3000 entries (~588000 bytes)

Testing on a production router has proven successful with no reboots and 661 nodes currently.

So far this change has been tested successfully on the following platforms:
[x] Heltec (Lora32) V4
[x] Seeed Studio T-1000E tracker card (Does not add/subtract functionality)
[x] Station G2
[x] T-Lora Pager
[x] T-Beam-S3Core
[] T-Deck (awaiting delivery)

With 2 MB of PSRAM this uses about 37%; with the 8 MB found on most ESP32-S3 nodes, the share is far smaller.

Next is to move PacketRecord to PSRAM for a saving of about 120 KB with MAX_NUM_NODES == 3000 (making the ring 6000 entries). For now it fits in DRAM.

Moving back to draft for now, but this is looking very good.

@h3lix1 h3lix1 marked this pull request as draft October 3, 2025 07:52
@garthvh
Member

garthvh commented Oct 3, 2025

200 nodes is slow over WiFi

@h3lix1
Contributor Author

h3lix1 commented Oct 5, 2025

@garthvh For me it's very fast with the Lora V4 and Xiao Wio. Bluetooth is a different beast and can take a few minutes to fetch 400 nodes. A lazy population might be better for Bluetooth nodes if the client side can support that (multiple queues, possibly?). We could also limit this to WiFi-enabled nodes, or only send the 100 most recently heard nodes to the client when on Bluetooth.

@garthvh
Member

garthvh commented Oct 5, 2025

Needs to be compatible with the 90% of people using Bluetooth; TCP is also pretty slow. 800-3000 nodes seems really optimistic in real-world use.

@h3lix1
Contributor Author

h3lix1 commented Oct 5, 2025

This MR solves the problem of not having a large enough NodeDB. I don't see 3000 nodes as being too much of a problem, since the memory is all pre-allocated and leaves enough for everything else, but doing the large dump of the DB when connecting over Bluetooth is a problem. I'm not sure why you're finding WiFi to be slow, as I can download the DB very quickly, but maybe my WiFi is special.

In my previous message I was trying to offer solutions to the large-DB problem. We can have the node download the 100 most recently heard, then run a fair-share queue between node info updates and other incoming updates. Other options include comparing blocks of node IDs and sharing only the ones that are missing. Any other thoughts?

The bay mesh currently cycles through 358 nodes every 3 hours, 500 every day, and has seen up to 716 total over the last 8 days. I'm guessing this will be toward 900 or 1000 by the end of this year and 2k by the end of next year. Add in some events, and 3k doesn't seem unreasonable as a goal to expand to.

I like this change, and I think it is the absolute best way to increase node counts for nodes with PSRAM while decreasing heap utilization. The problem is that the communication between phone and mesh device also needs a refresh to support large data dumps.

I see this as the beginning of downloading much larger objects over time without needing a phone always attached. This, plus reliable message delivery, would make it possible to transfer images or other binary data without impacting real-time communications. A large NodeDB just happens to be the first use case that requires some kind of fair-share mechanism.

@NomDeTom
Contributor

NomDeTom commented Oct 5, 2025

@garthvh I thought there were recent optimisations to the app code, to bring the nodeDB over after initial handshake? If this is a way to slow the nodeDB rolling in a big mesh, it seems useful, if not advisable.
