Skip to content

Conversation

lidel
Copy link
Member

@lidel lidel commented Sep 19, 2025

lidel and others added 30 commits August 21, 2025 19:51
* fix: harness tests random panic

Connecting nodes in parallel can cause TLS handshake failures. For each node, connect to the other nodes serially. It is not necessary to connect in parallel as it does not save any significant time.

Closes #10932
* Tests: disable telemetry in tests by default

Disable the plugin in cli tests and sharness by default. Enable only in
telemetry tests.

There are cases when tests get stuck or get killed and leave daemons hanging around. We don't want to be getting telemetry from those.

* sharness: attempt to fix

* sharness: add missing --bool flag

* fix(ci): add omitempty to Plugin.Config field

The sharness problem is that when the telemetry plugin is configured
initially with 'ipfs config --bool', it creates a structure without
the 'Config: null' field, but when the config is copied and replaced,
it expects the structure to be preserved.

Adding omitempty ensures the Config field is omitted from JSON when
nil, making the config structure consistent between initial creation
and replacement operations.

---------

Co-authored-by: Marcin Rataj <[email protected]>
* feat(ci): docker linting

adds hadolint to validate dockerfile best practices
configures project-specific rules in .hadolint.yaml

* fix(ci): enable hadolint console output

adds verbose and tty format to see linting results in CI logs

* test: trigger hadolint warning

remove --no-install-recommends to test CI output

* fix(ci): fail hadolint on warnings

stricter linting to catch all best practice violations

* fix: add --no-install-recommends to apt-get

reduces image size by avoiding unnecessary packages

* refactor: use WORKDIR instead of cd in dockerfile

replaces cd commands with WORKDIR for cleaner dockerfile
removes unnecessary hadolint ignore rules DL3003 and DL3009

* chore: simplify hadolint config

removes unnecessary override rules for cleaner config
keep -dev version from master and add v0.38 to changelog
* chore: simplify and update

* docs: streamline release checklist

- reduce from 128 to 113 lines while preserving all information
- add clarity on command execution context (which branch/directory)
- improve dangerous operation warnings (point of no return)
- consolidate prerequisites into checklist format
- clarify cherry-pick -x rationale (traceability and deduplication)
- explain why release branches are kept (patch releases)
- add fetch step to prevent stale branch issues
- simplify social media into single optional item
- add STOP checkpoint after tag push for docker verification

* docs: update release checklist last updated reference

* docs: remove GOPATH requirement from release checklist

mkreleaselog now works from any directory where kubo is checked out

* docs: clarify merge process in release checklist

add step to merge master into merge-release branch first
and specify "Create a merge commit" option explicitly

* docs: address PR #10870 feedback

- update forum example to v0.37.0
- add social media post examples from v0.36.0
- add example for creating next release issue

* docs: update blog example to kubo v0.36.0 commit
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 5.4.3 to 5.5.0.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@18283e0...fdcc847)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-version: 5.5.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Gillis <[email protected]>
* fix ctrl-c prompt during run migrations prompt

At "Run migrations" prompt, hitting ctrl-c does not show the "(Hit ctrl-c again to force-shutdown the daemon.)" prompt, even though a 2nd ctrl-c does cause the daemon to exit.

The problem is that the goroutine that displays the prompt only run after the prompt to "Run migrations", so hitting ctrl-c during the "Run migrations" prompt does not show the "(Hit ctrl-c again to force-shutdown the daemon.)" prompt.

This PR fixes the problem by starting the goroutine to display the "(Hit ctrl-c again to force-shutdown the daemon.)" prompt sooner, before the "Run mirgarions" prompt. Additionally, this PR also disables showing the ctrl-c prompt if the daemon function has already exited, in which case only the exit message should be shown.

Closes #3157

* exit immediately if ctrl-c during migration prompt
* feat(gateway): add DiagnosticServiceURL config

- add Gateway.DiagnosticServiceURL to kubo config
- pass diagnostic service URL to boxo gateway
- document new config option in docs/config.md
- default to https://check.ipfs.network

* docs(changelog): add gateway error UX improvements to v0.38

- document improved 504 error pages with retrieval diagnostics
- highlight new Gateway.DiagnosticServiceURL config option
- include screenshot showing enhanced error page UX
* telemetry: use systemd-detect-virt for container/vm detection

Current VM detection is not very accurate and systemd-detect-virt does exactly
what's needed under a miriad of virtualization platforms.

The downside is that we are running a system command which is uglier and might
perhaps flip anti-viruses or something.

* telemetry: improve vm/container detection with pure go

replace systemd-detect-virt with file-based detection to avoid:
- security risks from executing external binaries
- unnecessary repeated detection (now cached with sync.Once)
- missing detection on non-systemd systems

removes false positives:
- cpu hypervisor flag (indicates capability, not guest status)
- generic dmi strings that match physical hardware
- overlay filesystem check (used by immutable distros)

Co-authored-by: Andrew Gillis <[email protected]>
Co-authored-by: Marcin Rataj <[email protected]>
validates Import configuration fields to prevent invalid values:
- CidVersion: must be 0 or 1
- UnixFSFileMaxLinks: must be positive
- UnixFSDirectoryMaxLinks: must be non-negative
- UnixFSHAMTDirectoryMaxFanout: power of 2, multiple of 8, ≤ 1024
- BatchMaxNodes/BatchMaxSize: must be positive
- UnixFSChunker: validates format patterns
- HashFunction: must be allowed by verifcid
* fix: enforce identity CID size limits

- validate --inline-limit against verifcid.MaxDigestSize
- add error when --hash=identity exceeds size limit
- add tests for identity CID overflow scenarios
- update help text to show maximum inline limit

This prevents creation of unbounded identity CIDs by enforcing
the 128-byte limit defined in ipfs/boxo#1018

Fixes #6011
IPIP: ipfs/specs#512
* rpc: don't reuse object during decode

Retaining the object during the loop will make fields such as `Name`
stick between iterations.
This patch decodes into a new struct each iteration, assuring we don't
retain values from other pins.

* rpc: use the Detailed option during request
* reprovide sweep draft

* update reprovider dep

* go mod tidy

* fix provider type

* change router type

* dual reprovider

* revert to provider.System

* back to start

* SweepingReprovider test

* fix nil pointer deref

* noop provider for nil dht

* disabled initial network estimation

* another iteration

* suppress missing self addrs err

* silence empty rt err on lan dht

* comments

* new attempt at integrating

* reverting changes in core/node/libp2p/routing.go

* removing SweepingProvider

* make reprovider optional

* add noop reprovider

* update KeyChanFunc type alias

* restore boxo KeyChanFunc

* fix missing KeyChanFunc

* test(sharness): PARALLEL=1 and timeout 30m

running sequentially to see where timeout occurs

* initialize MHStore

* revert workflow debug

* config

* config docs

* merged IpfsNode provider and reprovider

* move Provider interface to from kad-dht to node

* moved Provider interface from kad-dht to kubo/core/node

* mod_tidy

* Add Clear to Provider interface

* use latest kad-dht commit

* make linter happy

* updated boxo provide interface

* boxo PR fix

* using latest kad-dht commit

* use latest boxo release

* fix fx

* fx cyclic deps

* fix merge issues

* extended tests

* don't provide LAN DHT

* docs

* restore dual dht provider

* don't start provider before it is online

* address linter

* dual/provider fix

* add delay in provider tests for dht bootstrap

* add OfflineDelay parameter to config

* remove increase number of workers in test

* improved keystore gc process

* fix: replace incorrect logger import in coreapi

replaced github.com/labstack/gommon/log with the standard
github.com/ipfs/go-log/v2 logger used throughout kubo.
removed unused labstack dependency from go.mod files.

* fix: remove duplicate WithDefault call in provider config

* fix: use correct option method for burst workers

* fix: improve error messages for experimental sweeping provider

updated error messages to clearly indicate when commands are unavailable
due to experimental sweeping provider being enabled via Reprovider.Sweep.Enabled=true

* docs: remove obsolete KeyStoreGCInterval config

removed from config.md as option no longer exists (removed in b540fba)
updated keystore description to reflect gc happens at reprovide interval

* docs: add TODO placeholder changelog for experimental sweeping DHT provider

using v0.38-TODO.md name to avoid merge conflicts with master branch
and allow CI tests to run. will be renamed to v0.38.md once config
migration is added to the PR

* fix: provideKeysRec go routine

* clear keystore on close

* fix: datastore prefix

* fix: improve error handling in provideKeysRec

- close errCh channel to distinguish between nil and pending errors
- check for pending errors when provided.New closes
- handle context cancellation during error send
- prevent race condition where errors could be silently lost

this ensures DAG walk errors are always propagated correctly

* address gammazero's review

* rename BurstProvider to LegacyProvider

* use latest provider/keystore

* boxo: make mfs StartProviding async

* bump boxo

* chore: update boxo to f2b4e12fb9a8ac138ccb82aae3b51ec51d9f631c

- updated boxo dependency to specified commit
- updated go.mod and go.sum files across all modules

* use latest kad-dht/boxo

* Buffered SweepingProvider wrapper

* use latest kad-dht commit

* allow no DHT router

* use latest kad-dht & boxo

---------

Co-authored-by: Marcin Rataj <[email protected]>
Co-authored-by: gammazero <[email protected]>
…o Provide.DHT (#10951)

* refactor: consolidate Provider/Reprovider into unified Provide config

- merge Provider and Reprovider configs into single Provide section
- add fs-repo-17-to-18 migration for config consolidation
- improve migration ergonomics with common package utilities
- convert deprecated "flat" strategy to "all" during migration
- improve Provide docs

* docs: add total_provide_count metric guidance

- document how to monitor provide success rates via prometheus metrics
- add performance comparison section to changelog
- explain how to evaluate sweep vs legacy provider effectiveness

* fix: add OpenTelemetry meter provider for metrics

- set up meter provider with Prometheus exporter in daemon
- enables metrics from external libs like go-libp2p-kad-dht
- fixes missing total_provide_count_total when SweepEnabled=true
- update docs to reflect actual metric names

---------

Co-authored-by: gammazero <[email protected]>
Co-authored-by: guillaumemichel <[email protected]>
Co-authored-by: Daniel Norman <[email protected]>
Co-authored-by: Hector Sanjuan <[email protected]>
Bumps [github.com/go-viper/mapstructure/v2](https://github.com/go-viper/mapstructure) from 2.2.1 to 2.4.0.
- [Release notes](https://github.com/go-viper/mapstructure/releases)
- [Changelog](https://github.com/go-viper/mapstructure/blob/main/CHANGELOG.md)
- [Commits](go-viper/mapstructure@v2.2.1...v2.4.0)

---
updated-dependencies:
- dependency-name: github.com/go-viper/mapstructure/v2
  dependency-version: 2.4.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix: use CheckIfPinnedWithType for pin ls with names

updates to use CheckIfPinnedWithType method from ipfs/boxo#1035,
enabling efficient pin name retrieval for 'ipfs pin ls <cid> --names'

- uses new CheckIfPinnedWithType from boxo for type-specific pin checks
- pin names are now returned when listing specific CIDs with --names flag

* test: add CLI tests for pin ls with names

tests cover:
- pin ls with specific CIDs returning names
- pin ls without CID listing all pins with names
- pin ls with --type and --names combinations
- JSON output with and without names
- pin update preserving names
- error cases (invalid CID, unpinned CID)

* docs: add pin name improvements to v0.38 changelog

covers fix for ipfs pin ls --names with specific CIDs
and RPC pin name leak fix

* fix(rpc): support pin names in Add()

passes the Name field from PinAddSettings to the API request

adds test to verify pin names work via RPC

* test: add coverage for pin names functionality

- test special characters, unicode, long names
- test concurrent operations
- test persistence across daemon restarts
- test garbage collection preservation
- fix indirect pin test logic

* chore: boxo@main with boxo#1039

* fix(pin): improve pin ls robustness and validation

- add nil check for n.Pinning with early fail-fast validation
- use pin.StringToMode() for consistent type validation
- add edge case tests for invalid types and unpinned CIDs
* fix: prevent --flush=false in 'ipfs files rm' command

the 'ipfs files rm' command always flushes for safety to ensure
data integrity. this change adds an explicit error when users
try to pass --flush=false, improving ux and preventing confusion.

related to #10842

* fix: add MFS cache size limit to prevent unbounded growth

- add Internal.MFSAutoflushThreshold config (experimental)
- directories auto-flush when cache exceeds threshold with --flush=false
- prevents high memory usage issue from #10842
- default: 256 entries per directory (matching HAMT shard size)
- set to 0 to restore old behavior (risky, may cause errors)

Closes #10842
* fix(webui): show helpful errors for incompatible configurations

- show error when Gateway.NoFetch=true and WebUI is not available locally
- show error when Gateway.DeserializedResponses=false (incompatible)
- add tests for both error scenarios

* chore(webui): update to v4.9.0

https://github.com/ipfs/ipfs-webui/releases/tag/v4.9.0

* docs: add WebUI v4.9.0 update to v0.38 changelog

- highlight new diagnostics screen for troubleshooting
- include screenshots of key features in table format
- add local access URL for WebUI
- update TOC with new sections
preserve private use characters as specified
in libp2p/specs#491
enforce 128 rune limit on untrusted peer data
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4 to 5.
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](actions/setup-node@v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-node
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5 to 6.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](actions/setup-go@v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [hadolint/hadolint-action](https://github.com/hadolint/hadolint-action) from 3.1.0 to 3.2.0.
- [Release notes](https://github.com/hadolint/hadolint-action/releases)
- [Changelog](https://github.com/hadolint/hadolint-action/blob/master/.releaserc)
- [Commits](hadolint/hadolint-action@v3.1.0...v3.2.0)

---
updated-dependencies:
- dependency-name: hadolint/hadolint-action
  dependency-version: 3.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 5.5.0 to 5.5.1.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@fdcc847...5a10915)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-version: 5.5.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* ci: optimize build workflows

- use go version from go.mod instead of hardcoding
- group platforms by OS for parallel builds
- remove legacy try-build targets

* fix: checkout before setup-go in all workflows

setup-go needs go.mod to be present, so checkout must happen first

* chore: remove deprecated // +build syntax

go 1.17+ uses //go:build, the old syntax is no longer needed

* simplify: remove nofuse tag from CI workflows

- workflows now rely on platform build constraints
- keep make nofuse target for manual builds
- remove unused appveyor.yml

* ci: remove legacy travis variable and fix gateway-conformance

- remove TRAVIS env variable from 4 workflows
- fix gateway-conformance checkout path to match working-directory
- replace deprecated cache-go-action with built-in setup-go caching
* docs: improve slow reprovide warning messages

simplify warning text and provide actionable solutions in order of preference

* feat(config): add validation for Provide.DHT settings

- validate interval doesn't exceed DHT record validity (48h)
- validate worker counts and other parameters are within valid ranges
- improve slow reprovide warning messages to reference config parameter
- add tests for all validation cases

* docs: add reprovide cycle visualization

shows traffic patterns of legacy vs sweep vs accelerated DHT
@lidel lidel changed the base branch from master to release September 19, 2025 18:25
@lidel lidel added the skip/changelog This change does NOT require a changelog entry label Sep 19, 2025
@lidel lidel marked this pull request as ready for review September 19, 2025 18:29
@lidel lidel requested a review from a team as a code owner September 19, 2025 18:29
@lidel
Copy link
Member Author

lidel commented Sep 19, 2025

CI seems to have a hiccup, I'll check on it later. Once it is green will tag.

@lidel lidel changed the title chore: release v0.38.0-rc1 chore: release v0.38.0 Sep 19, 2025
@lidel lidel marked this pull request as draft September 19, 2025 22:32
dependabot bot and others added 8 commits September 27, 2025 03:21
Bumps [hadolint/hadolint-action](https://github.com/hadolint/hadolint-action) from 3.2.0 to 3.3.0.
- [Release notes](https://github.com/hadolint/hadolint-action/releases)
- [Changelog](https://github.com/hadolint/hadolint-action/blob/master/.releaserc)
- [Commits](hadolint/hadolint-action@v3.2.0...v3.3.0)

---
updated-dependencies:
- dependency-name: hadolint/hadolint-action
  dependency-version: 3.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 46d438f)
* fix: SweepingProvider slow start #10979

* don't purge keystore

* feat: add INFO logging for provider keystore sync

log start/completion of async keystore sync with strategy

---------

Co-authored-by: Marcin Rataj <[email protected]>
(cherry picked from commit 1e9b6fb)
adds validation to ensure pin names don't exceed 255 bytes across all
commands that accept pin names. this prevents issues with filesystem
limitations and improves compatibility.

affected commands:
- ipfs pin add --name
- ipfs add --pin-name
- ipfs pin ls --name (filter)
- ipfs pin remote add --name
- ipfs pin remote ls --name (filter)
- ipfs pin remote rm --name (filter)

(cherry picked from commit 1107ac4)
* Filestore: provide Filestore nodes

When strategy is set to "all" (the blockstore does all the providing when a
block is written), no providing was happening to Filestore blocks that were
not written to the underlying blockstore (so, the DAG leaves, as they live in
the filesystem directly). This fixes that.

* docs: clarify filestore and urlstore fix in changelog

both filestore (local file references) and urlstore (HTTP/HTTPS URL
references) blocks are now properly provided shortly after initial add

(cherry picked from commit f63887a)
* fix: add MFS operation limit for --flush=false

adds a global counter that tracks consecutive MFS operations performed
with --flush=false and fails with clear error after limit is reached.

this prevents unbounded memory growth while avoiding the data corruption
risks of auto-flushing.

- adds Internal.MFSNoFlushLimit config
- operations fail with actionable error at limit
- counter resets on successful flush or any --flush=true operation
- operations with --flush=true reset and don't count

this commit removes automatic flush from #10971
and instead errors to encourage users of --flush=false to develop a habit
of calling 'ipfs files flush' periodically.

boxo will no longer auto-flush (ipfs/boxo#1041) to
avoid corruption issues, and kubo applies the limit to 'ipfs files' commands
instead.

closes #10842

* test: add tests for MFSNoFlushLimit

tests verify the new Internal.MFSNoFlushLimit config option:
- default limit of 256 operations
- custom limit configuration
- counter reset on flush=true
- counter reset on explicit flush command
- limit=0 disables the feature
- multiple MFS command types count towards limit

* docs: explain why MFS operations fail instead of auto-flushing

addresses feedback from #10985 (review)

- clarify that automatic flushing at limit was considered but rejected
- explain the data corruption risks of auto-flushing
- guide users who want auto-flush to use --flush=true (default)
- document benefits of explicit failure for batch operations

(cherry picked from commit a688b7e)
Co-authored-by: Marcin Rataj <[email protected]>
(cherry picked from commit 776c21a)
- update boxo to v0.34.1-0.20250926171300-4c0aa3a121fb
- update go-libp2p-kad-dht to v0.34.1-0.20250926161957-861573b39723
- update changelog to reference webui v4.9.1

(cherry picked from commit 13fbb76)
@lidel lidel mentioned this pull request Sep 27, 2025
43 tasks
lidel and others added 7 commits September 27, 2025 03:50
(cherry picked from commit a86df5f)
* Upgrade to Boxo v0.35.0

(cherry picked from commit a7ce33c)
Add recommentation for worker count for the sweeping provide system for
users with millions of CIDs.

(cherry picked from commit cf8194a)
* chore: bump go-libp2p-kad-dht

* docs: update dependency versions in changelogs

- update go-libp2p-kad-dht to v0.35.0 release link in v0.38
- move go-ds-pebble v0.5.2 entry to v0.39 changelog

---------

Co-authored-by: Marcin Rataj <[email protected]>
(cherry picked from commit 42a4935)
@lidel lidel marked this pull request as ready for review October 1, 2025 17:59
@lidel lidel merged commit 34debcb into release Oct 1, 2025
14 checks passed
@lidel lidel deleted the release-v0.38.0 branch October 1, 2025 18:14
@lidel lidel restored the release-v0.38.0 branch October 1, 2025 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip/changelog This change does NOT require a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants