Skip to content

Conversation

arsaboo
Copy link
Contributor

@arsaboo arsaboo commented Oct 2, 2025

Fixes #6036

  • Changelog. (Add an entry to docs/changelog.rst to the bottom of one of the lists near the top of the document.)

@arsaboo arsaboo requested a review from a team as a code owner October 2, 2025 12:25
@Copilot Copilot AI review requested due to automatic review settings October 2, 2025 12:25
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses issue #6036 by ensuring that flexible MusicBrainz metadata fields are refreshed during reimport operations, allowing format changes to be properly applied.

  • Adds specific MusicBrainz fields to the reimport fresh fields lists
  • Implements automatic extension of reimport fresh fields during plugin initialization
  • Updates changelog to document the bug fix

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
docs/changelog.rst Documents the bug fix for refreshing flexible MusicBrainz metadata on reimport
beetsplug/musicbrainz.py Implements the core functionality to extend reimport fresh fields with MusicBrainz-specific fields

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Member

@snejus snejus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a different approach - this is not a good idea

"""Ensure MusicBrainz fields stored as flex attrs refresh on reimport."""
for field in _MB_REIMPORT_FRESH_FIELDS_ALBUM:
if field not in REIMPORT_FRESH_FIELDS_ALBUM:
REIMPORT_FRESH_FIELDS_ALBUM.append(field)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adjusts definitions of these lists that are defined as constants which is a nightmare to debug.

If someone attempts to understand/debug this functionality in the future, they will see

# beets/importer/tasks.py

REIMPORT_FRESH_FIELDS_ALBUM = [
    "data_source",
    "bandcamp_album_id",
    "spotify_album_id",
    "deezer_album_id",
    "beatport_album_id",
    "tidal_album_id",
]

And they will be left scratching their heads asking why beets adjusts some additional fields not present in this list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha...I am still scratching my head about this convoluted logic. Reimport should update everything.


_MB_REIMPORT_FRESH_FIELDS_ALBUM = [
"media",
"releasegroup_id",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like both Album and Item have mb_releasegroupid field present in their tables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant by this comment was: these fields are in the DB for each relation, so they can't be flexible attributes, I think?

@semohr
Copy link
Contributor

semohr commented Oct 4, 2025

Do we want to rewrite this now? Sadly the current logic is how it was intended for some reason originally.

It alsso seems a bit strange to me that we have to define these field in the first place. An opt out list would be more appropriet.

We should also check why and when this was introduced.

@snejus
Copy link
Member

snejus commented Oct 4, 2025

We can (and should) simply extend those list constants explicitly with new fields. They aren't specific to musicbrainz only - for example, beetcamp returns both data_url and media.

@JOJ0
Copy link
Member

JOJ0 commented Oct 4, 2025

Do we want to rewrite this now? Sadly the current logic is how it was intended for some reason originally.

It alsso seems a bit strange to me that we have to define these field in the first place. An opt out list would be more appropriet.

AFAIR this is an opt out list: Everything should always be refreshed on a reimport, except not for those fields (because it would be just wrong for them)

We should also check why and when this was introduced.

I introduced this a couple of years ago. Not much time right now to reread what was all the reasons back then but I at least looked up the PR for you guys :-) #4735

@snejus
Copy link
Member

snejus commented Oct 4, 2025

I think I see this logic in play in my importlog:

Reimported album 9355. Not preserving flexible attributes ['data_source']. Path: /run/media/sarunas/music/Music/soma/uun
Reimported album 9355. Preserving attribute ['added']. Path: /run/media/sarunas/music/Music/soma/uun
Reimported album 9355. Preserving flexible attributes ['art_source']. Path: /run/media/sarunas/music/Music/soma/uun
Reimported item 43942. Preserving attribute ['added']. Path: /run/media/sarunas/music/Music/soma/uun/02_crueltyisnothingnew.flac
Reimported item 43942. Not preserving flexible attributes ['data_source']. Path: /run/media/sarunas/music/Music/soma/uun/02_crueltyisnothingnew.flac
Reimported item 43942. Preserving flexible attributes ['lastfm_play_count', 'bandcamp_artist_id', 'art_source', 'bandcamp_album_id', 'times_bought', 'rating', 'artpath', 'price', 'helicopta', 'play_count', 'art_url', 'bandcamp_track_id', 'currency', 'last_played']. Path: /run/media/sarunas/music/Music/soma/uun/02_crueltyisnothingnew.flac
Reimported item 43943. Preserving attribute ['added']. Path: /run/media/sarunas/music/Music/soma/uun/03_givethemrope.flac
Reimported item 43943. Not preserving flexible attributes ['data_source']. Path: /run/media/sarunas/music/Music/soma/uun/03_givethemrope.flac
Reimported item 43943. Preserving flexible attributes ['lastfm_play_count', 'skip_count', 'bandcamp_artist_id', 'art_source', 'bandcamp_album_id', 'times_bought', 'rating', 'artpath', 'price', 'play_count', 'art_url', 'bandcamp_track_id', 'currency', 'last_played']. Path: /run/media/sarunas/music/Music/soma/uun/03_givethemrope.flac

@JOJ0
Copy link
Member

JOJ0 commented Oct 4, 2025

We can (and should) simply extend those list constants explicitly with new fields. They aren't specific to musicbrainz only - for example, beetcamp returns both data_url and media.

Explicitly! Yes please! Keep it readable. If there would be something that is absolutely plugin specific, we could simply note it in a comment?

@JOJ0
Copy link
Member

JOJ0 commented Oct 4, 2025

I think I see this logic in play in my importlog:

exactly! this is it!

@snejus
Copy link
Member

snejus commented Oct 4, 2025

For now, I think, we can just extend those lists and merge this PR. Longer term, we need to find why flexible attribute update logic differs from db attribute update.

DB attributes are always updated with values other than None. None indicates absence of a field, and thus this field is ignored in updates. This can be changed by overwrite_null.album and overwrite_null.track configuration: when a field is present in this list, None means empty and the previous value is cleared on updates.

@arsaboo
Copy link
Contributor Author

arsaboo commented Oct 5, 2025

Here's another discussion on this topic #4788

So, how do we want to resolve this PR? I am happy if one of you would like to take this forward.

@arsaboo
Copy link
Contributor Author

arsaboo commented Oct 6, 2025

@snejus, I have updated the list for now. We can merge this for now and have a separate discussion about the changes required to fix the underlying issue. It may also not be a bad idea for you to fix this in #6052
@semohr I can create a separate PR for #6076 once this one is done (so I know what is needed)

Copy link
Member

@snejus snejus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these changes from musicbrainz.py and add the fields to the lists in importer.tasks instead.

@arsaboo
Copy link
Contributor Author

arsaboo commented Oct 7, 2025

@snejus check now.

This may be beyond the scope of the current PR, but I assume that title, album, albumartist, etc. are still not updated on reimport. I don't see any reason why that should be the case. Like you said

Longer term, we need to find why flexible attribute update logic differs from db attribute update.

@snejus
Copy link
Member

snejus commented Oct 7, 2025

@arsaboo it's impossible to have a flexible attribute named after any of the database fields. If you try to set item.title = "new_title", you will update the database field item.title. See the following logic under class Model:

    def _setitem(self, key, value):
        """Assign the value for a field, return whether new and old value
        differ.
        """
        # Choose where to place the value.
        if key in self._fields:
            source = self._values_fixed
        else:
            source = self._values_flex

_values_fixed property holds database fields like album, title, albumartist, artist etc., while _values_flex contains everything else.

"beatport_album_id",
"tidal_album_id",
"media",
"releasegroup_id",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What prompted you to add this field, actually?

I don't have a single item or an album with such flexible attribute:

$ sqlite3 ~/.music/beets/library.db "select * from album_attributes where key == 'releasegroup_id'"
$ sqlite3 ~/.music/beets/library.db "select * from item_attributes where key == 'releasegroup_id'"

And I can see it is saved as item.mb_releasegroupid field in the database:

beets/autotag/__init__.py-313-        item.mb_albumartistid = album_info.artist_id
beets/autotag/__init__.py-314-        item.mb_albumartistids = album_info.artists_ids
beets/autotag/__init__.py:315:        item.mb_releasegroupid = album_info.releasegroup_id
beets/autotag/__init__.py-316-
beets/autotag/__init__.py-317-        # Compilation flag.

Copy link
Contributor Author

@arsaboo arsaboo Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was internally referred to as releasegroup_id. In the MusicBrainz plugin and in autotag/hooks.py, the field is called releasegroup_id; in the library models, it's stored as mb_releasegroupid; in the reimport fresh fields, it's using releasegroup_id (the internal field name used during the matching process)

So the field name releasegroup_id in REIMPORT_FRESH_FIELDS_ITEM maps to mb_releasegroupid in the database during the import process (as shown in line 315 of autotag/init.py: item.mb_releasegroupid = album_info.releasegroup_id).

BTW, these are some of the issues I hope you can resolve in your other PR #6052

…M and REIMPORT_FRESH_FIELDS_ALBUM definitions
@semohr
Copy link
Contributor

semohr commented Oct 8, 2025

Longer term, we need to find why flexible attribute update logic differs from db attribute update.

Before merging, could you please create an issue summarizing the situation, what’s different, and how or why we might want to change it? That way we can keep track of it and organize our thoughts for future work. 😉

@arsaboo
Copy link
Contributor Author

arsaboo commented Oct 8, 2025

Longer term, we need to find why flexible attribute update logic differs from db attribute update.

Before merging, could you please create an issue summarizing the situation, what’s different, and how or why we might want to change it? That way we can keep track of it and organize our thoughts for future work. 😉

HAHA....I wish I had that clarity 🤣

I honestly did not even completely follow @snejus comment. @JOJ0 and @snejus are the closest to this, and I will let them summarize and resolve this in a subsequent PR.

@semohr
Copy link
Contributor

semohr commented Oct 8, 2025

I kinda lost the sauce too. That's why im asking :D

@snejus
Copy link
Member

snejus commented Oct 8, 2025

I often fail to communicate properly... I hope that this helps, guys, see: https://www.perplexity.ai/search/can-you-explain-what-is-snejus-eO3.NbE8TM.d.1FOARnPqA

@arsaboo
Copy link
Contributor Author

arsaboo commented Oct 8, 2025

I often fail to communicate properly... I hope that this helps, guys, see: https://www.perplexity.ai/search/can-you-explain-what-is-snejus-eO3.NbE8TM.d.1FOARnPqA

It was not your explanation, but my lack of understanding. This helps. To summarize (and for posterity),

Database Fields vs. Flexible Attributes: Beets treats two types of fields differently:

Database Fields (_values_fixed):

  • These are predefined schema fields like title, album, albumartist, artist, etc.
  • Stored directly in the SQLite database tables as columns
  • Always updated during reimport when new values are found (unless the value is None)
  • Updates are controlled by the overwrite_null configuration for handling None values

Flexible Attributes (_values_flex)

  • Custom fields that users can add (like mood, play_count, etc.)
  • Stored in separate key-value tables (album_attributes and item_attributes)
  • Follow different update logic during reimport operations
  • Subject to the REIMPORT_FRESH_FIELDS lists that determine which flexible attributes get refreshed

The REIMPORT_FRESH_FIELDS lists act as an allowlist - only flexible attributes in these lists get updated during reimport, while all others are preserved.

Let me know if anything else is required in this PR.

@snejus
Copy link
Member

snejus commented Oct 8, 2025

There's no need to add releasegroup_id - see the last section in https://www.perplexity.ai/search/can-you-explain-what-is-snejus-eO3.NbE8TM.d.1FOARnPqA#2.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

import: Media type won't get updated when retagging album
4 participants