Skip to content

lastgenre: Genre ignorelist#6449

Open
JOJ0 wants to merge 10 commits intomasterfrom
lastgenre_forbidden
Open

lastgenre: Genre ignorelist#6449
JOJ0 wants to merge 10 commits intomasterfrom
lastgenre_forbidden

Conversation

@JOJ0
Copy link
Copy Markdown
Member

@JOJ0 JOJ0 commented Mar 19, 2026

Description

Adds a global and artist-specific genre ignorelist to lastgenre. Ignorelist entries can use regex patterns or literal genre names and are configurable per artist or globally. For config examples see submitted docs and _load_ignorelist() docstring.

Additional minor refactoring

  • Fixed condition in "keep original fallback stage" to use config view object directly via .get().
  • Deduplicate finding the correct artist/albumartist attribute with a helper _artist_for_helper

Further notes

My personal roadmap for the lastgenre plugin: #5915

Feature idea originated from: #5721 (comment).

To Do

  • Documentation.
  • Changelog.
  • Tests.

@JOJ0 JOJ0 requested a review from a team as a code owner March 19, 2026 22:03
Copilot AI review requested due to automatic review settings March 19, 2026 22:03
@github-actions
Copy link
Copy Markdown

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

grug see PR add genre ignorelist to lastgenre plugin, so wrong Last.fm tags can be blocked global (*) or per artist. this fit plugin job: pick good genre tags and keep user control.

Changes:

  • add ignorelist file parsing + regex/literal compile, plus shared is_ignored() helper
  • apply ignorelist both right after Last.fm fetch and again during genre resolve/whitelist/c14n
  • add docs + tests for ignorelist matching and file format/errors

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
beetsplug/lastgenre/__init__.py load ignorelist from config file; thread artist through resolve path; filter ignored tags during c14n + final validation
beetsplug/lastgenre/client.py filter ignored tags immediately after Last.fm lookup (while keeping cache)
beetsplug/lastgenre/utils.py new shared is_ignored() helper + ignorelist typing
docs/plugins/lastgenre.rst document ignorelist feature + config option
test/plugins/test_lastgenre.py add ignorelist tests (resolve behavior, matcher behavior, file format + errors)
Comments suppressed due to low confidence (1)

test/plugins/test_lastgenre.py:293

  • grug see this config fixture just return upstream module-scoped config. it not reset state per test, so config['lastgenre']['ignorelist'] (and other keys) can leak between tests and even break LastGenrePlugin() init when prior test set ignorelist path then delete file. grug suggest make this fixture actually reset needed keys (ex: set lastgenre.ignorelist = False here) or use function-scoped fresh config object.
@pytest.fixture
def config(config):
    """Provide a fresh beets configuration for every test/parameterize call

    This is necessary to prevent the following parameterized test to bleed
    config test state in between test cases.
    """
    return config

@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from 1400171 to c17eebb Compare March 19, 2026 22:16
@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Mar 19, 2026

@snejus this is now ready for a first (human) review except the copilot comments I will address. Thanks in advance :-)

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 84.93151% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.19%. Comparing base (e536514) to head (46bde5e).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beetsplug/lastgenre/__init__.py 87.50% 4 Missing and 2 partials ⚠️
beetsplug/lastgenre/client.py 37.50% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6449      +/-   ##
==========================================
+ Coverage   70.05%   70.19%   +0.13%     
==========================================
  Files         147      148       +1     
  Lines       18560    18612      +52     
  Branches     3026     3035       +9     
==========================================
+ Hits        13003    13064      +61     
+ Misses       4924     4913      -11     
- Partials      633      635       +2     
Files with missing lines Coverage Δ
beetsplug/lastgenre/utils.py 100.00% <100.00%> (ø)
beetsplug/lastgenre/client.py 55.10% <37.50%> (+1.91%) ⬆️
beetsplug/lastgenre/__init__.py 81.67% <87.50%> (+6.13%) ⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@snejus
Copy link
Copy Markdown
Member

snejus commented Mar 21, 2026

Just had a quick look at the description and wanted to suggest that ideally we want to stay within the beets configuration YAML for the regex patterns. See my configuration: #6269 (comment), specifically importreplace plugin configuration and bandcamp.field_patterns which use regex patterns heavily.

What specific issue YAML poses in your case?

@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Mar 21, 2026

Cant recall specifically, it was a year ago already. Loads of loads of excaping this and that. regexes bacame tedious to write and even read. Will take a look at how you implemented regex patterns within yaml.

What oposes not a separate format? whitelist and c14n is something else anyway.

I'd say it's simpler to use. Main reason. I gave up back then trying to use ini or yaml for regexes. It was simply painful to use 🤷

this new format I will use as-is for genre aliases too btw. also a reason why I think it's the best way forward for the project.

@snejus
Copy link
Copy Markdown
Member

snejus commented Mar 21, 2026

Cant recall specifically, it was a year ago already. Loads of loads of excaping this and that. regexes bacame tedious to write and even read. Will take a look at how you implemented regex patterns within yaml.

What oposes not a separate format? whitelist and c14n is something else anyway.

I'd say it's simpler to use. Main reason. I gave up back then trying to use ini or yaml for regexes. It was simply painful to use 🤷

this new format I will use as-is for genre aliases too btw. also a reason why I think it's the best way forward for the project.

Personally it's about the convenience - I want all my configuration to live in a single file. And if it's getting unwieldy, I can move, say, lastgenre configuration to a separate file.

For example, my importreplace configuration is huge, so it lives in a separate file:

directory: /mnt/music/Music
library: ~/.music/beets/library.db
include:
  - replacements.yaml

@snejus
Copy link
Copy Markdown
Member

snejus commented Mar 21, 2026

Also I don't see how a separate file improve the regex definitions. Note that yaml is extremely flexible in this regarding, one just needs to be aware of its features. For example, regexes do not need to be defined in a single line:

    artwork:
      pattern: |
        \b(?i:art(?:work)?|cover)\b
        [^|:,.\n-]*?
        (?:[-:]\ ?|by:?\ )
        (
            https:[^ ]+
          | (
            [\w@]
            (?:[^|:,.\n-]|[.]\w)+
            [^|:,.\n -]
          )+
        )
        ([|:,.\n \"-]|$)

Then when you read such configuration in python, make sure to specify re.VERBOSE flag: re.compile(config["regex"].get(), re.VERBOSE) to handle new lines.

@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Mar 22, 2026

Please don't get me wrong: I understand that yaml is our config format and my first versions of this feature when I started about a year ago was of course trying to make it work with yaml.

And yes I am certainly aware of yaml includes and I do use them too :-)

Problems I had were mainly escaping issues, yaml treating regexes that work elsewhere in python do not work when put into yaml. I went to escaping hell and back. I also tried ini as an alternative and that was a little better but still painful.

At some point I decided that it is not worth ot and won't ever be userfriendly, so I put thought and effort into finding the most simple solution from a user perspective.

Open file - add regex from regex101 or self constructed - see if it compiles - and done. No special treatment so yaml accepts it.

That is user friendly and overrules "having everything in one file", which you say for yourself you don't do anyway when stuff gets too big. For last.fm genre ignores, believe me you will have a separate file because it gets big. So your argument is not too strong here.

Anyway I'm not at home/at computer these days and I won't argue further here, so if you insist that yaml is the better way to go, I will do it. It will take some time and it's a bit of a bummer that sets me back from finally getting this over with... If you help me with escaping my regexes for yaml I will be grateful

I see weird things in your config, what is this for example?

"\u2019"
"[\xE2\xA6\xF0\xCB]"
"Chan\xE9"
"Sz\xE1hala"
"Enk\u014D"
"BEHE\u0100DER"

@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Mar 22, 2026

Also I don't see how a separate file improve the regex definitions. Note that yaml is extremely flexible in this regarding, one just needs to be aware of its features. For example, regexes do not need to be defined in a single line:

    artwork:

      pattern: |

        \b(?i:art(?:work)?|cover)\b

        [^|:,.\n-]*?

        (?:[-:]\ ?|by:?\ )

        (

            https:[^ ]+

          | (

            [\w@]

            (?:[^|:,.\n-]|[.]\w)+

            [^|:,.\n -]

          )+

        )

        ([|:,.\n \"-]|$)

Then when you read such configuration in python, make sure to specify re.VERBOSE flag: re.compile(config["regex"].get(), re.VERBOSE) to handle new lines.

multiline regexes dont solve the escaping problems, also I don't really need them for my usecase. What are you trying to say here?

@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Mar 22, 2026

And just so we know what we are actually talking about, this format is really straight forward and a pleasure to read and edit. Not even any quoting required ❤️

alaska:
    ^ska$
    post-hardcore
alves:
    .*metal.*
    .*rock.*
bleak:
    Black Metal
    Dark Ambient
Bob Marley & The Wailers:
    ska
cleric:
    Grindcore
    Punk Rock
dj fresh:
    rhythm and blues
ed rush:
    rhythm and blues
fallen angels:
    Power Metal
    Heavy Metal
    Progressive Metal
    Alternative Rock
    Mathcore
    Metalcore
    Death Metal
    Rock
    grindcore
    death metal
    Thrash Metal
    Psychedelic Rock
    Hard Rock
fierce:
    rhythm and blues
fracture:
    progressive metal
    ^.*(heavy|black|power|death)?\s?(metal|rock)$|\w+-metal\d*$
    punk rock
    post-punk
    dubstep
    industrial
    electronica
heavy weather:
    screamo
    emo
    Stoner Metal
    Thrash Metal
herbie hancock:
    turntablism
    .*rock.*
moby:
    alternative rock
    ^ambient$
mr. scruff:
    ^jazz$
neptune:
    ^.*(heavy|black|power|death)?\s?(metal|rock)$|\w+-metal\d*$
nine:
    hip hop
    rap
paradox:
    Thrash Metal
    Speed Metal
    ^ska$
phobia:
    .*grind.*
    .*crust.*
    .*metal.*
    .*rock.*
placebo:
    .*alternative.*
    .*rock.*
point:
    pop
polar:
    metalcore
    ^hardcore$
regal:
    Garage Rock
    Funk
seba:
    ^ska$
sfr:
    liquid funk
shed:
    techno
    hip-hop
    dub
    rap
    Alternative Rock
smokey joe:
    ska
    punk
soft machine:
    Progressive Rock
solar:
    metal
    k-pop
st germain:
    .*metal.*
    .*rock.*
team scheisse:
    ^rock$
timecode:
    rhythm and blues
    ^.*metal.*$
the horsemen present:
    .*grind.*
    .*crust.*
    .*metal.*
    .*rock.*
vril:
    screamo
    emo
    black metal
    heavy metal
various::
    emo
    post-hardcore
various artists:
    emo
    post-hardcore
xdb:
    dub
    ska
*:
    ^electronic$
    ^rock$
    ^heavy metal$
    ^experimental$
    ^electronica$
    ^trance$

@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Mar 22, 2026

To further give you the full picture of my reasoning: I took quite some ideas for the aliases and iignores features from discontinued whatlastgwnre 3rd party plugin. It had curated lists of very useful patterns for the aliases features.

They worked well for me back then when I used the plugin regularly, so I wanted to continue use them and also include them in my lastgenre implementation of these feature ideas.

I still plan to ship a curated list of aliases with beets! Stuff that works for everyone. Shipping in a sep file is easy.

I tried hard to make them work in yaml and failed. It was a never ending story escaping everything so yaml accepts it:

# tags with ampersands
d(rum)?[ n/]*b(ass)? = drum and bass
drill[ n/]*bass = drill & bass
hard[ n/]*heavy = hard & heavy
r(hythm)?[ n/]*b(lues)? = rhythm & blues
rock[ n/]*roll = rock & roll
stage[ n/]*screen = stage & screen
# consistent delimiters
(c|k|j)[ /]*(folk|goth|hip hop|pop|rock|ska|trance) = \g<1>-\g<2>
(euro(?!p[ae]+n?)|neo|post|tech(?!n[io]))[ /]*(\w+) = \g<1>-\g<2>
(g?lo)[ /]*fi = \g<1>-fi
(glitch|hip|jazz|trip)y?([ /]*hip)?[ /]*hop = \g<1>\shop
(p|g)[ /]*funk = \g<1>-funk
nu[/-]*(disco|jazz|metal|soul) = nu \g<1>
synth[ /-]+(\w+) = synth\g<1>
# abbreviation related
alt(/|er*n(/|ati[fv](es?)?))? = alternative
avant(/|gard)? = avantgarde
# ele[ck]tr(i[ck]|o(ni[ck][as]*)?) = electronic
goth(?!ic) = gothic
prog+(/|r*es+(/|i[fv]e?)?)? = progressive
ps?y(/|ch(/|[aeo]l?del+i[ca]+)?)? = psychedelic
trad(/|ition(/|al)?)?-? = traditional
# other
#(cabaret|comedi(an|e)|humou?r|kabarett|parod(ie|y)) = comedy
#^(film|games?|movies?|t(ele)?v(ision)?|video(s| )?games?) ?(scores?|music)? = soundtrack
#g?old(i(es?)?|en)? = oldies
(8[ /-]*bit|(8[ /-]*)?bit[ /-]*pop|chip([ /-]*tunes?)?) = chiptune
(fem|m)ale[ /]*vo(cal(ist)?|ice)s? = \g<1>ale vocalist
(ost|vgm|scores?|video[ /]*game[ /]*music) = soundtrack
.*top.*[0-9]+.* = charts
best.*of (\w+) = \g<1>
br(eaks?|oken)([ /]*beats?)? = breakbeat
down[ /-]*(beat|tempo?) = downtempo
minimal[ /-]*(electronic|house|techno) = minimal
shoegaze(r|ing?) = shoegaze

I'm giving you all this so you might better understand that I did my due diligence, made my homework and had a long and thoughtful path until I decided that a custom format is the best and most of all userfriendly way.

If you still insist that yaml is the best way I will start doing it. But please do note that I put a lot of thought, work and experimentation into my decisions and it's not something I just decided out of the blue.

@snejus
Copy link
Copy Markdown
Member

snejus commented Mar 22, 2026

multiline regexes dont solve the escaping problems, also I don't really need them for my usecase. What are you trying to say here?

I want to understand this better. Can you give me an example where this is an issue?

@JOJ0 JOJ0 added lastgenre plugin Pull requests that are plugins related labels Mar 27, 2026
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch 2 times, most recently from 66a7d98 to 1785dab Compare March 28, 2026 23:12
@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Mar 28, 2026

Hi again @snejus, I refactored this to have everything right within the beets config. I added some details to docs and docstrings user's should be cautious about when putting there regex into yaml.

I won't be able to recall excatly errors from when I first started with this feature. In the end it doesnt matter. I can only suppose that I might have tried to put regex into double quotes which is a very bad idea. Also I might have had a lot of errors with ini as well and all that added up to the conclusion that yaml and ini are bad -> Invent something that just works and most of all is more user friendly. I still find my first version the most user friendly and it didn't require any "caution don't do this, don't do that" stuff in the docs ;-) But I now learned more about it and it is not too much to worry about. In most cases with regex in yaml it is just best to not use any single or double quotes at all and most stuff will work.

So @snejus the 3 things I'd appreciate if you took special attention to in another review are:

  • How the patterns are loaded from the config. Is that the best way?

  • Is the additinoal docs in the Attention box absolutely correct?

  • Let's re-review in what places the ignorelist should be applied. I've ran over this dozens of times already and it is simply not just one place. In my opinion it should be 3 places:

    • directly after fetching new genres
    • directly after reading existing genres (but we don't actually do that, it happens in resolve_genre, similar to whitelist checks)
    • when handling ancestors during canonicalization

Currently existing genres check happens in resolve_genre, and it is applied possibly more than once. I'm just running through that again to see if there is a redundant check in there.

@snejus
Copy link
Copy Markdown
Member

snejus commented Mar 31, 2026

best to not use any single or double quotes at all and most stuff will work.

You can use them as long as you use yaml literal (or scalar) style blocks:

configuration: |
  you can use
  any
  characters
  '"
  you like

Will review this tomorrow!

Copy link
Copy Markdown
Member

@snejus snejus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a couple of comments and will have another look once these are addressed!

@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Apr 1, 2026

best to not use any single or double quotes at all and most stuff will work.

You can use them as long as you use yaml literal (or scalar) style blocks:

configuration: |
  you can use
  any
  characters
  '"
  you like

Will review this tomorrow!

You misunderstood and I worded poorly: I mean: It is actually quite good / best to not try to use any "" or '' for trying to escape something. Leaving them out completely in one-line-yaml-entries works surprisingly well and that is also what I'd like to transport to users via the docs: Best not use them for escaping :-)

@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from 9d13053 to 4731a1c Compare April 1, 2026 18:25
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch 4 times, most recently from ba99310 to 1cfe0c9 Compare April 2, 2026 17:42
@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Apr 2, 2026

Hi @snejus thanks for the feedback, I tried to address it. I think it's time for another look. I'm not happy with all of it yet but maybe need guidance if I'm on the right track.

Delegating config check to confuse

I have two versions:

This first version in 636aca7 throws a weird error when an artist: [] entry actually is a number (which I suppose will happen in practice, there is numerical artist names out there), so if a user puts in eg. 123: [] they'd get a misleading error:

confuse.exceptions.ConfigTypeError: lastgenre.ignorelist must be a collection, not bool

It's hard to guess what the problem is for a user with that error so I tried to handle that here with a slightly more complex version: 1cfe0c9

Generally, I am not sure if catching those errors and then raising a UserError is the best way. I get huge tracebacks of the type "this exception is the cause of another exception". Since these errors could help a user a lot when developing there ingorelists, I'd like to ask what the best way to catch those confuse errors and simply print them. I know I can catch them and log.info or something, but beets will move on in that case. What is the recommended way to close beets in those case. I was afraid of simply adding a SystemExit(1) or something. I was a little lazy with looking through the codebase to find an example where we clearly end beets because it is a fatal issue.

I just realized that `UserError" alsod doesnt actually exit beets but that's what I actually want to prevent "surprising" results since the error message gets washed away quickly by other beets output!

After all I'd like to be consisten with how behave elsewhere. If that means: huge tracebacks and a tiny error message that is relevant, I'm ok with it.

Version 1 error looks like this:

$ beet -vvv lastgenre -p -f -k -s album -A fracture buzzing
user configuration: /Users/jojo/.config/devbeets/config.yaml
data directory: /Users/jojo/.config/devbeets
plugin paths: []
Loading plugins: autobpm, convert, describe, dirfields, discogs, duplicates, edit, fromfilename, importfeeds, info, lastgenre, mbsync, missing, play, playlist, vibenet
lastgenre: Loading whitelist ~/git/home_config/lastgenre_conf/genres.txt
lastgenre: Loading canonicalization tree ~/git/home_config/lastgenre_conf/genres-tree.yaml
** error loading plugin lastgenre
Traceback (most recent call last):
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/core.py", line 468, in resolve
    value = collection[self.key]  # type: ignore[index]
            ~~~~~~~~~~^^^^^^^^^^
TypeError: 'bool' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jojo/git/beets/beets/plugins.py", line 470, in _get_plugin
    return obj()
           ^^^^^
  File "/Users/jojo/git/beets/beetsplug/lastgenre/__init__.py", line 145, in __init__
    self.setup()
  File "/Users/jojo/git/beets/beetsplug/lastgenre/__init__.py", line 155, in setup
    self.ignorelist: Ignorelist = self._load_ignorelist()
                                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/git/beets/beetsplug/lastgenre/__init__.py", line 229, in _load_ignorelist
    validated_config = self.config["ignorelist"].get(template)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/core.py", line 317, in get
    return templates.as_template(template).value(self, template)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/templates.py", line 197, in value
    {k: v.value(view[k], self) for k, v in self.subtemplates.items()}
        ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/templates.py", line 232, in value
    for item in view.sequence():
                ^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/core.py", line 243, in sequence
    collection, _ = self.first()
                    ^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/core.py", line 93, in first
    return util.iter_first(pairs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/util.py", line 29, in iter_first
    return next(it)
           ^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/core.py", line 477, in resolve
    raise ConfigTypeError(
confuse.exceptions.ConfigTypeError: lastgenre.ignorelist must be a collection, not bool
Sending event: pluginload
library database: /Users/jojo/.config/devbeets/library.db
library directory: /Users/jojo/Music/devbeets
Sending event: library_opened
error: unknown command 'lastgenre'

Version 2 now accepts a numeric artist but with a config like this would clearly notify about the actual issue:

        123:
            - #dub
            - ska

$ beet -vvv lastgenre -p -f -k -s album -A fracture buzzing
user configuration: /Users/jojo/.config/devbeets/config.yaml
data directory: /Users/jojo/.config/devbeets
plugin paths: []
Loading plugins: autobpm, convert, describe, dirfields, discogs, duplicates, edit, fromfilename, importfeeds, info, lastgenre, mbsync, missing, play, playlist, vibenet
lastgenre: Loading whitelist ~/git/home_config/lastgenre_conf/genres.txt
lastgenre: Loading canonicalization tree ~/git/home_config/lastgenre_conf/genres-tree.yaml
** error loading plugin lastgenre
Traceback (most recent call last):
  File "/Users/jojo/git/beets/beetsplug/lastgenre/__init__.py", line 230, in _load_ignorelist
    ].get(confuse.Sequence(str))
      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/core.py", line 317, in get
    return templates.as_template(template).value(self, template)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/templates.py", line 233, in value
    out.append(self.subtemplate.value(item, self))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/templates.py", line 96, in value
    return self.convert(value, view)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/templates.py", line 305, in convert
    self.fail("must be a string", view, True)
  File "/Users/jojo/.pyenv/versions/3.12.7/envs/beets312/lib/python3.12/site-packages/confuse/templates.py", line 137, in fail
    raise exc_class(f"{view.name}: {message}")
confuse.exceptions.ConfigTypeError: lastgenre.ignorelist#123#0: must be a string

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jojo/git/beets/beets/plugins.py", line 470, in _get_plugin
    return obj()
           ^^^^^
  File "/Users/jojo/git/beets/beetsplug/lastgenre/__init__.py", line 146, in __init__
    self.setup()
  File "/Users/jojo/git/beets/beetsplug/lastgenre/__init__.py", line 156, in setup
    self.ignorelist: Ignorelist = self._load_ignorelist()
                                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jojo/git/beets/beetsplug/lastgenre/__init__.py", line 234, in _load_ignorelist
    raise UserError(str(e)) from e
beets.ui.UserError: lastgenre.ignorelist#123#0: must be a string
Sending event: pluginload
library database: /Users/jojo/.config/devbeets/library.db
library directory: /Users/jojo/Music/devbeets
Sending event: library_opened
error: unknown command 'lastgenre'

drop_ignored_genres helper

You'll see it in the diff. is_ignored is called in 5 places, in 2 of them I could deduplicate something. Maybe there is room for improvement.

Tests

Not fixed yet, since I am not sure about the final direction of the confuse error handling.

@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch 7 times, most recently from 3866f73 to d2b0e00 Compare April 3, 2026 07:36
@JOJ0
Copy link
Copy Markdown
Member Author

JOJ0 commented Apr 3, 2026

@snejus ready for another look. Updated PR description to reflect latest state and picked a tiny simplification I discovered in a forthcoming PR (#6474) because it makes a lot of sense here already: 6536937

Also finally fixed tests and even removed one that was redundant. Please check.

@JOJ0 JOJ0 requested a review from snejus April 3, 2026 15:48
JOJ0 added 10 commits April 3, 2026 20:03
- Test file format (valid and error cases)
- Test regex pattern matching (_is_ignored)
- Test _resolve_genres: ignored genres filtered
- Test _resolve_genres: c14n ancestry walk blocked for ignored tags
- Test _resolve_genres: without whitelist, oldest ancestor is kept
- Test _resolve_genres: whithout whitelist, ignored oldest ancestor is
  removed
- Prevents wrong last.fm genres based on a per artist (or global) list of regex
  patterns that should be ignored.
  - Genre _ignoring_ happens in two places: Right after fetching the
    last.fm genre and in _resolve_genres.
  - As a fallback literal string matching can be used instead of
    supplying a regex pattern
- Includes docs for the new feature.
and compile regex inline. extra_debug is now better readable
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from 6536937 to 46bde5e Compare April 3, 2026 18:03
Copy link
Copy Markdown
Member

@snejus snejus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, added several comments

Comment on lines +357 to +359
genre
for genre in cleaned
if not self.whitelist or genre.lower() in self.whitelist
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the simpler variable g to slightly reduce verbosity here.

Comment on lines +376 to +381
return (
getattr(obj, "artist", None)
if isinstance(obj, library.Item)
else getattr(obj, "albumartist", None)
or getattr(obj, "artist", None)
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be just

Suggested change
return (
getattr(obj, "artist", None)
if isinstance(obj, library.Item)
else getattr(obj, "albumartist", None)
or getattr(obj, "artist", None)
)
return (
obj.artist
if isinstance(obj, library.Item)
else obj.albumartist or obj.artist
)

Comment on lines +38 to +42
return [
genre
for genre in genres
if not is_ignored(logger, ignorelist, genre, artist)
]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return [
genre
for genre in genres
if not is_ignored(logger, ignorelist, genre, artist)
]
return [
g
for g in genres
if not is_ignored(logger, ignorelist, g, artist)
]

Comment on lines +52 to +53
if not ignorelist:
return False
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this

Suggested change
if not ignorelist:
return False


from beets.logging import BeetsLogger

Ignorelist = dict[str, list[re.Pattern[str]]]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name Ignorelist is a bit misleading given the shape of it. It should rather be called IgnoreMap: it maps artists to their respective ignorelists.

Probably worth renaming the configuration to ignoremap as well, potentially even artist_ignoremap.

Filtering is done in two places: when fetching genres from Last.fm and when
resolving to a final genre list (during canonicalization and whitelisting).

This means that existing genres are also filtered when the ``force`` and
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This means that existing genres are also filtered when the ``force`` and
This means that existing genres are also filtered when ``force`` and

Comment on lines +179 to +180
``keep_existing`` options are enabled (or ``cleanup_existing`` is enabled with
``force: no``).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency:

Suggested change
``keep_existing`` options are enabled (or ``cleanup_existing`` is enabled with
``force: no``).
``keep_existing`` options are enabled (or ``cleanup_existing`` is enabled with
``force`` disabled).

key applies globally to all artists — use it to block genres you never want,
regardless of artist. Patterns are matched against the full genre string, so a
plain ``metal`` will not match ``heavy metal`` unless you write a regex like
``.*metal.*``.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
``.*metal.*``.
``.*metal``.


- The global key ``'*'`` **must** be surrounded by single quotes so that
YAML does not interpret it as an anchor.
- Any regex pattern that starts with special YAML characters (especially
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Any regex pattern that starts with special YAML characters (especially
- Any regex pattern that starts with a special YAML character (especially

# Ignorelist pattern matching tests for _is_ignored, independent of _resolve_genres


@pytest.mark.parametrize(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Place these ignorelist tests under a class, e.g. TestIgnoreList

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lastgenre plugin Pull requests that are plugins related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants