-
Notifications
You must be signed in to change notification settings - Fork 432
Fix over-eager file matching in skip files and --file
option
#4670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
--file=a.c should not match a.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good find, I think this should be the default behavior that we don't include an extra *
at the end of the regex pattern.
However, as you mentioned, this change could easily break users' skipfiles, so this change should be clearly documented in the upcoming release notes.
I'm pretty anxious about this one. How widely is the trailing |
I have added a clause to add the star to entries that are evidently folders. (filenames ending with '/') |
What if the user specified a directory without a
|
I cannot be sure if the path you gave is for a file named build or for a folder, so I do not handle that case. During a conversation with @HoBoIs, I got the suggestion to, instead of adding a This way, this change is NOT breaking anymore! This would solve the issue of The only consideration to be taken now is: what if the user placed Currently there is a slight issue with the //Z fnmatch.translate places to the end of the user defined regex, but I hope to iron out these issues. |
I have fixed the regex; the issue was fnmatch placing an end of line/end of input. I have moved this to the end of the appended regex. I also added a comment explaining what the regex does, so in the future, it will be easier to decipher. We have been thinking about the windows side of things too. Does the paths in compile_commands.json contain '' or '/' on Windows? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
See my comment about a minor issue.
We should also investigate which slash type is used on Windows.
# https://docs.python.org/3/library/os.path.html#os.path.normpath | ||
rexpr = re.compile( | ||
fnmatch.translate(norm_skip_path + '*')) | ||
fnmatch.translate(norm_skip_path)[:-2] + r"(?:/.*)?\Z") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If fnmatch.translate()
function's implementation changes, e.g. it no longer puts \Z
in the end, then [:-2]
breaks the regex. So instead of [:-2]
consider using rstrip()
Also, is there a reason we are using \Z
in the end of the regex and not $
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote a simple check to see if \Z is there, before removing it.
rstrip(r"\Z") would have stripped away any path ending in z.
e.g: "abcz\Z" -> "abc"
I have added a check to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Why:
In the current implementation of the skipfiles, every file declared will match any other file that contains the declared file name at the beginning of its name. An example would be:
simple.c
andsimple.cpp
. Specifyingsimple.c
with the--flag
option would result in a regex:/path/to/simple.c.*
What:
Removed adding '*' to the end of any regex given through
--file
or the skipfile.Notes:
According to the documentation, this change is breaking, but I could see some users declaring directories in their skip file without placing a '*' at the end.
Fixes: #4664