Skip to content

Conversation

hippietrail
Copy link
Collaborator

Issues

Help out with #1832

Description

Looks for idioms and set phrases that should use "too" where instead "to" has been used.
Also one case of the opposite.
Also moved and expanded "spoke to soon" from phrase_corrections to phrase_set_corrections.

I only included those that got more than a few search hits.
For others which got plenty of search hits but also lots of false positives, I included comments so people won't hastily try to add them.

How Has This Been Tested?

I added one unit test for each pattern that's flagged, all or most sourced from GitHub.

Checklist

  • I have performed a self-review of my own code
  • I have added tests to cover my changes

Copy link
Collaborator

@elijah-potter elijah-potter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly ready, I just have the one minor nitpick.

@@ -353,6 +353,49 @@ pub fn lint_group() -> LintGroup {
"Corrects `rise the question` to `raise the question`.",
LintKind::Grammar
),
"ToToo" => (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried this would get confused with the ToTwoToo rule. Maybe IdiomaticToToo?

Copy link
Collaborator Author

@hippietrail hippietrail Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried this would get confused with the ToTwoToo rule. Maybe IdiomaticToToo?

I think there's a tension between names which will be useful for users since they're used in the options, and names which don't conflict for the programmers.

I think we were working on these at the same time. All uses of "to" and "too" are idiomatic. The proper solution is surely to merge them by way of a LintGroup. I think but am not sure that there's not a way to group lints together so that only the name of the group is exposed to user? I looked into that a week ago but forget.

(But see my other comment. I forget this PR was a phrase_corrections one.)

As a sidenote, while "two" is a third homophone for these, I've never seen it mixed up for the others in either direction. People seem to find one difference a lot more salient than the other.

(&["life is to short"], &["life is too short"]),

// "one to many" has many false positives
(&["put to fine a point"], &["put too fine a point"], ),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we can take a more data driven approach to this file as a whole. Is there a public dataset of these kinds of corrections?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah sorry. In my previous comment I forget the way I made this PR! For now I think ToTooIdioms and/or TooToIdioms or maybe ...InIdioms, but I still think putting them behind a single setting is best, even when some are phrase corrections and some are a dedicated linter. (Not Idiomatic since that term has a broader meaning than idiom.)

Since these ones are all in idioms I used the Wiktionary article for "too", which links all the idioms it's a part of. I checked them one by one since some are too marginal to include.

You could consider that a kind of "data driven". Idioms are special so finding lists of them already assembled is probably the way. For fixing "to" vs "too" in the general case outside of idioms a more "big data" approach analysing neighbouring word properties would be the best. As for all general grammar linters, especially syntax. It would require a good corpus of good English and a good corpus of bad English as well, ideally.

Copy link
Collaborator

@elijah-potter elijah-potter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly ready, I just have the one minor nitpick.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants