feat: to/too mixups in set phrases #1839

hippietrail · 2025-08-31T07:20:48Z

Issues

Helps with #1832

Description

Looks for idioms and set phrases where "to" has been used but "too" is required.
Also covers one case of the opposite mistake.
Also moves and expands "spoke to soon" from phrase_corrections to phrase_set_corrections.

I only included phrases that got more than a few search hits.
For those that got plenty of search hits but also many false positives, I left comments so people won't hastily try to add them.
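
As a rough illustration of the kind of table described above, here is a minimal, self-contained sketch. It does not use Harper's actual API: `TO_TOO_FIXES` and `find_to_too_mixups` are hypothetical names, the matching is plain substring search, and only the two entries visible in this PR's diff are included.

```rust
/// Hypothetical table mapping a mistaken set phrase to its correction.
/// Both entries are taken from the diff in this PR; the lookup logic is illustrative only.
const TO_TOO_FIXES: &[(&str, &str)] = &[
    ("life is to short", "life is too short"),
    ("put to fine a point", "put too fine a point"),
    // "one to many" is deliberately left out: it has many false positives.
];

/// Returns (byte offset, suggested phrase) for every flagged phrase in `text`.
/// Assumption: naive substring matching with no word-boundary checks.
pub fn find_to_too_mixups(text: &str) -> Vec<(usize, &'static str)> {
    // ASCII-only lowercasing keeps byte offsets aligned with the original text.
    let lower = text.to_ascii_lowercase();
    let mut hits = Vec::new();
    for &(wrong, right) in TO_TOO_FIXES {
        for (start, _) in lower.match_indices(wrong) {
            hits.push((start, right));
        }
    }
    // Report hits in the order they appear in the text.
    hits.sort();
    hits
}
```

Leaving a phrase like "one to many" out of the table, with a comment explaining why, is the same trade-off the description makes: a missed error is cheaper than a flood of false positives.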

How Has This Been Tested?

I added one unit test for each flagged pattern, all or most of them sourced from GitHub.

Checklist

I have performed a self-review of my own code
I have added tests to cover my changes

elijah-potter

Looks mostly ready. I just have one minor nitpick.

elijah-potter · 2025-09-08T14:23:17Z

harper-core/src/linting/phrase_set_corrections/mod.rs

@@ -353,6 +353,49 @@ pub fn lint_group() -> LintGroup {
            "Corrects `rise the question` to `raise the question`.",
            LintKind::Grammar
        ),
+        "ToToo" => (


I'm worried this would get confused with the ToTwoToo rule. Maybe IdiomaticToToo?

hippietrail · 2025-09-08 (edited)

I think there's a tension between names that are useful for users, since they appear in the options, and names that don't conflict for the programmers.

I think we were working on these at the same time. All uses of "to" and "too" are idiomatic. The proper solution is surely to merge them by way of a LintGroup. I think, but am not sure, that there's no way to group lints together so that only the name of the group is exposed to the user. I looked into that a week ago but have forgotten.

(But see my other comment. I forgot that this PR was a phrase_corrections one.)

As a side note, while "two" is a third homophone of these, I've never seen it mixed up with the others in either direction. People seem to find one difference a lot more salient than the other.
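
To make the grouping idea concrete, here is a minimal sketch of several rules sharing one user-facing setting. Everything in it is hypothetical: `Rule`, `PhraseRule`, `Bundle`, and `Registry` are stand-ins invented for illustration and are not Harper's `Linter` or `LintGroup` types, whose real behaviour is not confirmed here.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a lint rule (not Harper's actual trait).
pub trait Rule {
    fn check(&self, text: &str) -> Vec<String>;
}

/// A phrase-based rule; a dedicated linter could implement `Rule` as well.
pub struct PhraseRule {
    pub wrong: &'static str,
    pub right: &'static str,
}

impl Rule for PhraseRule {
    fn check(&self, text: &str) -> Vec<String> {
        if text.to_ascii_lowercase().contains(self.wrong) {
            vec![format!("use `{}`", self.right)]
        } else {
            Vec::new()
        }
    }
}

/// All rules that share one setting name are switched on or off together.
pub struct Bundle {
    enabled: bool,
    rules: Vec<Box<dyn Rule>>,
}

#[derive(Default)]
pub struct Registry {
    bundles: BTreeMap<String, Bundle>,
}

impl Registry {
    /// Registers `rule` under `setting`, creating the bundle on first use.
    pub fn add(&mut self, setting: &str, rule: Box<dyn Rule>) {
        self.bundles
            .entry(setting.to_string())
            .or_insert_with(|| Bundle { enabled: true, rules: Vec::new() })
            .rules
            .push(rule);
    }

    pub fn set_enabled(&mut self, setting: &str, enabled: bool) {
        if let Some(bundle) = self.bundles.get_mut(setting) {
            bundle.enabled = enabled;
        }
    }

    /// Only bundle names are exposed to the user, never individual rule names.
    pub fn setting_names(&self) -> Vec<&str> {
        self.bundles.keys().map(String::as_str).collect()
    }

    pub fn check(&self, text: &str) -> Vec<String> {
        let mut out = Vec::new();
        for bundle in self.bundles.values().filter(|b| b.enabled) {
            for rule in &bundle.rules {
                out.extend(rule.check(text));
            }
        }
        out
    }
}
```

With this shape, internal rule names only need to be unique for the programmers, while users see a single toggle per bundle.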

elijah-potter · 2025-09-08T14:26:16Z

harper-core/src/linting/phrase_set_corrections/mod.rs

+                (&["life is to short"], &["life is too short"]),
+
+                // "one to many" has many false positives
+                (&["put to fine a point"], &["put too fine a point"], ),


I'm wondering if we can take a more data-driven approach to this file as a whole. Is there a public dataset of these kinds of corrections?

hippietrail · 2025-09-08

Oh yeah, sorry. In my previous comment I forgot how I had made this PR! For now I think ToTooIdioms and/or TooToIdioms, or maybe ...InIdioms, but I still think putting them behind a single setting is best, even when some are phrase corrections and some are a dedicated linter. (Not Idiomatic, since that term has a broader meaning than idiom.)

Since these are all idioms, I used the Wiktionary article for "too", which links to all the idioms it's part of. I checked them one by one, since some are too marginal to include.

You could consider that a kind of "data-driven" approach. Idioms are special, so finding lists of them already assembled is probably the way to go. For fixing "to" vs "too" in the general case, outside of idioms, a more "big data" approach that analyses neighbouring word properties would be best. The same goes for all general grammar linters, especially syntactic ones. It would require a good corpus of good English and, ideally, a good corpus of bad English as well.
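
A very small sketch of the corpus idea, under heavy simplifying assumptions: it looks only at the single word that follows "to" or "too", uses raw counts from a corpus of correct English, and ignores word properties such as part of speech as well as any corpus of bad English. The function names `next_word_counts` and `prefer` are hypothetical.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Counts how often each word follows "to" and "too" in a corpus of correct English.
/// Assumption: crude tokenisation that also pairs words across sentence boundaries.
pub fn next_word_counts(corpus: &str) -> HashMap<(String, String), usize> {
    let words: Vec<String> = corpus
        .split(|c: char| !c.is_alphabetic() && c != '\'')
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect();
    let mut counts = HashMap::new();
    for pair in words.windows(2) {
        if pair[0] == "to" || pair[0] == "too" {
            *counts.entry((pair[0].clone(), pair[1].clone())).or_insert(0) += 1;
        }
    }
    counts
}

/// Suggests "to" or "too" before `next`, or `None` when the counts are tied or absent.
pub fn prefer(counts: &HashMap<(String, String), usize>, next: &str) -> Option<&'static str> {
    let next = next.to_lowercase();
    let to = counts.get(&("to".to_string(), next.clone())).copied().unwrap_or(0);
    let too = counts.get(&("too".to_string(), next)).copied().unwrap_or(0);
    match to.cmp(&too) {
        Ordering::Greater => Some("to"),
        Ordering::Less => Some("too"),
        Ordering::Equal => None,
    }
}
```

Returning `None` on a tie keeps the sketch conservative: with no clear signal from the corpus, nothing is flagged.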

hippietrail added 3 commits August 31, 2025 16:16

feat: to/too mixups in set phrases

3472777

Merge branch 'master' into too-to

6c74d45

Merge branch 'master' into too-to

07420a6

elijah-potter reviewed Sep 8, 2025

elijah-potter requested changes Sep 8, 2025
