Fragment parameter fixes #852
Pull Request Test Coverage Report for Build 16131346449
💛 - Coveralls
The `url_ext` mod is itself feature gated, so any declarations within it are already implicitly v2 only.
Although RFC 3986 (URIs) does not assign any special meaning to `+`, in fragment parameters or in general, RFC 1866 (HTML 2.0) section 7.5 uses it as a delimiter for keywords in query parameters. As a result some URI libraries interpret `+` in URIs as ` `, even in fragment parameters. Although not insurmountable (such a transformation of BIP 77 URIs is reversible because ` ` is not used and `+` was only used for fragment parameter delimitation), this presents friction and is in general confusing, so to improve compatibility with such libraries `-` is now used instead. It has no reserved meaning as a sub-delimiter. For the time being, both `+` and `-` will be accepted when parsing, but only `-` will be used when encoding fragment parameters.
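As a rough illustration of the "accept both when parsing, emit only `-` when encoding" behavior described in this commit message, here is a minimal sketch. The function names `split_fragment_params` and `join_fragment_params` are hypothetical and are not taken from the `payjoin` crate, and the parameter values in any usage are placeholders rather than real bech32 strings. Note that this sketch tolerates mixed delimiters.

```rust
/// Delimiter written when encoding fragment parameters (assumed from the
/// commit message: only `-` is used on output).
const ENCODE_DELIMITER: &str = "-";

/// Hypothetical helper: split a fragment into its parameters, accepting
/// either the legacy `+` or the new `-` as a delimiter.
fn split_fragment_params(fragment: &str) -> Vec<&str> {
    fragment
        .split(|c: char| c == '+' || c == '-')
        // Skip empty pieces produced by leading, trailing, or doubled delimiters.
        .filter(|p| !p.is_empty())
        .collect()
}

/// Hypothetical helper: join parameters using only the new delimiter.
fn join_fragment_params(params: &[&str]) -> String {
    params.join(ENCODE_DELIMITER)
}
```

Parsing a legacy `+`-delimited fragment with `split_fragment_params` and re-encoding it with `join_fragment_params` yields the same parameters joined by `-`.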
19e9cf2
to
452e4b3
I'm really not sure that the last commit needs to go in (other than the base64-bech32 typo fix). What kind of errors does that prevent that would not otherwise be caught by bech32 parsing?
payjoin/src/core/uri/url_ext.rs
Outdated
// check for allowed delimiters
if c == b'-' {
    has_dash = true;
} else if c == b'+' {
    has_plus = true;
}
Does this need to be in the loop? Can't `fragment.contains('-')` and `fragment.contains('+')` be used outside of the loop in the match for greater legibility?
i don't agree that that's more legible, since these conditions are mutually exclusive with the charset range, so conceptually it seems even more confusing to put these characters in the range of characters that are also included
not that efficiency really matters, but also scanning through the fragment once instead of 3 times seems reasonable
i roughly implemented this suggestion but i think i still prefer the older approach, since repeating the logic with `c != '-' && c != '+'` places this information about the allowed delimiters in two different places instead of just one, which i find less legible than the slightly clunkier if else stuff
yeah I understand why you'd put it all together now and am comfortable with that though would merge this as-is at this point.
    AmbiguousDelimiter,
}

fn check_fragment_delimiter(fragment: &str) -> Result<char, ParseFragmentError> {
This function does a hell of a lot more than check the fragment delimiter.
please suggest a better name assuming the newly proposed behavior (check no fragment ambiguity and that the fragment ~= `/^[A-Z0-9+\-]*$/`)
check_fragment_charset ?
i did not take the suggested name because `Result<char, ParseFragmentError>` is not self-explanatory in terms of what this method returns; determining the delimiter to use is the important thing this function does, and checking the charset is kind of a precondition
Yeah I think I agree. By the time I wrote it, especially the error conversion boilerplate, I had kinda regretted it, but I pushed anyway so we can discuss. I'm in favor of reverting to the more permissive behavior, or something intermediate.
(1) can be done as
(2) seems unnecessary anyway, future extension mechanisms should be allowed as per BIP 77, and the behavior i implemented is too restrictive
(3) can be enforced as just checking that there's no lowercase chars
i will replace the last commit with a much more minimal one that does not attempt to do any bech32 charset validation as sketched in this reply
Sounds great. Looking forward to the more minimal final commit that does not attempt to do any bech32 charset validation.
Also, long term I would prefer something that avoids this mutation-based approach entirely: using a builder pattern to queue up fields and then just joining them, instead of parsing and mutating to set, would be much simpler. I didn't want to redesign here; if you're cACK i'll write this up in an issue
Previously `set_param` did not preserve order, but the way that `set_param` was called ended up setting the RK, OH and EX fragment parameters in reverse lexicographical order. To avoid any privacy leaks from URI construction (revealing the specific software the receiver is using) the spec now requires fragment parameters to be ordered lexicographically, so `set_param` now ensures this.
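One way to picture the ordering guarantee is the sketch below. It is not the PR's `set_param` implementation: `set_param_sorted`, its signature, and the parameter values are hypothetical, chosen only to show that replacing a parameter and then sorting makes the output independent of the order in which parameters were set.

```rust
/// Hypothetical stand-in for `set_param`: drop any existing parameter that
/// starts with `prefix`, add `param`, and re-emit all parameters in
/// lexicographic order joined by `-`.
fn set_param_sorted(fragment: &str, prefix: &str, param: &str) -> String {
    let mut params: Vec<&str> = fragment
        // Accept both the legacy `+` and the new `-` delimiter when parsing.
        .split(|c: char| c == '+' || c == '-')
        .filter(|p| !p.is_empty() && !p.starts_with(prefix))
        .collect();
    params.push(param);
    // Byte-wise ordering of `&str` is lexicographic for this ASCII-only input.
    params.sort_unstable();
    params.join("-")
}
```

Setting RK, then OH, then EX (the reverse lexicographical order mentioned above) still produces a fragment ordered as EX, OH, RK, so the call order no longer leaks into the URI.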
11c5768
to
51c4147
ACK 51c4147
Ambiguity in the fragment parameter delimiter or any invalid characters are no longer allowed. The HRPs EX, OH, and RK are within the uppercase bech32 character set. Only this character set along with the HRP delimiter `1` are now allowed, with either `+` or `-` as a delimiter (but not both).
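The check described in this commit message can be sketched as a single pass over the fragment bytes. This is an illustrative sketch, not the code from the PR: the `FragmentError` type and `fragment_delimiter` function are hypothetical, the return type uses `Option<char>` for the no-delimiter case (an assumption), and the accepted range is the broader `[A-Z0-9]` set discussed in the review rather than the exact bech32 alphabet.

```rust
#[derive(Debug, PartialEq)]
enum FragmentError {
    /// A byte outside `[A-Z0-9+-]` was found.
    InvalidChar(char),
    /// Both `+` and `-` appear, so the delimiter cannot be determined.
    AmbiguousDelimiter,
}

/// Hypothetical helper: validate the fragment's characters and report which
/// delimiter it uses, scanning the fragment only once.
fn fragment_delimiter(fragment: &str) -> Result<Option<char>, FragmentError> {
    let mut has_dash = false;
    let mut has_plus = false;
    for c in fragment.bytes() {
        match c {
            b'-' => has_dash = true,
            b'+' => has_plus = true,
            // Uppercase letters and digits; this also covers the HRP separator `1`.
            b'0'..=b'9' | b'A'..=b'Z' => {}
            // Anything else, including lowercase and non-ASCII bytes, is rejected.
            other => return Err(FragmentError::InvalidChar(other.into())),
        }
    }
    match (has_dash, has_plus) {
        (true, true) => Err(FragmentError::AmbiguousDelimiter),
        (true, false) => Ok(Some('-')),
        (false, true) => Ok(Some('+')),
        (false, false) => Ok(None),
    }
}
```

Keeping the delimiter arms in the same `match` as the charset ranges states the allowed characters in one place, which is the trade-off weighed in the review thread.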
51c4147
to
0eb74b9
if !(b'0'..b'9' + 1).contains(&c)
    && !(b'A'..b'Z' + 1).contains(&c)
    && c != b'-'
    && c != b'+'
{
    return Err(ParseFragmentError::InvalidChar(c.into()));
}
What do you think of this syntax?
- if !(b'0'..b'9' + 1).contains(&c)
-     && !(b'A'..b'Z' + 1).contains(&c)
-     && c != b'-'
-     && c != b'+'
- {
-     return Err(ParseFragmentError::InvalidChar(c.into()));
- }
+ if !matches!(c, b'0'..=b'9' | b'A'..=b'Z' | b'-' | b'+') {
+     return Err(ParseFragmentError::InvalidChar(c.into()));
+ }
if !fragment.is_empty() {
    fragment.push('+');
    match (has_dash, has_plus) {
Ah I realize now the reason you had the has_dash, has_plus variables was to make the scanning operation O(1n) wrt the length of the fragment and not O(3n). If not that, why not?
- match (has_dash, has_plus) {
+ match (fragment.contains('-'), fragment.contains('+')) {
ACK 0eb74b9
What changed last night after my prior ACK? Looks good to me.
This PR contains two changes: using `-` instead of `+` as the fragment parameter delimiter, and lexicographically ordering the fragment parameters.

This implements bitcoin/bips#1890