Fix dollarQuotedString() to handle inputs that contain the randomly chosen quote tag #16
base: master
Add a test for dollarQuotedString() to showcase how it fails to handle the randomly chosen dollar quote tag appearing in the string to be quoted. The test works by repeatedly trying to quote $a$ until the string itself is the randomly chosen dollar quote tag.
Fixes a bug in dollarQuotedString() where it fails to account for the case where the string being quoted contains the randomly chosen dollar quote tag. Before the fix, if the string contains a tag such as $a$, there is a one in 21 (number of values in randomTags) chance of the failure occurring. With the fix, the function checks whether the chosen dollar quote tag is contained within the string to be quoted and, if so, chooses a new, longer random tag. The tag starts as the empty string, so the default choice for quoting becomes $$. Random additions to the tag were chosen over incrementing a counter to make it slightly more difficult for an attacker to supply an input string that would require a large number of iterations. As the new default is to use $$ for quoting, a number of the unit test expectations have been simplified to match the exact text that would be returned.
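The approach described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `dollarQuoteSketch`, the helper `closesEarly`, the `TAG_CHARS` alphabet and the injectable `random` parameter are all hypothetical. The check for input that ends in a partial delimiter (for example a trailing `$`) is an extra guard added in this sketch and is not something the commit message mentions.

```javascript
// Hypothetical sketch of "start tagless, grow the tag randomly until it is safe".
// TAG_CHARS is an assumed alphabet, chosen only for this illustration.
const TAG_CHARS = 'bcdfghjklmnpqrstvwxz';

// True when `delimiter` would be matched before the intended closing position,
// either because `text` contains it or because `text` ends with part of it.
function closesEarly(text, delimiter) {
  return (text + delimiter).indexOf(delimiter) !== text.length;
}

function dollarQuoteSketch(input, random = Math.random) {
  const text = String(input);
  let tag = ''; // empty tag first, so the default delimiter is $$
  while (closesEarly(text, '$' + tag + '$')) {
    // Append one random character instead of incrementing a counter.
    tag += TAG_CHARS[Math.floor(random() * TAG_CHARS.length)];
  }
  const delimiter = '$' + tag + '$';
  return delimiter + text + delimiter;
}

console.log(dollarQuoteSketch('some-user-input'));
console.log(dollarQuoteSketch('foo$$bar'));
```

The loop always terminates because the delimiter gets one character longer on each pass and can no longer collide with the input once it is longer than the input itself.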
Change the set of characters used to generate random tags for dollar quotes to include both lower and upper case non-vowel characters and the digits two (2) through nine (9). The expanded set decreases the likelihood of a randomly picked tag existing in the input being quoted. The exclusions ensure that generated tags do not randomly end up including naughty words.
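As a rough illustration of such a character set, the sketch below builds one and draws characters from it. The names `TAG_ALPHABET` and `randomTagChar` are hypothetical. It assumes "non-vowel" also excludes `y`, which gives the 48 choices per character mentioned later in this thread. It also restricts the first character to letters, because PostgreSQL dollar quote tags follow unquoted identifier rules and so cannot start with a digit; whether the PR handles that in the same way is not stated here.

```javascript
// Assumption: vowels are a, e, i, o, u and y, leaving 20 consonants per case.
const LOWER = 'bcdfghjklmnpqrstvwxz';
const LETTERS = LOWER + LOWER.toUpperCase();
const DIGITS = '23456789'; // digits two through nine
const TAG_ALPHABET = LETTERS + DIGITS;

// Pick one tag character. The first character of a tag must be a letter,
// so digits are only offered for later positions.
function randomTagChar(isFirst, random = Math.random) {
  const pool = isFirst ? LETTERS : TAG_ALPHABET;
  return pool[Math.floor(random() * pool.length)];
}

console.log(TAG_ALPHABET.length);
console.log(randomTagChar(true) + randomTagChar(false) + randomTagChar(false));
```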
Can you please update this to more clearly show the problem and explain the bug?
If you try to dollar quote a literal string that contains the randomly chosen dollar quote tag, the result is not quoted correctly.

Here's an example. Say we have a web service that uses this lib like so:

// Get some user input
var value = req.query.userInput;
// Escape it with this lib
var escapedValue = escape.escapeDollarQuoted(value);
// Build a query with the escaped value
var sql = 'SELECT ' + escapedValue + ' AS x';
// Execute the query
var results = await db.query(sql);
// Send back results to our client
res.send(results);

Now assume that the random tag that gets picked is $a$:

var value = 'some-user-input';
var escapedValue = '$a$' + value + '$a$';
// SELECT $a$some-user-input$a$ as x
var sql = 'SELECT ' + escapedValue + ' AS x';
// ... execute sql and return result to user

...and for the common case it's fine. Now let's say the user input includes $a$:

var value = 'foo$a$bar';
var escapedValue = '$a$' + value + '$a$';
// SELECT $a$foo$a$bar$a$ as x
var sql = 'SELECT ' + escapedValue + ' AS x';
// ... execute sql and return result to user

...the quotes won't align and the SQL will most likely have a syntax error. Now let's say the user knows the structure of what you're executing and includes the following:

var value = '$a$ as x UNION ALL SELECT row_to_json(t.*)::text FROM customer_stuff t UNION ALL SELECT $a$';
var escapedValue = '$a$' + value + '$a$';
var sql = 'SELECT ' + escapedValue + ' AS x';
// ... execute sql and return result to user

Now the SQL becomes:

SELECT $a$$a$ as x UNION ALL SELECT row_to_json(t.*)::text FROM customer_stuff t UNION ALL SELECT $a$$a$ AS x

...and all your customer secrets just got exported. As it's only a 1/20 chance to match the dollar quoting tag, the malicious user would only need to make a handful of requests until it triggers the issue.
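The arithmetic behind the last point can be sketched as below, assuming each request picks its tag independently with the 1/20 chance quoted above. The function names are hypothetical and exist only for this illustration.

```javascript
// Probability that at least one of `attempts` independent requests
// happens to use the tag embedded in the malicious input.
function hitProbability(perRequestChance, attempts) {
  return 1 - Math.pow(1 - perRequestChance, attempts);
}

// Smallest number of requests needed to reach `targetProbability`.
function attemptsNeeded(perRequestChance, targetProbability) {
  return Math.ceil(
    Math.log(1 - targetProbability) / Math.log(1 - perRequestChance)
  );
}

console.log(hitProbability(1 / 20, 14).toFixed(2));
console.log(attemptsNeeded(1 / 20, 0.5));
```

Under that assumption, 14 requests are enough for better than even odds of hitting the chosen tag at least once.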
Increasing the pool size of the random strings is definitely a good idea. We could also add another layer of randomness, so that when the library is initialized it randomizes the pool of tags as well. IMO empty tags should not be allowed at all, as that completely defeats the purpose of having them in the first place.
Randomizing the set of characters prior to randomly pulling a character from it does not make things any more random or secure. In theory you could replace the
No it doesn't. The point of the function is to coerce the input to a string and return the dollar quoted version of it. The shortest, and thus best, choice for dollar quotes is the one without any tag whatsoever. For reference: https://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING

The choice of a non-empty tag value only comes into play when the original string contains $$. Choosing it via random addition of characters is a fine idea, as opposed to sequentially picking it, as it's less likely a malicious user could craft a string that would require a large number of iterations until a non-matching dollar quote tag is picked. If it was purely deterministic, say try
@sehrope Regarding the empty tag, I don't think dollar quoted strings were meant to defeat a brute force SQL injection attack. I suspect that the purpose of dollar quoted strings is to prevent accidental breakage of the SQL rather than an intentional one, but that's just my assumption. I hope this better explains my motivation behind using random tags this way.
@sehrope I also kinda like this idea:
Though it will beef up the SQL size by a lot. Do you really think we'll need to check for the existence of 192 bits of random text inside the value?
@kessler said:
Sure but that's just wrong. A function that can "accidentally" break with a chance of 1/20 isn't very useful. For it to be usable it needs to work in every situation.
No and that's not how my proposed path to fix this works. I'd suggest you read the code changes in the PR. That comment was in reference to picking a long enough, and cryptographically secure, random value for the tag that it would be unlikely to occur in any input text. With enough bits (and 192 is more than enough) you don't have to worry about the string randomly appearing in the input, as it's statistically impossible. Unfortunately, to do that you'd need a very long string, as with 48 choices per character that only buys you Something like that would be more suitable to a client dynamically building an escaped SQL without being able to analyze the input first. Say streaming it out. In that situation you can't guess a valid short dollar quote tag as you don't know the totality of the input yet, so you'd have to pick something obscenely large to make sure it can't be found in the input. Like say Anyway, none of that is needed though. The proposal in the PR leads to the shortest possible quoting (i.e. just
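For a sense of scale on the tag length point above, here is a small sketch of the arithmetic, assuming every tag character is drawn uniformly from a 48-symbol alphabet. The function name `tagLengthForBits` is hypothetical.

```javascript
// Number of characters needed so that a uniformly random tag carries
// at least `bits` bits of entropy with `alphabetSize` choices per character.
function tagLengthForBits(bits, alphabetSize) {
  return Math.ceil(bits / Math.log2(alphabetSize));
}

const tagLength = tagLengthForBits(192, 48);
// Each delimiter is the tag plus two dollar signs, and it appears twice.
const overhead = 2 * (tagLength + 2);

console.log(tagLength, overhead);
```

Each character contributes log2(48), or roughly 5.58 bits, so 192 bits needs a 35-character tag and adds 74 characters of quoting overhead per value, compared with 4 characters for a plain `$$` pair.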
Don't agree on that.
I did not say that it's your proposed fix, just that I like the idea. Really, I was just trying to create a friendly dialog here, but you are clearly set on showing you're the smartest person in the room... so, good luck with that.
No, I'm just explaining why the existing code is wrong so that it gets fixed.
@sehrope You've just been patronizing since the beginning of this thread... and that's just sad. When people engage in a dialog they do things like ask questions and listen to what other people have to say. I have quite a few arguments about why I think you're wrong in some of your assumptions, but clearly you are not interested in hearing anything but yourself. I wouldn't be surprised if you're also the kind of guy that must have the last word in every argument.
…ows the same rules as an unquoted identifier, except that it cannot contain a dollar sign."
@sehrope to proceed with this discussion. As I stated earlier I do see your point regarding the empty tag. I feel very strongly that you should be open to, at least, hear other people's opinions. It is possible that a discussion will yield an even better solution.
I found a small issue with the PR. The "should handle dollar quotes in the string being escaped without resorting to luck" test has, I think, a problem. Because it has no "$$" within it (only "$a$"), I think it is getting escaped from: Putting an additional $$ in the middle of the test string will correctly test the intended behavior. I made this change and it worked correctly.

That aside, this change looks good to me, and I think it is valuable in security terms. I hope it is accepted.
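One way a test could check this kind of input without depending on which tag was picked is to parse the quoted output back and compare it with the original. The helper below is a hypothetical sketch for that purpose, not part of the library or the PR, and its scanning is only an approximation of how PostgreSQL reads a dollar quoted string.

```javascript
// Returns the body of a dollar quoted string, or null when the first
// closing delimiter is not at the very end (i.e. the quoting is broken).
function unquoteDollarQuoted(quoted) {
  const match = /^\$([A-Za-z_][A-Za-z0-9_]*)?\$/.exec(quoted);
  if (!match) return null;
  const delimiter = match[0];
  const body = quoted.slice(delimiter.length);
  const end = body.indexOf(delimiter);
  if (end === -1 || end + delimiter.length !== body.length) return null;
  return body.slice(0, end);
}

// The broken output from the earlier example is rejected:
console.log(unquoteDollarQuoted('$a$foo$a$bar$a$'));
// A string containing both $$ and $a$ survives when the tag differs:
console.log(unquoteDollarQuoted('$b$x $$ y $a$ z$b$'));
```

A test along the lines suggested above could quote a string containing both `$$` and `$a$` and assert that this helper returns the original string.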
Three commits in this PR:
- Adds a test that shows dollarQuotedString(...) failing to handle strings like $a$. As the dollar quote tags are randomly chosen, it's possible nobody ever noticed this. In practice it's easy to trigger, and any web service that uses this to quote user input could be trivially bombarded with enough attempts to hit the bug (it's a 1/20 chance each time).
- Fixes dollarQuotedString(...) and cleans up some tests. The new code also has the positive side effect of using a plain $$ as the first choice for dollar quoting (i.e. tagless). In the common case where that is acceptable (i.e. $$ does not appear in the input string), it shrinks the output by two characters.