Skip to content

use new unique jobs implementation #38

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 27, 2025
Merged

use new unique jobs implementation #38

merged 5 commits into from
Feb 27, 2025

Conversation

bgentry
Copy link
Contributor

@bgentry bgentry commented Jan 12, 2025

This moves the library to use the new unique jobs implementation from riverqueue/river#590 and migrates the sqlalchemy driver to use a unified insertion path, allowing bulk inserts to use unique jobs.

It's intended to be analogous to riverqueue/riverqueue-ruby#32.

Outstanding issues:

  • There's one failing test that I think is due to some mocking related issue and I haven't yet been able to figure it out. Maybe you'd be able to take a crack at that?
  • I think it also needs more comprehensive test coverage for the various unique options
  • by_args needs support for partial keys, and needs to use sorted json before hashing. Tests are essential for this.
  • There are a couple of type misalignment issues where I'm wanting to insert null values into the unique_key or unique_states fields, because sqlc generates code that doesn't allow None values in these input lists. I'm not seeing any clear way to resolve that.

@bgentry bgentry requested a review from brandur January 12, 2025 19:57
brandur added a commit that referenced this pull request Jan 25, 2025
This one's related to #38. It turns out that while trying to mock a
context manager kind of works, it will do the wrong thing in edge cases
like when an exception is thrown from inside a `with` block, silently
swallowing it and causing a return that's completely wrong.

There may be some way to fix the mock to make it do the right thing, but
instead of getting fancier with these mocks that are already awful,
instead repair the problem by defining a plain class that implements
context manager and just use that.
brandur added a commit that referenced this pull request Jan 25, 2025
This one's related to #38. It turns out that while trying to mock a
context manager kind of works, it will do the wrong thing in edge cases
like when an exception is thrown from inside a `with` block, silently
swallowing it and causing a return that's completely wrong.

There may be some way to fix the mock to make it do the right thing, but
instead of getting fancier with these mocks that are already awful,
instead repair the problem by defining a plain class that implements
context manager and just use that.
brandur added a commit that referenced this pull request Jan 25, 2025
This one's related to #38. It turns out that while trying to mock a
context manager kind of works, it will do the wrong thing in edge cases
like when an exception is thrown from inside a `with` block, silently
swallowing it and causing a return that's completely wrong.

There may be some way to fix the mock to make it do the right thing, but
instead of getting fancier with these mocks that are already awful,
instead repair the problem by defining a plain class that implements
context manager and just use that.
brandur added a commit that referenced this pull request Jan 25, 2025
This one's related to #38. It turns out that while trying to mock a
context manager kind of works, it will do the wrong thing in edge cases
like when an exception is thrown from inside a `with` block, silently
swallowing it and causing a return that's completely wrong.

There may be some way to fix the mock to make it do the right thing, but
instead of getting fancier with these mocks that are already awful,
instead repair the problem by defining a plain class that implements
context manager and just use that.
@brandur
Copy link
Contributor

brandur commented Jan 25, 2025

There's one failing test that I think is due to some mocking related issue and I haven't yet been able to figure it out. Maybe you'd be able to take a crack at that?

This took an absurdly long time to figure out, but the problem was that the context manager mock to return an executor was swallowing a thrown AssertionError in this with block:

    def insert_many(self, args: List[JobArgs | InsertManyParams]) -> list[InsertResult]:

        with self.driver.executor() as exec:
            return self._insert_many_exec(exec, args)

Thereby causing insert_many to return None and then insert fail as it tried to index element [0].

I put in a fix in #39 which seems to work. Take a look at that, we can merge it, and then it should fix the tests over here.

Medium term I want to strip all these mocks out. They always seem like a good idea, but every time I use them I remember why they're so painful to work with.

I'm going to time out on this PR for the day since I already spent too much time on it, but if you get blocked on one of your other work streams, maybe see if you can start looking into your no. 2 and no. 3 checkboxes.

There are a couple of type misalignment issues where I'm wanting to insert null values into the unique_key or unique_states fields, because sqlc generates code that doesn't allow None values in these input lists. I'm not seeing any clear way to resolve that.

Didn't get a chance to look at this one, but it's possible it'll require an additional sql parameter like unique_states_is_null that gets based into the SQL query.

brandur added a commit that referenced this pull request Jan 25, 2025
This one's related to #38. It turns out that while trying to mock a
context manager kind of works, it will do the wrong thing in edge cases
like when an exception is thrown from inside a `with` block, silently
swallowing it and causing a return that's completely wrong.

There may be some way to fix the mock to make it do the right thing, but
instead of getting fancier with these mocks that are already awful,
instead repair the problem by defining a plain class that implements
context manager and just use that.
@bgentry bgentry force-pushed the bg-newer-unique-jobs branch from c65e79f to 3899985 Compare January 29, 2025 16:11
This moves the library to use the new unique jobs implementation from
riverqueue/river#590 and migrates the sqlalchemy
driver to use a unified insertion path, allowing bulk inserts to use
unique jobs.
@bgentry bgentry force-pushed the bg-newer-unique-jobs branch from 3899985 to 31a2cd1 Compare January 29, 2025 16:14
@bgentry
Copy link
Contributor Author

bgentry commented Jan 29, 2025

@brandur nice, looks like #39 fixed the test issues. I rebased and now everything is passing from make test 🙌

@brandur
Copy link
Contributor

brandur commented Feb 15, 2025

@bgentry What's the next step with respect to this one? (i.e. Do you have what you need to keep going?) Just in light of Eric checking back in today, it probably makes sense to try and make sure there's no known bugs left in this package.

Copy link
Contributor

@brandur brandur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn it, I wrote this days ago and forgot to complete the review.

@@ -82,14 +82,16 @@ class UniqueOpts:
args and queues. If either args or queue is changed on a new job, it's
allowed to be inserted as a new job.

TODO update description ⚠ ⚠️ ⚠
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO here.

}

# Serialize with sorted keys and append to unique key:
sorted_args = json.dumps(args_to_include, sort_keys=True)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that just came to mind: are the JSON encodings between Go, Ruby, and Python identical on this front? Do they omit unnecessary spaces, trailing newlines, etc? Unique conflicts won't be detected properly if not.

Copy link
Contributor

@brandur brandur Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah that might be a problem. It looks like Ruby has the same behavior as Go with minimal spacing between keys:

> JSON.dump({'z': 'last', 'a': 1}.sort.to_h)
=> "{\"a\":1,\"z\":\"last\"}"

But Python puts spacing in by default:

>>> json.dumps({'z': 'last', 'a': 1}, sort_keys=True)
'{"a": 1, "z": "last"}'

However, they've got a really convenient separators keyword to dumps that takes the whitespace out and makes it like Ruby/Go:

>>> json.dumps({'z': 'last', 'a': 1}, sort_keys=True, separators=(',', ':'))
'{"a":1,"z":"last"}'

Would be worth adding that in here for sure!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks for digging in to confirm!

@bgentry bgentry force-pushed the bg-newer-unique-jobs branch from 82f4348 to 92aed86 Compare February 27, 2025 02:32
@bgentry bgentry force-pushed the bg-newer-unique-jobs branch from 92aed86 to 273a641 Compare February 27, 2025 02:35
@bgentry bgentry marked this pull request as ready for review February 27, 2025 02:35
@bgentry bgentry requested a review from brandur February 27, 2025 02:36
Copy link
Contributor

@brandur brandur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo one comment on use of separators from yesterday, :shipit: .

@bgentry bgentry merged commit f27a3bb into master Feb 27, 2025
4 checks passed
@bgentry bgentry deleted the bg-newer-unique-jobs branch February 27, 2025 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants