Skip to content

Fix various counters #542

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Conversation

mdashlw
Copy link
Contributor

@mdashlw mdashlw commented May 1, 2025

Before you begin

  • I understand my contributions may be rejected for any reason
  • I understand my contributions are for the benefit of Derpibooru and/or the Philomena software
  • I understand my contributions are licensed under the GNU AGPLv3
  • I understand all of the above

  • Fixes topics_count on users
  • Destroyed comments no longer count for comments_count on images and comments_posted_count on users
  • Destroyed posts no longer count for post_count on topics/forums and forum_posts_count on users
  • Hidden topics no longer count for topic_count on forums and topics_count on users
  • Also in this PR: updates last_post_id on forums and topics on hide/unhide/move.

Depends on #544

Comment on lines +194 to +195
UserStatistics.inc_stat(comment.user, :comments_posted, -1)
reindex_comment(comment)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inc_stat could be part of the same transaction to make it fully atomic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liamwhite any reason these aren't in transactions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mistake on my behalf. Generally they should be part of the transaction that does the actual stat-incrementing action

set: [
last_post_id:
fragment(
"SELECT max(posts.id) FROM posts JOIN topics ON posts.topic_id = topics.id WHERE topics.forum_id = ? AND topics.hidden_from_users IS FALSE AND posts.hidden_from_users IS FALSE",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's split the long SQL line

Suggested change
"SELECT max(posts.id) FROM posts JOIN topics ON posts.topic_id = topics.id WHERE topics.forum_id = ? AND topics.hidden_from_users IS FALSE AND posts.hidden_from_users IS FALSE",
"""
SELECT max(posts.id)
FROM posts
JOIN topics ON posts.topic_id = topics.id
WHERE topics.forum_id = ?
AND topics.hidden_from_users IS FALSE
AND posts.hidden_from_users IS FALSE
""",

Comment on lines +111 to +116
## Examples

iex> update_forum_last_post_query(1)
#Ecto.Query<...>

"""
Copy link
Contributor

@MareStare MareStare May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for me I don't see any use in such made-up examples. In fact, I almost never look at them in code such as this, because a much better example would be to just search for the actual usage of the function, rather than see it in a doc comment that passes a dummy forum ID. @liamwhite do you think this is really necessary?

|> update(set: [last_post_id: nil])

post = Post.hide_changeset(post, attrs, user)
post = post |> Repo.preload(:topic)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how much of the code runs outside of the transaction. Ideally, the entire operation starting from the root should be part of the transaction. We'll need to solve this problem in the long run, I'm not sure how much we can do in this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean. It's only preloads and reindexing which shouldn't be in any transaction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the preload should be part of the transaction. Imagine something modifies the topic between this preload and the transaction below - then you have stale data

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a wider problem. For example here:

post = conn.assigns.post
user = conn.assigns.current_user
case Posts.hide_post(post, post_params, user) do

The conn.assigns already contains some data fetched from the DB outside of the transaction and by the time this logic runs any data may have already changed. Instead, there should be a transaction started right at the beginning of the request lifecycle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what this would achieve. Stuff can change mid-transaction just as easily

Copy link
Contributor

@MareStare MareStare May 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stuff can not change mid transaction because transactions (with select for update*) pessimistically lock records that they read/mutate

|> Multi.update_all(:forums, forums, [])
|> Multi.update(:post, Post.hide_changeset(post, attrs, user))
|> Multi.update_all(:reports, Reports.close_report_query({"Post", post.id}, user), [])
|> Multi.update_all(:topic, Topics.update_topic_last_post_query(post.topic_id), [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking. Do we really need to have these last_*_id denormalized fields? Is it really that expensive for postgres to go and fetch the latest record from the the table during reads?

Requires @liamwhite's opinion as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be expensive due to tuple filtering on hidden_from_users. But if most posts in a topic are not hidden_from_users, then a simple left lateral join would suffice

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, but we could also add an index for hidden_from_users field, right? That will simplify our life removing one place where we need to maintain the denormalization

@Meow
Copy link
Member

Meow commented May 2, 2025

I merged #544, revise the code, as it seems incorrect now. Also rebase to latest code.

@Meow Meow added this to the 1.2 milestone May 9, 2025
@MareStare MareStare added the waiting on author Further progress is blocked by an action from the author of the PR/issue label May 10, 2025
@mdashlw mdashlw marked this pull request as draft May 11, 2025 07:50
@Meow Meow removed this from the 1.2 milestone May 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting on author Further progress is blocked by an action from the author of the PR/issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants