Database decoupling #145

Merged: 5 commits, May 6, 2025

Conversation

michael-kenzel
Contributor

@michael-kenzel michael-kenzel commented Apr 27, 2025

Another step towards untethering Wheatley from TCCPP by refactoring the hardcoded database structures into something more modular:

  • The bot singleton is replaced with a component_state collection where each component can keep its own state document with a separate id.
  • Instead of one fixed database proxy, any number of proxies can be created to provide component-specific views into the database, each using a partial schema.
    • This means that there's no more checking to enforce one coherent global schema. But it also means that bot modules that live outside of the main repo can add their own bits to the database without the bot core needing to be changed specifically to accommodate them.
  • All database handling should now be self-initializing,
    • i.e., work correctly on top of a completely empty database.
    • A key change here is that counters like the case_number now represent the current latest id instead of the id for the next entry. This was done so we can rely on upserts and the fallback behavior of database operators to automatically create entries that do not exist.
  • Schemata are moved into module-specific files.
  • Some components are moved into more fitting places,
    • e.g., modmail.ts into moderation/.
  • A migrate_db() function run during startup automatically migrates an old-format db with the bot singleton to the new component_state format.
    • The old wheatley collection is left in place to avoid data loss.
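
The idea of component-specific views with partial schemas can be sketched roughly as follows. This is a minimal in-memory illustration under stated assumptions, not the implementation from this PR: `InMemoryDatabase`, `Collection`, `create_view`, `find_one`, and the two state types are hypothetical names, and the real code sits on top of an actual database rather than a `Map`.

```typescript
// Hypothetical in-memory stand-in; not Wheatley's database API.
type Doc = { id: string };
type Schema = Record<string, Doc>;
type View<S extends Schema> = { [K in keyof S]: Collection<S[K]> };

class Collection<T extends Doc> {
    private readonly docs: Doc[];

    constructor(docs: Doc[]) {
        this.docs = docs;
    }

    insert(doc: T): void {
        this.docs.push(doc);
    }

    // Looks a document up by its id; returns null if there is none.
    find_one(id: string): T | null {
        const match = this.docs.find(doc => doc.id === id);
        return match ? (match as T) : null;
    }
}

class InMemoryDatabase {
    private readonly collections = new Map<string, Doc[]>();

    // Returns a view typed by the caller's partial schema. Collections are
    // created lazily on first access, so an empty database needs no setup.
    // Nothing checks that different views agree with each other.
    create_view<S extends Schema>(): View<S> {
        const handler: ProxyHandler<object> = {
            get: (_target, name) => {
                if (typeof name !== "string") {
                    return undefined;
                }
                let docs = this.collections.get(name);
                if (!docs) {
                    docs = [];
                    this.collections.set(name, docs);
                }
                return new Collection(docs);
            },
        };
        return new Proxy({}, handler) as unknown as View<S>;
    }
}

// Each component declares only the part of the schema it cares about.
type StarboardState = { id: string; delete_emojis: string[] };
type ModmailState = { id: string; modmail_id: number };

const sketch_db = new InMemoryDatabase();
const starboard_view = sketch_db.create_view<{ component_state: StarboardState }>();
const modmail_view = sketch_db.create_view<{ component_state: ModmailState }>();

// Both views share one collection; each component keeps its own document id.
starboard_view.component_state.insert({ id: "starboard", delete_emojis: ["wastebasket"] });
modmail_view.component_state.insert({ id: "modmail", modmail_id: 3 });
```

The trade-off mentioned above shows up directly: the type parameter is the only schema there is, so a module outside the main repo can add its own view without touching the core, but nothing stops two views from describing the same document differently.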

@michael-kenzel michael-kenzel force-pushed the database-decoupling branch 2 times, most recently from d770aa2 to a59cd45 Compare April 28, 2025 02:12
@michael-kenzel michael-kenzel marked this pull request as ready for review April 28, 2025 02:25
@michael-kenzel michael-kenzel force-pushed the database-decoupling branch 5 times, most recently from 260abd7 to f64561f Compare April 28, 2025 18:59
Member

@jeremy-rifkin jeremy-rifkin left a comment

Thanks so much for taking the time to do this, this looks fantastic! I've read through most of it and left some quick comments below.

@michael-kenzel michael-kenzel force-pushed the database-decoupling branch from f64561f to cc8f805 Compare May 3, 2025 00:58
Member

@jeremy-rifkin jeremy-rifkin left a comment

Thanks so much again for doing this, this looks fantastic. Just a few comments

Comment on lines +143 to +147
const state = await this.database.component_state.findOne({ id: "starboard" });
this.delete_emojis = state?.delete_emojis ?? [];
this.ignored_emojis = state?.ignored_emojis ?? [];
this.negative_emojis = state?.negative_emojis ?? [];
this.repost_emojis = state?.repost_emojis ?? [];
Member

If the component state doesn't exist, should we error or insert it?

Contributor Author

@michael-kenzel michael-kenzel May 4, 2025

We could, I went this route because I thought it'd be slightly simpler and easier to maintain as there's only one place where state is written to the db. I would very much prefer to keep it some way that works out-of-the-box with an empty or only partially initialized db though, so erroring is imo not a great option.

import { Wheatley } from "../wheatley.js";
import { EarlyReplyMode, TextBasedCommandBuilder } from "../command-abstractions/text-based-command-builder.js";
import { TextBasedCommand } from "../command-abstractions/text-based-command.js";
import { unwrap } from "../../../utils/misc.js";
Member

Should starboard stuff be in the tccpp modules? Might GP want that some day?

Contributor Author

I would say we can deal with making it a proper core component when we get there. In its current form, it is tied to the TCCPP channel structure, so I thought it would be easier to just keep it in the tccpp module for now.

modmail_id: 1,
},
},
{ upsert: true, returnDocument: "after" },
Member

I think we might want "before" here for the current modmail id logic, but that might behave weirdly if the upsert is something you'd like to rely on.

Contributor Author

@michael-kenzel michael-kenzel May 4, 2025

Yeah, that's actually what I initially did, but that then introduces the need to handle the case of the call returning null, so this turned out to be simpler. Either way would work though, it's just an additional check.
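
The difference between the two options can be sketched with a small in-memory emulation. It mimics two MongoDB behaviors: an upserting `$inc` treats a missing document or field as 0, and `returnDocument: "before"` yields null when the upsert had to create the document. `increment_counter`, the `Map` store, and the `"modmail"` id are hypothetical and only stand in for the real driver call.

```typescript
type CounterDoc = { id: string; modmail_id: number };

// Hypothetical emulation of an upserting increment; not the MongoDB driver API.
function increment_counter(
    store: Map<string, CounterDoc>,
    id: string,
    return_document: "before" | "after",
): CounterDoc | null {
    const before = store.get(id) ?? null;
    // Like $inc with upsert: a missing document starts counting from 0.
    const after: CounterDoc = { id, modmail_id: (before?.modmail_id ?? 0) + 1 };
    store.set(id, after);
    return return_document === "before" ? before : after;
}

// "after": works on an empty store, and the counter holds the latest id in use.
const after_store = new Map<string, CounterDoc>();
const first_after = increment_counter(after_store, "modmail", "after");
const second_after = increment_counter(after_store, "modmail", "after");

// "before": the very first call returns null and needs an extra check.
const before_store = new Map<string, CounterDoc>();
const first_before = increment_counter(before_store, "modmail", "before");
const second_before = increment_counter(before_store, "modmail", "before");
```

With "after", the returned value is always a document, which fits the change to counters holding the current latest id. With "before", the caller gets the previous value but has to treat null as "no entry yet".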


type PurgableChannel = Exclude<Discord.TextBasedChannel, Discord.DMChannel | Discord.PartialDMChannel>;
type PurgableMessages = Discord.Collection<string, Discord.Message> | string[];
type PurgeWork = [PurgableChannel, Iterable<PurgableMessages> | AsyncGenerator<PurgableMessages>];

export type message_database_entry = {
Member

I am concerned with this falling out of line with the more complete definition, unless that's what you were asking about asserting the other day

Contributor Author

@michael-kenzel michael-kenzel May 4, 2025

Yeah, that's precisely why I was asking about that thing the other day. I assume you saw what I did in the other place to take care of this?

src/wheatley.ts Outdated
@@ -240,7 +236,7 @@ export class Wheatley {

private command_handler: CommandHandler;

- database: WheatleyDatabaseProxy;
+ database: WheatleyDatabase;
Member

This should be WheatleyDatabase | null since it's conditionally set later

Contributor Author

@michael-kenzel michael-kenzel May 4, 2025

Yeah, I also thought that. The problem is that this then makes it necessary to handle the case of the database being null in literally every place that does anything with the database. I assume that's why it wasn't WheatleyDatabaseProxy | null so far either? I guess we just unwrap(this.wheatley.database), since we currently don't really have a good way for components to fail to load in case the bot config isn't agreeable to them? Ideally, we would probably have an in-memory database mockup or something that's used as a fallback?
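
For illustration, the unwrap approach could look roughly like this. This is a sketch under assumptions: the `unwrap` shown here is a guess at what such a helper does (throw on null or undefined, otherwise return the value), and the actual helper in utils/misc.js may differ. `FakeDatabase` and `FakeBot` are hypothetical stand-ins.

```typescript
// Hypothetical unwrap-style helper; the real one in utils/misc.js may differ.
function unwrap<T>(value: T | null | undefined): T {
    if (value === null || value === undefined) {
        throw new Error("unwrap: value is null or undefined");
    }
    return value;
}

type FakeDatabase = { name: string };

class FakeBot {
    // Stays null unless the configuration provides a database.
    database: FakeDatabase | null = null;
}

const configured_bot = new FakeBot();
configured_bot.database = { name: "wheatley" };

// A component that needs the database fails at this one point if it is
// missing, instead of checking for null at every use.
const component_database = unwrap(configured_bot.database);
```

The cost is that a missing database turns into an exception at the call site, which is why a fallback such as an in-memory mockup is raised as the nicer long-term option.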

src/wheatley.ts Outdated

const bot_singleton = await proxy.wheatley.findOne({ id: "main" });
if (bot_singleton) {
M.log("migrating database…");
Member

very minor:

Suggested change
- M.log("migrating database…");
+ M.log("migrating database...");

Contributor Author

@michael-kenzel michael-kenzel May 4, 2025

Can change, but I thought we're already using UTF-8 in other places so might as well 😆

@michael-kenzel michael-kenzel force-pushed the database-decoupling branch from cc8f805 to 792b020 Compare May 4, 2025 16:19
@michael-kenzel michael-kenzel force-pushed the database-decoupling branch from 792b020 to 274901a Compare May 5, 2025 01:24
Member

@jeremy-rifkin jeremy-rifkin left a comment

LGTM, thanks!

@jeremy-rifkin jeremy-rifkin merged commit 65a568a into TCCPP:main May 6, 2025
6 checks passed