Skip to content

Conversation

brendan-kellam
Copy link

This PR adds the repo_id option to zoekt-git-index, allowing us to plumb a repository's database ID (Sourcebot ID) into it's corresponding shard(s) at index time. When an ID for a repository is present, it will surface as the RepositoryID property for each file match (see here). This is useful for correlating additional metadata from the db.

@brendan-kellam brendan-kellam requested a review from msukkari May 12, 2025 23:32
@msukkari
Copy link

/review

deltaShardNumberFallbackThreshold := flag.Uint64("delta_threshold", 0, "upper limit on the number of preexisting shards that can exist before attempting a delta build (0 to disable fallback behavior)")
languageMap := flag.String("language_map", "", "a mapping between a language and its ctags processor (a:0,b:3).")
tenantID := flag.Int("tenant_id", 0, "tenant ID to use for indexed repositories")
repoID := flag.Uint("repo_id", 0, "opaque ID to use for indexed repositories. Surfaces as `RepositoryID` in the REST search response.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new flag repoID is added but this flag's value is never used or passed along to indexing options. You should integrate repoID into the relevant indexing option struct, for example, setting it on opts or gitOpts.RepositoryDescription to ensure it takes effect during indexing.

opts := cmd.OptionsFromFlags()
opts.IsDelta = *isDelta
opts.RepositoryDescription.TenantID = *tenantID
opts.RepositoryDescription.ID = uint32(*repoID)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new assignment opts.RepositoryDescription.ID = uint32(*repoID) references *repoID which is not declared or initialized anywhere in the provided context. Please ensure that repoID is declared as a flag or variable and properly parsed before usage.

Comment on lines +254 to +258
idString := sec.Options.Get("repoid")
if idString != "" {
id, _ := strconv.ParseUint(idString, 10, 32)
desc.ID = uint32(id)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error from strconv.ParseUint is ignored; this could cause silent failures if 'repoid' is malformed. Consider returning or logging the error to handle invalid input gracefully.

Comment on lines +254 to +258
idString := sec.Options.Get("repoid")
if idString != "" {
id, _ := strconv.ParseUint(idString, 10, 32)
desc.ID = uint32(id)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this change needed?

Copy link
Author

@brendan-kellam brendan-kellam May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sec.Options is a collection of all zoekt prefixed git configs that are set in repo compile utils. E.g.,:

image

Before this change, if zoekt.repoid is not set, id will resolve to 0 and override anything that is already set in desc.ID. We could use zoekt.repoid to set the id instead of plumbing it like we are here, but this was easier since we don't know the repo's id when we construct the zoekt.* payload (in gitConfig)

@brendan-kellam brendan-kellam merged commit 12a2f4a into main May 13, 2025
7 of 9 checks passed
@brendan-kellam brendan-kellam deleted the bkellam/plumb_id branch May 13, 2025 04:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants