
Conversation

SeUkKim (Collaborator) commented Oct 1, 2025

Summary by CodeRabbit

  • Refactor

    • Added a configurable worker pool for parallel message processing and graceful shutdown so in-flight work completes cleanly.
    • Reduced read polling timeout and treated timeouts as non-fatal for more responsive processing.
    • Enhanced per-worker logging and error tagging for clearer traceability.
  • Chores

    • Increased DB batch size to 200 and shortened query timeout.
    • Added standardized DB client initialization and a connectivity health check.
    • Exposed max-workers via environment variable and command flag.
    • Updated ignored paths and removed linter configuration.


coderabbitai bot commented Oct 1, 2025

Walkthrough

Adds Config.MaxWorkers and a concurrent worker pool to the indexer with 1s Kafka polling and per-worker Sentry/logging; adds CLI flag/env wiring for max-workers. Updates DB: BatchSize → 200, QueryTimeout → 30s, and adds GORM NewClient and Ping helpers.

Changes

Cohort / File(s) / Summary

  • Indexer concurrency & polling (event-indexer/indexer/indexer.go)
    Adds Config.MaxWorkers; replaces the single-threaded ReadMessage loop with a buffered workChan (size MaxWorkers*2) and spawns MaxWorkers messageWorker goroutines; reduces the ReadMessage timeout to 1s and treats timeouts as non-fatal; workers create per-message contexts, set the Sentry worker_id, handle DLQ/commit, and support graceful shutdown.
  • CLI: max-workers flag & wiring (event-indexer/cmd/indexer/cmd.go)
    Adds FlagMaxWorkers = "max-workers", reads the MAX_WORKERS env var as a fallback, exposes the --max-workers flag, and wires maxWorkers into the indexer.Config passed to New.
  • DB client utilities & sizing (pkg/db/db.go)
    Changes BatchSize from 100 → 200 and QueryTimeout from 5m → 30s; adds NewClient(databaseURL string) (*gorm.DB, error) to open and configure a GORM Postgres client (PrepareStmt, DefaultTransactionTimeout, pool settings), plus a Ping(ctx, dbClient) helper that runs SELECT 1.
  • Misc: imports & gitignore (event-indexer/indexer/block_result.go, .gitignore)
    Reorders imports in block_result.go (no behavior change). Adds event-indexer to .gitignore.
  • Tooling config removed (.golangci.yml)
    Removes the golangci-lint configuration block (linters, settings, excludes, run.tests).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant K as Kafka Consumer
  participant I as Indexer
  participant W as Worker[1..N]
  participant DLQ as DLQ
  participant S as Sentry

  rect rgb(245,250,255)
    note over I: Startup — init buffered workChan\nspawn N workers
    I->>W: workers ready
  end

  loop Poll every 1s
    K->>I: ReadMessage(timeout=1s)
    alt Message received
      I-->>W: enqueue message on workChan
    else Timeout / none
      I-->>I: continue polling (non-fatal)
    end
  end

  par For each dequeued message
    W->>W: create per-message context\nset Sentry scope (worker_id)
    W->>W: process message
    alt Success
      W-->>K: commit offset
    else Failure
      W-->>DLQ: publish to DLQ
      W-->>K: commit offset (after DLQ)
      W-->>S: capture error
    end
  end

  rect rgb(255,245,240)
    note over I,W: Shutdown
    I-->>I: close workChan
    W-->>W: drain remaining messages and exit
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I thump my paws—more workers in a line,
Messages hop in, each handled just fine.
DB batches grow, timeouts trim the drag,
Sentry marks my whiskers, each eager flag.
I tunnel, I index—carrots for the bag. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 75.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title "feat: add worker pool and optimize message processing" accurately reflects the primary changes: the new worker-pool infrastructure in event-indexer/indexer/indexer.go (the Config.MaxWorkers field, buffered work channel, worker goroutines, and messageWorker function) and the message-processing optimizations (reduced timeouts, non-fatal timeout handling, and concurrent dispatch instead of synchronous polling). It is concise and specific, and does not overreach into secondary changes such as the database configuration updates or the linter-config removal.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 1fd8baf and 02a639f.

📒 Files selected for processing (1)
  • .golangci.yml (0 hunks)
💤 Files with no reviewable changes (1)
  • .golangci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot added a commit that referenced this pull request Oct 1, 2025
Docstrings generation was requested by @SeUkKim.

* #172 (comment)

The following files were modified:

* `pkg/db/db.go`

coderabbitai bot commented Oct 1, 2025

Note

Generated docstrings for this pull request at #173

Docstrings generation was requested by @SeUkKim.

* #172 (comment)

The following files were modified:

* `pkg/db/db.go`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/db/db.go (1)

29-33: Review the 5-minute transaction timeout and consider connection pool settings.

DefaultTransactionTimeout is set to QueryTimeout (5 minutes), which is unusually long and may lead to prolonged locks and connection exhaustion under load, especially with the new worker pool. Additionally, connection pool settings (max open/idle connections, connection lifetime) are not configured.

Consider:

  1. Reducing DefaultTransactionTimeout to a more typical value (e.g., 30-60 seconds) unless long-running transactions are intentional.
  2. Adding connection pool configuration:
 func NewClient(databaseURL string) (*gorm.DB, error) {
-	return gorm.Open(postgres.Open(databaseURL), &gorm.Config{DefaultTransactionTimeout: QueryTimeout, PrepareStmt: true})
+	db, err := gorm.Open(postgres.Open(databaseURL), &gorm.Config{
+		DefaultTransactionTimeout: 30 * time.Second, // or keep QueryTimeout if needed
+		PrepareStmt:              true,
+	})
+	if err != nil {
+		return nil, err
+	}
+	
+	sqlDB, err := db.DB()
+	if err != nil {
+		return nil, err
+	}
+	
+	// Configure connection pool for worker pool usage
+	sqlDB.SetMaxOpenConns(25)        // adjust based on worker count
+	sqlDB.SetMaxIdleConns(5)
+	sqlDB.SetConnMaxLifetime(time.Hour)
+	sqlDB.SetConnMaxIdleTime(10 * time.Minute)
+	
+	return db, nil
 }
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 27cdd44 and 267a6b3.

📒 Files selected for processing (1)
  • pkg/db/db.go (2 hunks)
🔇 Additional comments (2)
pkg/db/db.go (2)

35-39: LGTM!

The Ping function correctly verifies database connectivity using a standard SELECT 1 query with proper context handling.


18-18: BatchSize increase is safe: even for tables with up to ~20 columns, 200×20 = 4,000 parameters per batch, well under PostgreSQL's 32,767 limit.


coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/db/db.go (1)

45-48: Make DB connection pool settings configurable and align with worker count
The workers flag (--max-workers) defaults to 10 but can be increased; ensure MaxOpenConns (currently 25) is configurable and set ≥ MaxWorkers to prevent connection exhaustion.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 267a6b3 and 628ca90.

📒 Files selected for processing (1)
  • pkg/db/db.go (1 hunks)
🔇 Additional comments (4)
pkg/db/db.go (4)

18-18: LGTM: Batch size increase aligns with concurrent processing.

Doubling the batch size reduces database round-trips and complements the worker pool mentioned in the PR objectives.


32-35: LGTM: Prepared statement caching enabled.

Enabling PrepareStmt improves performance for repeated queries, which is beneficial given the worker pool processing pattern.


53-57: LGTM: Standard health check implementation.

The Ping function correctly implements a database health check using SELECT 1 with context-based timeout control.


21-21: Confirm 30s timeout covers all batch operations.

There are 28 CreateInBatches calls in pkg/db/db.go; under peak load with BatchSize=200 and concurrent workers, ensure these complete within the 30s QueryTimeout to avoid unintended failures.


coderabbitai bot left a comment

Actionable comments posted: 4

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 628ca90 and a6b2acc.

📒 Files selected for processing (2)
  • event-indexer/cmd/indexer/cmd.go (5 hunks)
  • event-indexer/indexer/indexer.go (5 hunks)
🔇 Additional comments (8)
event-indexer/indexer/indexer.go (5)

361-361: Reduced ReadMessage timeout improves shutdown responsiveness.

The timeout reduction from 10 seconds to 1 second allows faster response to shutdown signals while maintaining efficient polling.


372-377: Good pattern for non-blocking dispatch with cancellation.

The nested select properly handles both message dispatch and shutdown cancellation, preventing blocked sends when the channel is full.


396-396: Verify context.Background() usage.

Using context.Background() means in-flight message processing won't be interrupted when stopCtx is cancelled. This is likely intentional to avoid partial processing, but it means workers may continue processing for up to 10 minutes (based on max.poll.interval.ms at lines 121, 134) after shutdown is requested.

If this behavior is unintended, consider using a derived context with timeout:

ctx, cancel := context.WithTimeout(stopCtx, 5*time.Minute)
defer cancel()

348-421: Message ordering may be lost with concurrent processing.

The worker pool processes messages concurrently, which means messages from the same partition can be processed out of order. If block height ordering is critical for your indexing logic, this could cause issues.

Verify whether out-of-order processing is acceptable for your use case. If ordering is required, consider:

  • Processing messages from each partition in a dedicated worker
  • Using a sequential queue per partition

348-421: Align DB connection pool with worker count. The database is configured with SetMaxOpenConns(25) and SetMaxIdleConns(10) in pkg/db/db.go; ensure f.config.MaxWorkers does not exceed these limits or make the pool settings configurable to prevent connection exhaustion.

event-indexer/cmd/indexer/cmd.go (3)

30-30: Constant follows established naming pattern.

The FlagMaxWorkers constant is consistent with other flag definitions in the file.


80-80: MaxWorkers properly wired to indexer config.

The value flows correctly from CLI flag to indexer configuration.


134-134: Flag definition follows established pattern.

The flag registration is consistent with other flags in the command.


coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
event-indexer/cmd/indexer/cmd.go (1)

112-115: LGTM! Validation issue resolved.

The code now properly validates that maxWorkers is positive, addressing the concern from the previous review. The fallback to 10 on parse errors or non-positive values is appropriate.

event-indexer/indexer/indexer.go (1)

354-362: LGTM! Worker pool implementation with proper synchronization.

The worker pool correctly uses a WaitGroup to track worker goroutines and ensures graceful shutdown by:

  • Closing the work channel when stopCtx is cancelled
  • Waiting for all workers to complete via wg.Wait()
  • Handling the shutdown signal both in the main loop and when enqueuing messages

The 1-second Kafka polling timeout is appropriate for responsive shutdown.

Also applies to: 367-369, 382-391

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between a6b2acc and e4bd25c.

📒 Files selected for processing (2)
  • event-indexer/cmd/indexer/cmd.go (5 hunks)
  • event-indexer/indexer/indexer.go (7 hunks)
🔇 Additional comments (2)
event-indexer/cmd/indexer/cmd.go (1)

30-30: LGTM! MaxWorkers flag integration is correct.

The flag constant, flag binding, config wiring, and environment-to-flag flow are all properly implemented.

Also applies to: 54-54, 80-80, 134-134

event-indexer/indexer/indexer.go (1)

420-425: LGTM! Worker-specific observability enhancements.

The addition of worker_id to Sentry scope and worker identifiers in DLQ/commit error logs provides excellent visibility for debugging and monitoring concurrent worker behavior.

Also applies to: 433-433, 440-440

SeUkKim requested a review from songwongtp, October 10, 2025 09:19
