Include Similar Documents in Title Generation Prompt for Improved Naming Consistency #660

lukaskoebe · 2025-09-27T22:42:42Z

Motivation

Since I was not entirely happy with the consistency of the generated titles of similar documents (like monthly invoices etc.), I wanted to give the llms some context on how similar documents were named.

Implementation

Therefore I added a feature where paperless-gpt uses the similarity search api of paperless-ngx in order to obtain the 5 closest documents that are currently not marked for processing by paperless-gpt (in order to not get unprocessed titles). Then it extracts the titles from those documents and provides them as a reference.

In my experience this improves the quality and consistency of titles significantly.

I had this running for a few days now already, maybe you'll find it interesting and check it out if you like the proposal

Summary by CodeRabbit

New Features
- Smarter title suggestions: the app now references up to five similar documents to align new titles with existing naming patterns.
- Enhanced prompt content: when available, similar document titles are shown to guide consistent, informative naming.
- Resilient behavior: title generation continues even if similar documents can’t be fetched.
Tests
- Added comprehensive tests covering similar-document retrieval, exclusion rules, error paths, and empty-result scenarios.

…onsistency

When disabling the tag generation using AUTO_GENERATE_TAGS=false there was the possibility for tags to become null, in case newTagIDs wasn't initialized yet.

coderabbitai · 2025-09-27T22:42:49Z

Walkthrough

Introduces retrieval of similar documents via a new client method and integrates their titles into the LLM title-suggestion template. Updates getSuggestedTitle signature to accept documentID, fetches up to 5 similar documents, and injects their titles into prompt data. Adjusts tests, template, interface, and client implementation accordingly.

Changes

Cohort / File(s)	Summary
Title suggestion flow `app_llm.go`, `app_llm_test.go`, `default_prompts/title_prompt.tmpl`	`getSuggestedTitle` now accepts `documentID`; fetches similar documents and passes their titles into template data; logs retrieval and counts; tests updated for new signature and prompt assertions; template adds conditional `SimilarDocumentTitles` list.
Paperless client: similar-docs `paperless.go`, `paperless_test.go`, `types.go`	Adds `PaperlessClient.GetSimilarDocuments(ctx, documentID, maxResults)` with tag-exclusion logic, API calls, mapping, and self-skip; initializes `newTagIDs` as `[]int{}` in `UpdateDocuments`; tests cover success, empty, API errors, and exclusion behavior; interface extended with `GetSimilarDocuments`.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User as Caller
  participant App as App
  participant Client as PaperlessClient
  participant LLM as LLM

  User->>App: generateDocumentSuggestions(ctx, documentID, content, originalTitle)
  App->>App: getSuggestedTitle(ctx, documentID, content, originalTitle)
  App->>Client: GetSimilarDocuments(documentID, max=5)
  alt similar docs retrieved
    Client-->>App: []Document (titles)
    App->>App: Build template data incl. SimilarDocumentTitles
  else fetch fails
    Client--x App: error
    App->>App: Proceed with empty SimilarDocumentTitles
  end
  App->>LLM: Prompt with content, originalTitle, SimilarDocumentTitles?
  LLM-->>App: suggestedTitle
  App-->>User: suggestedTitle

  rect rgb(235, 245, 255)
    note over Client,App: New/changed flow: similar-doc retrieval and prompt enrichment
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Make the existing document title available in prompt template. #158 — Prior change passing existing/suggested title into getSuggestedTitle; this PR extends by adding documentID and similar-doc context.
Enhance title suggestions with original title #466 — Adjusts title prompt/data; closely related to this PR’s template changes adding SimilarDocumentTitles.
Keep existing tags on documents #119 — Modifies paperless.go tag-handling; related to this PR’s updates in the same area alongside adding GetSimilarDocuments.

Poem

I sniff through stacks with twitchy nose,
Five neighbors whisper title prose.
I hop their names into the prompt,
Then thump a theme—concise and comp.
If friends aren’t found, I’m undeterred—
I still suggest, with ears uncured.
Bun-derful docs, well-titled, heard! 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title succinctly captures the key change by stating the inclusion of similar documents in the title generation prompt to improve naming consistency, and it uses clear, imperative phrasing without unnecessary details or vague terms.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

paperless.go (1)

541-550: Avoid unintended tag clearing; do case-insensitive mapping and skip update when no IDs resolve

Case-sensitive lookup can fail to resolve intended tags (e.g., “Invoice” vs “invoice”), resulting in sending an empty slice and clearing all tags. Map tags case-insensitively and only apply an empty slice when an explicit clear is intended (finalTagNames is empty).

 		if !hasSameTags(originalDoc.Tags, finalTagNames) {
 			originalFields["tags"] = originalDoc.Tags
-			var newTagIDs []int = []int{}
+			// Case-insensitive lookup to avoid misses due to casing
+			ci := make(map[string]int, len(availableTags))
+			for name, id := range availableTags {
+				ci[strings.ToLower(name)] = id
+			}
+			var newTagIDs []int = []int{}
 			for _, tagName := range finalTagNames {
-				if tagID, exists := availableTags[tagName]; exists {
+				if tagID, exists := ci[strings.ToLower(tagName)]; exists {
 					newTagIDs = append(newTagIDs, tagID)
 				}
 			}
-			updatedFields["tags"] = newTagIDs
+			// If user provided non-empty names but none resolved, skip to avoid accidental clearing
+			if len(finalTagNames) > 0 && len(newTagIDs) == 0 {
+				log.Warnf("No tag IDs resolved for document %d; skipping tags update to avoid clearing.", documentID)
+			} else {
+				updatedFields["tags"] = newTagIDs // empty slice [] clears intentionally when finalTagNames is empty
+			}
 		}

🧹 Nitpick comments (9)

default_prompts/title_prompt.tmpl (1)

6-6: Optional copy tweak to guide format matching

Emphasize matching date/number formats from exemplars.

-{{end}}Please try to be consistent with the naming pattern of these similar documents if they provide informative titles.{{end}}
+{{end}}Please match the naming pattern (including any date/number formats) used in these similar documents when they provide informative titles.{{end}}

types.go (1)

164-165: Interface extension looks good; confirm implementers are updated

New method reads well. Please ensure all external mocks/adapters implement it to avoid compile breaks. Consider renaming count → maxResults for clarity in call sites.

paperless_test.go (2)

613-681: Stabilize env/global state in tests

Use t.Setenv for env isolation and also restore global vars manualTag/autoTag to prevent cross-test leakage.

-	originalManualTag := os.Getenv("MANUAL_TAG")
-	originalAutoTag := os.Getenv("AUTO_TAG")
-	defer func() {
-		os.Setenv("MANUAL_TAG", originalManualTag)
-		os.Setenv("AUTO_TAG", originalAutoTag)
-	}()
+	originalManualTag := os.Getenv("MANUAL_TAG")
+	originalAutoTag := os.Getenv("AUTO_TAG")
+	origManualTagVar, origAutoTagVar := manualTag, autoTag
+	defer func() {
+		_ = os.Setenv("MANUAL_TAG", originalManualTag)
+		_ = os.Setenv("AUTO_TAG", originalAutoTag)
+		manualTag, autoTag = origManualTagVar, origAutoTagVar
+	}()
 	
-	// Set the tag values and reinitialize the global variables
-	os.Setenv("MANUAL_TAG", "paperless-gpt")
-	os.Setenv("AUTO_TAG", "paperless-gpt-auto")
+	// Set the tag values and reinitialize the global variables
+	t.Setenv("MANUAL_TAG", "paperless-gpt")
+	t.Setenv("AUTO_TAG", "paperless-gpt-auto")
 	manualTag = "paperless-gpt"
 	autoTag = "paperless-gpt-auto"

682-745: Apply the same env/global isolation here

Mirror the approach from the previous test to avoid leaking state.

-	originalManualTag := os.Getenv("MANUAL_TAG")
-	originalAutoTag := os.Getenv("AUTO_TAG")
-	defer func() {
-		os.Setenv("MANUAL_TAG", originalManualTag)
-		os.Setenv("AUTO_TAG", originalAutoTag)
-	}()
+	originalManualTag := os.Getenv("MANUAL_TAG")
+	originalAutoTag := os.Getenv("AUTO_TAG")
+	origManualTagVar, origAutoTagVar := manualTag, autoTag
+	defer func() {
+		_ = os.Setenv("MANUAL_TAG", originalManualTag)
+		_ = os.Setenv("AUTO_TAG", originalAutoTag)
+		manualTag, autoTag = origManualTagVar, origAutoTagVar
+	}()
 	
-	// Set the tag values and reinitialize the global variables
-	os.Setenv("MANUAL_TAG", "paperless-gpt")
-	os.Setenv("AUTO_TAG", "paperless-gpt-auto")
+	// Set the tag values and reinitialize the global variables
+	t.Setenv("MANUAL_TAG", "paperless-gpt")
+	t.Setenv("AUTO_TAG", "paperless-gpt-auto")
 	manualTag = "paperless-gpt"
 	autoTag = "paperless-gpt-auto"

app_llm_test.go (2)

365-370: Add compile-time interface conformance check

Helps catch future ClientInterface changes early.
 type mockPaperlessClient struct {
 	CustomFields     []CustomField
 	SimilarDocuments []Document
 	Error            error
 }
+
+// Ensure mock implements ClientInterface at compile time
+var _ ClientInterface = (*mockPaperlessClient)(nil)
499-551: Great: validates prompt inclusion of exemplars and final title

Consider also adding one similar doc whose title equals originalTitle to assert it is excluded from the exemplar list.

paperless.go (3)

399-405: Clamp and validate maxResults to avoid 0/negative and overly large requests

Prevents page_size=0 and huge payloads on accidental inputs.

 func (client *PaperlessClient) GetSimilarDocuments(ctx context.Context, documentID int, maxResults int) ([]Document, error) {
+	// Defensive clamp
+	if maxResults <= 0 {
+		maxResults = 5
+	}
+	if maxResults > 50 {
+		maxResults = 50
+	}

414-418: Build the query with url.Values to avoid brittle string concatenation

Safer and handles encoding automatically.

-	path := fmt.Sprintf("api/documents/?ordering=-score&truncate_content=true&more_like_id=%d&page_size=%d", documentID, maxResults)
-	if len(excludeTagIDs) > 0 {
-		path += "&tags__id__none=" + strings.Join(excludeTagIDs, ",")
-	}
+	q := url.Values{}
+	q.Set("ordering", "-score")
+	q.Set("truncate_content", "true")
+	q.Set("more_like_id", strconv.Itoa(documentID))
+	q.Set("page_size", strconv.Itoa(maxResults))
+	if len(excludeTagIDs) > 0 {
+		q.Set("tags__id__none", strings.Join(excludeTagIDs, ","))
+	}
+	path := "api/documents/?" + q.Encode()

452-492: Replace nested scans with O(1) reverse lookups for tags and correspondents

Current loops are O(R * (T + C)). Build id→name maps once and map in O(1).

 	allCorrespondents, err := client.GetAllCorrespondents(ctx)
 	if err != nil {
 		return nil, err
 	}
 
+	// Reverse lookups for O(1) ID→name mapping
+	tagIDToName := make(map[int]string, len(allTags))
+	for name, id := range allTags {
+		tagIDToName[id] = name
+	}
+	corrIDToName := make(map[int]string, len(allCorrespondents))
+	for name, id := range allCorrespondents {
+		corrIDToName[id] = name
+	}
+
 	documents := make([]Document, 0, len(documentsResponse.Results))
 	for _, result := range documentsResponse.Results {
 		// Skip the document itself if it appears in the results
 		if result.ID == documentID {
 			continue
 		}
 
-		tagNames := make([]string, len(result.Tags))
-		for i, resultTagID := range result.Tags {
-			for tagName, tagID := range allTags {
-				if resultTagID == tagID {
-					tagNames[i] = tagName
-					break
-				}
-			}
-		}
+		tagNames := make([]string, 0, len(result.Tags))
+		for _, tagID := range result.Tags {
+			if name, ok := tagIDToName[tagID]; ok {
+				tagNames = append(tagNames, name)
+			}
+		}
 
 		correspondentName := ""
 		if result.Correspondent != 0 {
-			for name, id := range allCorrespondents {
-				if result.Correspondent == id {
-					correspondentName = name
-					break
-				}
-			}
+			if name, ok := corrIDToName[result.Correspondent]; ok {
+				correspondentName = name
+			}
 		}
 
 		documents = append(documents, Document{
 			ID:            result.ID,
 			Title:         result.Title,

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e6a9dd1 and 2228cf5.

📒 Files selected for processing (6)

app_llm.go (2 hunks)
app_llm_test.go (7 hunks)
default_prompts/title_prompt.tmpl (1 hunks)
paperless.go (2 hunks)
paperless_test.go (2 hunks)
types.go (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (5)

types.go (1)

web-app/src/DocumentProcessor.tsx (1)

Document (9-15)

paperless_test.go (1)

types.go (2)

GetDocumentApiResponseResult (21-45)

GetDocumentsApiResponse (11-17)

paperless.go (1)

types.go (3)

Document (77-87)

GetDocumentsApiResponse (11-17)

Correspondent (122-138)

app_llm.go (1)

main.go (1)

App (109-127)

app_llm_test.go (3)

main.go (1)

App (109-127)

paperless.go (1)

CustomField (42-46)

types.go (1)

Document (77-87)

🔇 Additional comments (14)

default_prompts/title_prompt.tmpl (1)

4-6: Nice addition: contextual exemplars for title consistency

Conditional block and list rendering look correct and concise.

paperless_test.go (4)

481-539: Good coverage of happy-path retrieval and query verification

Assertions validate ordering, truncation, more_like_id, and page_size; result mapping checks are clear.

540-570: LGTM: no-results path exercised cleanly

Verifies empty slice without errors.

571-595: LGTM: server error path validated

Error assertions and message prefix check are appropriate.

596-612: LGTM: tag API error path validated

Covers prerequisite failure before documents call.

app_llm.go (2)

192-196: Template data wiring looks correct

Includes Language, Content, Title, and SimilarDocumentTitles; works with token budgeting.

468-469: Call-site update LGTM

Passing documentID through maintains the new signature without changing behavior otherwise.

app_llm_test.go (6)

105-111: Wiring mock client into App is correct

Prevents nil Client in title generation tests.

162-163: Updated call-site compiles against new signature

Document ID plumbed through as intended.

272-279: Same here: proper App initialization for title tests

Client provided; avoids panics in getSuggestedTitle.

414-419: Mock GetSimilarDocuments implementation is fine

Simple passthrough suits the tests.

553-590: LGTM: prompt omission when no similar docs

Covers the empty-slice branch well.

592-629: LGTM: resilient to lookup failure

Ensures feature degrades gracefully and still returns a title.

paperless.go (1)

543-543: LGTM: initialize to empty slice, not nil, for predictable PATCH semantics

Ensures the API receives [] rather than null when clearing tags.

Please confirm the Paperless API treats "tags": [] as “clear all tags” and rejects "tags": null. If null is tolerated, behavior should still be consistent across endpoints.

coderabbitai · 2025-09-27T22:48:28Z

app_llm.go

+func (app *App) getSuggestedTitle(ctx context.Context, documentID int, content string, originalTitle string, logger *logrus.Entry) (string, error) {
 	likelyLanguage := getLikelyLanguage()

+	// Fetch similar documents to help with title consistency
+	var similarDocumentTitles []string
+	similarDocs, err := app.Client.GetSimilarDocuments(ctx, documentID, 5) // Get up to 5 similar documents
+	if err != nil {
+		// Log the error but don't fail the title generation
+		logger.Debugf("Failed to fetch similar documents for title consistency: %v", err)
+	} else {
+		// Extract titles from similar documents
+		for _, doc := range similarDocs {
+			if doc.Title != "" && doc.Title != originalTitle {
+				similarDocumentTitles = append(similarDocumentTitles, doc.Title)
+			}
+		}
+		logger.Debugf("Found %d similar documents for title consistency", len(similarDocumentTitles))
+	}
+


⚠️ Potential issue | 🔴 Critical

Nil-pointer risk when Client is unset; also dedup similar titles

If App.Client is nil, this will panic. Guard it and deduplicate titles to reduce prompt noise.

- // Fetch similar documents to help with title consistency - var similarDocumentTitles []string - similarDocs, err := app.Client.GetSimilarDocuments(ctx, documentID, 5) // Get up to 5 similar documents - if err != nil { - // Log the error but don't fail the title generation - logger.Debugf("Failed to fetch similar documents for title consistency: %v", err) - } else { - // Extract titles from similar documents - for _, doc := range similarDocs { - if doc.Title != "" && doc.Title != originalTitle { - similarDocumentTitles = append(similarDocumentTitles, doc.Title) - } - } - logger.Debugf("Found %d similar documents for title consistency", len(similarDocumentTitles)) - } + // Fetch similar documents to help with title consistency + var similarDocumentTitles []string + if app.Client != nil { + similarDocs, err := app.Client.GetSimilarDocuments(ctx, documentID, 5) // Get up to 5 similar documents + if err != nil { + // Log the error but don't fail the title generation + logger.Debugf("Failed to fetch similar documents for title consistency: %v", err) + } else { + seen := make(map[string]struct{}, len(similarDocs)) + for _, doc := range similarDocs { + title := strings.TrimSpace(doc.Title) + if title == "" || title == originalTitle { + continue + } + if _, ok := seen[title]; ok { + continue + } + seen[title] = struct{}{} + similarDocumentTitles = append(similarDocumentTitles, title) + } + logger.Debugf("Found %d similar documents for title consistency", len(similarDocumentTitles)) + } + } else { + logger.Debug("Paperless client is nil; skipping similar document lookup for title consistency") + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func (app *App) getSuggestedTitle(ctx context.Context, documentID int, content string, originalTitle string, logger *logrus.Entry) (string, error) {

likelyLanguage := getLikelyLanguage()

// Fetch similar documents to help with title consistency

var similarDocumentTitles []string

similarDocs, err := app.Client.GetSimilarDocuments(ctx, documentID, 5) // Get up to 5 similar documents

if err != nil {

// Log the error but don't fail the title generation

logger.Debugf("Failed to fetch similar documents for title consistency: %v", err)

} else {

// Extract titles from similar documents

for _, doc := range similarDocs {

if doc.Title != "" && doc.Title != originalTitle {

similarDocumentTitles = append(similarDocumentTitles, doc.Title)

}

}

logger.Debugf("Found %d similar documents for title consistency", len(similarDocumentTitles))

}

func (app *App) getSuggestedTitle(

ctx context.Context,

documentID int,

content string,

originalTitle string,

logger *logrus.Entry,

) (string, error) {

likelyLanguage := getLikelyLanguage()

// Fetch similar documents to help with title consistency

var similarDocumentTitles []string

if app.Client != nil {

similarDocs, err := app.Client.GetSimilarDocuments(ctx, documentID, 5) // Get up to 5 similar documents

if err != nil {

// Log the error but don't fail the title generation

logger.Debugf("Failed to fetch similar documents for title consistency: %v", err)

} else {

seen := make(map[string]struct{}, len(similarDocs))

for _, doc := range similarDocs {

title := strings.TrimSpace(doc.Title)

if title == "" || title == originalTitle {

continue

}

if _, ok := seen[title]; ok {

continue

}

seen[title] = struct{}{}

similarDocumentTitles = append(similarDocumentTitles, title)

}

logger.Debugf("Found %d similar documents for title consistency", len(similarDocumentTitles))

}

} else {

logger.Debug("Paperless client is nil; skipping similar document lookup for title consistency")

}

// ... remainder of getSuggestedTitle ...

}

🤖 Prompt for AI Agents

In app_llm.go around lines 168 to 186, the code risks a nil-pointer panic if App.Client is unset and may append duplicate similar titles; first guard the call by checking if app.Client != nil before calling GetSimilarDocuments and log a debug/warn when the client is nil and skip fetching, and second deduplicate the collected similarDocumentTitles (e.g., use a map to ensure each title is only added once) before using or logging them to reduce prompt noise.

coderabbitai · 2025-09-27T22:48:28Z

paperless.go

+	var excludeTagIDs []string
+	for tagName, tagID := range allTags {
+		if tagName == manualTag || tagName == autoTag {
+			excludeTagIDs = append(excludeTagIDs, fmt.Sprintf("%d", tagID))
+		}
+	}
+


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Confirm filter semantics for tags__id__none and ordering with more_like_id

Please verify that:

tags__id__none excludes any document having any of the provided tag IDs.

ordering=-score is effective when using more_like_id (or unnecessary because similarity already sorts by score).

truncate_content is supported on this endpoint.

🌐 Web query:

Check Paperless‑ngx documents API for: 1) Whether the query parameter "tags__id__none" is supported and its semantics. 2) Whether "more_like_id" results are sortable or already sorted by a "score" and whether "ordering=-score" is valid. 3) Whether "truncate_content" is a valid query parameter on /api/documents/.

💡 Result:

Short answers with sources:

tags__id__none — Not documented / not a supported query parameter. The official API docs list available filters (and examples) but do not mention a tags__id__none operator; issues show tag-filtering is done via the usual field lookups and that tag-exclusion has been a separate bug, implying no special "none" operator. (docs.paperless-ngx.com)

more_like_id / score / ordering — full‑text and "more like" results are returned with a search_hit containing score, rank and highlights, and the docs state full‑text results are always sorted by search score (i.e. best match first). That implies you cannot rely on (or override) sorting by an API ordering like ordering=-score for full‑text/more_like queries. (docs.paperless-ngx.com)

truncate_content on /api/documents/ — Not documented. The official REST API reference for /api/documents/ does not list a truncate_content query parameter, so it should not be considered a supported parameter. (docs.paperless-ngx.com)

If you want, I can: a) show exact example queries to get documents with/without tags using supported filters, or b) try a live request against a specific paperless‑ngx instance to confirm behavior.

Citations:

1: https://docs.paperless-ngx.com/api/?utm_source=openai

2: https://docs.paperless-ngx.com/api/?utm_source=openai

3: https://docs.paperless-ngx.com/api/?utm_source=openai

Remove unsupported API parameters from document queries

The tags__id__none filter isn’t supported by the /api/documents/ endpoint—drop it or implement exclusion client-side.

Full-text queries (including more_like_id) are intrinsically sorted by score; ordering=-score is ignored and should be removed.

truncate_content isn’t a valid query parameter—remove it.

🤖 Prompt for AI Agents

In paperless.go around lines 407 to 413, the code builds query parameters using unsupported API filters: remove any use of the tags__id__none parameter, drop ordering=-score when full-text/more_like_id is used, and remove truncate_content from the request; instead pass only supported parameters to /api/documents/ and implement exclusion client-side by collecting manual/auto tag IDs (as you already do) and filtering the returned documents to exclude any document whose tag IDs contain those IDs; ensure you no longer send tags__id__none, ordering=-score, or truncate_content in the HTTP query.

Lukas Senicourt added 2 commits September 28, 2025 00:23

Add titles of similar documents into title prompt for better naming c…

2ce49f4

…onsistency

fix: issue with tags being null in update document json

2228cf5

When disabling the tag generation using AUTO_GENERATE_TAGS=false there was the possibility for tags to become null, in case newTagIDs wasn't initialized yet.

coderabbitai bot reviewed Sep 27, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Include Similar Documents in Title Generation Prompt for Improved Naming Consistency #660

Include Similar Documents in Title Generation Prompt for Improved Naming Consistency #660

Uh oh!

lukaskoebe commented Sep 27, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 27, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Sep 27, 2025

Uh oh!

coderabbitai bot Sep 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Include Similar Documents in Title Generation Prompt for Improved Naming Consistency #660

Are you sure you want to change the base?

Include Similar Documents in Title Generation Prompt for Improved Naming Consistency #660

Uh oh!

Conversation

lukaskoebe commented Sep 27, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Implementation

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 27, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 27, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lukaskoebe commented Sep 27, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 27, 2025 •

edited

Loading