[v1.33] 1-bit RQ #144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v1-33/main
Are you sure you want to change the base?
Conversation
Orca Security Scan Summary — checks: Infrastructure as Code, SAST, Secrets, Vulnerabilities (View in Orca).

Great to see you again! Thanks for the contribution.
@@ -114,26 +116,53 @@ When SQ is enabled, Weaviate boosts recall by over-fetching compressed results.

:::caution Technical preview

- Rotational quantization (RQ) was added in **`v1.32`** as a **technical preview**.<br/><br/>
+ **8-bit Rotational quantization (RQ)** was added in **`v1.32`** as a **technical preview**.<br/>
Should we use the "preview" nomenclature as Alvin proposed in the QAgent discussion?
Also, is RQ still in preview?
Will change all "technical preview" into "preview" for this release and use it going forward.
8 bit is GA
This means that the feature is still under development and may change in future releases, including potential breaking changes.
**We do not recommend using this feature in production environments at this time.**

:::

- **Rotational quantization (RQ)** is an untrained 8-bit quantization technique that provides 4x compression while maintaining 98-99% recall on most datasets. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ works in two steps:
+ **Rotational quantization (RQ)** is an untrained quantization technique that provides significant compression while maintaining high recall on most datasets. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ is available in two variants: **8-bit RQ** and **1-bit RQ**.
"Untrained" sounds a bit unusual to me, and a bit negative maybe. Wdyt about "non-parametric" or just leaving it out?
### 8-bit RQ

8-bit RQ provides 4x compression while maintaining 98-99% recall on most datasets. RQ works in two steps:
"on most datasets" -> "in internal testing"?
Good point, let's not overpromise
1. **Fast pseudorandom rotation**: The input vector is transformed using a fast rotation based on the Walsh-Hadamard Transform. This rotation takes approximately 7-10 microseconds for a 1536-dimensional vector. The output dimension is rounded up to the nearest multiple of 64.

2. **Scalar quantization**: Each entry of the rotated vector is quantized to an 8-bit integer. The minimum and maximum values of each individual rotated vector define the quantization interval.
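The two steps above can be sketched in NumPy. This is an illustrative toy only, not Weaviate's implementation: the real rotation is a randomized Walsh-Hadamard Transform, approximated here by a plain orthonormal WHT, and all names are invented for the example.

```python
import numpy as np

def hadamard_rotate(v):
    # Plain orthonormal Walsh-Hadamard Transform; stands in for RQ's fast
    # pseudorandom rotation. Assumes len(v) is a power of two (RQ itself
    # pads the dimension up to a multiple of 64).
    h = np.asarray(v, dtype=np.float64).copy()
    n = len(h)
    step = 1
    while step < n:
        for i in range(0, n, 2 * step):
            a = h[i : i + step].copy()
            b = h[i + step : i + 2 * step].copy()
            h[i : i + step] = a + b
            h[i + step : i + 2 * step] = a - b
        step *= 2
    return h / np.sqrt(n)  # scale so the rotation preserves vector norms

def rq8_encode(v):
    # Step 2: per-vector 8-bit scalar quantization of the rotated vector.
    r = hadamard_rotate(v)
    lo, hi = r.min(), r.max()  # this vector's own quantization interval
    codes = np.round((r - lo) / (hi - lo) * 255).astype(np.uint8)
    return codes, lo, hi

def rq8_decode(codes, lo, hi):
    return lo + codes.astype(np.float64) * (hi - lo) / 255.0

rng = np.random.default_rng(0)
v = rng.normal(size=64)
codes, lo, hi = rq8_encode(v)      # 64 bytes instead of 256 as fp32 (4x)
approx = rq8_decode(codes, lo, hi)
rel_err = np.linalg.norm(hadamard_rotate(v) - approx) / np.linalg.norm(v)
```

Because the rotation is orthonormal, distances can be estimated directly in the rotated space, and the per-vector interval keeps the quantization error small.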
### 1-bit RQ

1-bit RQ is an untrained asymmetric quantization method that provides close to 32x compression as dimensionality increases. This method is inspired by 1-bit RaBitQ and works as follows:
Do we need to mention RaBitQ in our docs at all? I don't see much upside. Maybe we could broadly mention just once somewhere that they share some similarities
2. **Asymmetric quantization**:
   - **Data vectors**: Quantized using 1 bit per dimension by storing only the sign of each entry
   - **Query vectors**: Scalar quantized using 5 bits per dimension during search
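As a toy illustration of how sign-only data codes can still be scored against a finely quantized query, the sketch below estimates an inner product asymmetrically. This is an assumption-laden example, not Weaviate's exact estimator: the real method operates on rotated vectors and the scoring details are simplified here.

```python
import numpy as np

def encode_data_1bit(x):
    # Data vector: keep only the sign of each entry -> 1 bit per dimension.
    return np.where(x >= 0, 1.0, -1.0)

def dequantized_query_5bit(q):
    # Query vector: scalar-quantize to 2**5 = 32 levels at search time,
    # then score with the dequantized values.
    lo, hi = q.min(), q.max()
    codes = np.round((q - lo) / (hi - lo) * 31)
    return lo + codes * (hi - lo) / 31

rng = np.random.default_rng(1)
x = rng.normal(size=64)  # stored vector (imagine it already rotated)
q = rng.normal(size=64)  # incoming query

# Asymmetric estimate: coarse 1-bit codes on one side, fine 5-bit values
# on the other. `ref` uses the exact query so the remaining difference is
# only the query's quantization error.
est = np.dot(encode_data_1bit(x), dequantized_query_5bit(q))
ref = np.dot(encode_data_1bit(x), q)
```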
Oh wow this is interesting. How do we compare an array of signs to these 5-bit arrays?
Will clarify with the team and add later
The rotation step provides multiple benefits. It tends to reduce the quantization interval and decrease quantization error by distributing values more uniformly. It also distributes the distance information more evenly across all dimensions, providing a better starting point for distance estimation.

- It's worth noting that RQ rounds up dimensions to multiples of 64 which means that low-dimensional data (< 64 or 128 dimensions) might result in less than optimal compression.
+ It's worth noting that both RQ variants round up the number of dimensions to a multiple of 64, which means that low-dimensional data (< 64 or 128 dimensions) might result in less than optimal compression.
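The effect of the padding on the compression ratio is simple arithmetic; the sketch below ignores the small per-vector metadata (such as the interval bounds), so the ratios are slightly optimistic.

```python
import math

def rq_compression(dims, bits):
    # Ratio of 32-bit float storage to RQ storage, where the dimension
    # count is first rounded up to the next multiple of 64.
    padded = math.ceil(dims / 64) * 64
    return (32 * dims) / (bits * padded)

print(rq_compression(1536, 8))  # 4.0  -> 8-bit RQ, 1536 needs no padding
print(rq_compression(1536, 1))  # 32.0 -> 1-bit RQ approaches 32x
print(rq_compression(32, 8))    # 2.0  -> padding 32 -> 64 dims halves the ratio
```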
I think we should clarify where we are talking about each dimension and cases like here where we talk about the "number of dimensions".
While inspired by extended RaBitQ, this implementation differs significantly for performance reasons. It uses fast pseudorandom rotations instead of truly random rotations.
I am inclined to suggest leaving out these interleaved discussion of RQ vs RaBitQ.
RaBitQ isn't an available option to the user so I'm not sure how useful these comparisons are.
Most users will be interested in choosing between available algos in Weaviate.
If we want to keep these comments, we could maybe have one subsection where we acknowledge the inspiration and make comparisons there.
Wdyt
True, not much value in this info. I just left one reference here mentioning the inspiration from RaBitQ and linking to the original paper
While inspired by extended RaBitQ, this implementation differs significantly for performance reasons. It uses fast pseudorandom rotations instead of truly random rotations and it employs scalar quantization instead of RaBitQ's encoding algorithm, which becomes prohibitively slow with more bits per entry.
From the user perspective, 1-bit RQ is not a separate quantization method, but rather a configuration setting for RQ.
I'm not sure about this sentence.
- This (from the user perspective) implies that the rest of the docs are not for user consumption.
- I thought it was pretty self evident that 1-bit RQ is a config setting
Will remove this, it's actually a leftover from when I wanted to include a code snippet
@@ -15,13 +15,18 @@ import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/

:::caution Technical preview

- Rotational quantization (RQ) was added in **`v1.32`** as a **technical preview**.<br/><br/>
+ **8-bit Rotational quantization (RQ)** was added in **`v1.32`** as a **technical preview**.<br/>
Same comment re preview vs technical preview
This means that the feature is still under development and may change in future releases, including potential breaking changes.
**We do not recommend using this feature in production environments at this time.**

:::

- [**Rotational quantization (RQ)**](../../concepts/vector-quantization.md#rotational-quantization) is a fast untrained vector compression technique that offers 4x compression while retaining almost perfect recall (98-99% on most datasets).
+ [**Rotational quantization (RQ)**](../../concepts/vector-quantization.md#rotational-quantization) is a fast untrained vector compression technique. Two RQ variants are available in Weaviate:
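A minimal sketch of how enabling each variant might look with the Python client v4. This assumes `Configure.VectorIndex.Quantizer.rq` accepts a `bits` parameter and that the collection names are placeholders; check the client reference for your version, as parameter names may differ.

```python
import weaviate
from weaviate.classes.config import Configure

client = weaviate.connect_to_local()

# 8-bit RQ (~4x compression); bits=8 is assumed to be the default
client.collections.create(
    "ArticlesRQ8",
    vector_index_config=Configure.VectorIndex.hnsw(
        quantizer=Configure.VectorIndex.Quantizer.rq(bits=8),
    ),
)

# 1-bit RQ (close to 32x compression at high dimensionality)
client.collections.create(
    "ArticlesRQ1",
    vector_index_config=Configure.VectorIndex.hnsw(
        quantizer=Configure.VectorIndex.Quantizer.rq(bits=1),
    ),
)

client.close()
```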
Same comment re untrained
**What's being changed:**

Docs for 1-bit rotational quantization (RQ).

**Type of change:**

**How has this been tested?**

`yarn start`