
Conversation

g-despot
Contributor

What's being changed:

Docs for 1-bit rotational quantization (RQ).

Type of change:

  • Documentation content updates (non-breaking change to fix/update documentation)

How has this been tested?

  • Local build - the site works as expected when running `yarn start`

@g-despot g-despot changed the base branch from main to v1-33/main August 22, 2025 05:42

@orca-security-eu orca-security-eu bot left a comment


Orca Security Scan Summary

| Check | Status | High | Medium | Low | Info |
| --- | --- | --- | --- | --- | --- |
| Infrastructure as Code | Passed | 0 | 0 | 0 | 0 |
| SAST | Passed | 0 | 0 | 0 | 0 |
| Secrets | Passed | 0 | 0 | 0 | 0 |
| Vulnerabilities | Passed | 0 | 0 | 0 | 0 |

@weaviate-git-bot

Great to see you again! Thanks for the contribution.

beep boop - the Weaviate bot 👋🤖

PS:
Are you already a member of the Weaviate Slack channel?


@orca-security-eu orca-security-eu bot left a comment


Orca Security Scan Summary

| Check | Status | High | Medium | Low | Info |
| --- | --- | --- | --- | --- | --- |
| Secrets | Passed | 0 | 0 | 0 | 0 |

@g-despot g-despot requested a review from databyjp August 27, 2025 12:45
@@ -114,26 +116,53 @@ When SQ is enabled, Weaviate boosts recall by over-fetching compressed results.

:::caution Technical preview

- Rotational quantization (RQ) was added in **`v1.32`** as a **technical preview**.<br/><br/>
+ **8-bit Rotational quantization (RQ)** was added in **`v1.32`** as a **technical preview**.<br/>
Contributor


Should we use the "preview" nomenclature as Alvin proposed in the QAgent discussion?

Also, is RQ still in preview?

Contributor Author


Will change all "technical preview" into "preview" for this release and use it going forward.
8-bit RQ is GA.

This means that the feature is still under development and may change in future releases, including potential breaking changes.
**We do not recommend using this feature in production environments at this time.**

:::

- **Rotational quantization (RQ)** is an untrained 8-bit quantization technique that provides 4x compression while maintaining 98-99% recall on most datasets. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ works in two steps:
+ **Rotational quantization (RQ)** is an untrained quantization technique that provides significant compression while maintaining high recall on most datasets. Unlike SQ, RQ requires no training phase and can be enabled immediately at index creation. RQ is available in two variants: **8-bit RQ** and **1-bit RQ**.
Contributor


"Untrained" sounds a bit unusual to me, and a bit negative maybe. Wdyt about "non-parametric" or just leaving it out?


### 8-bit RQ

8-bit RQ provides 4x compression while maintaining 98-99% recall on most datasets. RQ works in two steps:
Contributor


"on most datasets" -> "in internal testing"?

Contributor Author


Good point, let's not overpromise


1. **Fast pseudorandom rotation**: The input vector is transformed using a fast rotation based on the Walsh-Hadamard Transform. This rotation takes approximately 7-10 microseconds for a 1536-dimensional vector. The output dimension is rounded up to the nearest multiple of 64.

2. **Scalar quantization**: Each entry of the rotated vector is quantized to an 8-bit integer. The minimum and maximum values of each individual rotated vector define the quantization interval.
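
The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions (a blockwise 64-dimensional Walsh-Hadamard transform with pseudorandom sign flips standing in for the actual rotation), not Weaviate's implementation:

```python
import numpy as np

def wht(block):
    """Orthonormal Walsh-Hadamard transform of a length-64 block."""
    x = block.copy()
    h = 1
    while h < 64:
        for i in range(0, 64, 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x / 8.0  # 1/sqrt(64) keeps the norm unchanged

def rotate(v, signs):
    """Pseudorandom rotation: zero-pad to a multiple of 64, flip signs,
    then apply the WHT to each 64-dimensional block."""
    padded = 64 * ((len(v) + 63) // 64)  # dimensions round up to a multiple of 64
    x = np.zeros(padded)
    x[:len(v)] = v
    x *= signs
    return np.concatenate([wht(x[i:i + 64]) for i in range(0, padded, 64)])

def quantize_8bit(r):
    """Per-vector scalar quantization: the min and max of this rotated
    vector define the quantization interval."""
    lo, hi = r.min(), r.max()
    codes = np.round((r - lo) / (hi - lo) * 255).astype(np.uint8)
    return codes, lo, hi

rng = np.random.default_rng(0)
v = rng.standard_normal(100)               # 100 dims pad up to 128
signs = rng.choice([-1.0, 1.0], size=128)  # fixed per index, shared by all vectors
r = rotate(v, signs)
codes, lo, hi = quantize_8bit(r)
r_hat = codes / 255 * (hi - lo) + lo       # dequantized approximation of r
err = np.abs(r_hat - r).max()
```

Because the sign flips and blockwise WHT are orthogonal transforms, the rotation preserves norms and distances exactly; only the 8-bit rounding introduces error.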

### 1-bit RQ

1-bit RQ is an untrained asymmetric quantization method that provides close to 32x compression as dimensionality increases. This method is inspired by 1-bit RaBitQ and works as follows:
Contributor


Do we need to mention RaBitQ in our docs at all? I don't see much upside. Maybe we could broadly mention just once somewhere that they share some similarities


2. **Asymmetric quantization**:
- **Data vectors**: Quantized using 1 bit per dimension by storing only the sign of each entry
- **Query vectors**: Scalar quantized using 5 bits per dimension during search
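
One plausible way to compare an array of stored sign bits against a 5-bit quantized query is an asymmetric inner-product estimate, sketched below. The scaling factor and estimator here are illustrative assumptions, not Weaviate's actual formula:

```python
import numpy as np

def encode_data_1bit(x):
    """Data side: 1 bit per dimension (the sign of each entry), plus the
    vector's norm kept as a per-vector scalar."""
    return x > 0, np.linalg.norm(x)

def quantize_query_5bit(q):
    """Query side: 5-bit (32-level) scalar quantization at search time."""
    lo, hi = q.min(), q.max()
    codes = np.round((q - lo) / (hi - lo) * 31).astype(np.uint8)
    return codes / 31 * (hi - lo) + lo  # dequantized approximation of q

def estimate_ip(bits, norm_x, q_hat):
    """Asymmetric estimate of <x, q>: each stored entry is treated as
    +/- its typical magnitude after a random rotation, which is roughly
    sqrt(2/pi) * norm_x / sqrt(d) for entries that look Gaussian."""
    d = len(q_hat)
    signs = np.where(bits, 1.0, -1.0)
    return np.sqrt(2 / np.pi) * (norm_x / np.sqrt(d)) * (signs @ q_hat)

rng = np.random.default_rng(0)
d = 256
x = rng.standard_normal(d)    # stands in for an already-rotated data vector
bits, norm_x = encode_data_1bit(x)
q = rng.standard_normal(d)    # stands in for an already-rotated query
est = estimate_ip(bits, norm_x, quantize_query_5bit(q))
```

The estimate is noisy per pair, but correlates strongly with the true inner product, which is what the over-fetch-and-rescore step exploits.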
Contributor


Oh wow this is interesting. How do we compare an array of signs to these 5-bit arrays?

Contributor Author


Will clarify with the team and add later

The rotation step provides multiple benefits. It tends to reduce the quantization interval and decrease quantization error by distributing values more uniformly. It also distributes the distance information more evenly across all dimensions, providing a better starting point for distance estimation.

- It's worth noting that RQ rounds up dimensions to multiples of 64 which means that low-dimensional data (< 64 or 128 dimensions) might result in less than optimal compression.
+ It's worth noting that both RQ variants round up dimensions to multiples of 64, which means that low-dimensional data (< 64 or 128 dimensions) might result in less than optimal compression.
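
The effect of this padding on compression can be worked out directly. A rough calculation (ignoring per-vector scalars such as min/max or norms):

```python
import math

def rq_compression(dims, bits, float_bits=32):
    """Compression of the quantized codes vs. float32 vectors, accounting
    for padding the number of dimensions up to the next multiple of 64."""
    padded = 64 * math.ceil(dims / 64)
    return (dims * float_bits) / (padded * bits)

# 1-bit RQ approaches 32x only as dimensionality grows;
# low-dimensional vectors pay a padding penalty.
ratios = {d: round(rq_compression(d, 1), 1) for d in (32, 100, 1536)}
```

For example, 8-bit RQ on 1536-dimensional vectors gives exactly 4x, while 1-bit RQ on 32-dimensional vectors gives only 16x instead of the asymptotic 32x.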
Contributor


I think we should clarify where we are talking about each dimension and cases like here where we talk about the "number of dimensions".

While inspired by extended RaBitQ, this implementation differs significantly for performance reasons. It uses fast pseudorandom rotations instead of truly random rotations.
Contributor


I am inclined to suggest leaving out these interleaved discussion of RQ vs RaBitQ.

RaBitQ isn't an available option to the user so I'm not sure how useful these comparisons are.

Most users will be interested in choosing between available algos in Weaviate.

If we want to keep these comments, we could maybe have one subsection where we acknowledge the inspiration and make comparisons there.

Wdyt

Contributor Author


True, not much value in this info. I just left one reference here mentioning the inspiration from RaBitQ and linking to the original paper


While inspired by extended RaBitQ, this implementation differs significantly for performance reasons. It uses fast pseudorandom rotations instead of truly random rotations and it employs scalar quantization instead of RaBitQ's encoding algorithm, which becomes prohibitively slow with more bits per entry.
From the user perspective, 1-bit RQ is not a separate quantization method, but rather a configuration setting for RQ.
Contributor


I'm not sure about this sentence.

  • This (from the user perspective) implies that the rest of the docs are not for user consumption.
  • I thought it was pretty self evident that 1-bit RQ is a config setting

Contributor Author


Will remove this, it's actually a leftover from when I wanted to include a code snippet

@@ -15,13 +15,18 @@ import JavaCode from '!!raw-loader!/\_includes/code/howto/java/src/test/java/io/

:::caution Technical preview

- Rotational quantization (RQ) was added in **`v1.32`** as a **technical preview**.<br/><br/>
+ **8-bit Rotational quantization (RQ)** was added in **`v1.32`** as a **technical preview**.<br/>
Contributor


Same comment re preview vs technical preview

This means that the feature is still under development and may change in future releases, including potential breaking changes.
**We do not recommend using this feature in production environments at this time.**

:::

- [**Rotational quantization (RQ)**](../../concepts/vector-quantization.md#rotational-quantization) is a fast untrained vector compression technique that offers 4x compression while retaining almost perfect recall (98-99% on most datasets).
+ [**Rotational quantization (RQ)**](../../concepts/vector-quantization.md#rotational-quantization) is a fast untrained vector compression technique. Two RQ variants are available in Weaviate:
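
For illustration, enabling RQ at collection creation with the v4 Python client might look like the sketch below. The quantizer name (`rq`) and the `bits` parameter are assumptions here; consult the client and server references for the exact API and supported values:

```python
# Hypothetical sketch, assuming a local Weaviate instance with RQ support.
import weaviate
import weaviate.classes.config as wc

client = weaviate.connect_to_local()

client.collections.create(
    "Articles",
    vector_index_config=wc.Configure.VectorIndex.hnsw(
        # bits=8 for 8-bit RQ; bits=1 would select the 1-bit variant
        quantizer=wc.Configure.VectorIndex.Quantizer.rq(bits=8),
    ),
)

client.close()
```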
Contributor


Same comment re untrained
