
Conversation

@atesgoral commented Oct 30, 2024

Release GVL for parallel search operations

This PR improves multi-threading performance by releasing Ruby's Global VM Lock (GVL) during Faiss search operations, allowing multiple threads to perform searches in parallel.

Changes

  1. Release GVL during search - Wrap search operations in rb_thread_call_without_gvl to allow parallel execution
  2. Ensure thread-safety - Only release GVL for frozen (immutable) indexes to prevent concurrent modifications

Usage

index = Faiss::IndexFlatL2.new(dimensions)
index.add(vectors)
index.freeze  # Makes index immutable and enables parallel searches

index.search(query_vectors, k) # GVL gets released while search is being performed

Notes

  • Fully backward compatible - non-frozen indexes work as before
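
For example (an illustrative sketch, not part of this PR's test suite; dimensions, vectors, query_vectors and k are placeholders as in the Usage snippet above), several threads can search the same frozen index concurrently:

index = Faiss::IndexFlatL2.new(dimensions)
index.add(vectors)
index.freeze  # no further mutation; searches may now overlap

results = 4.times.map {
  Thread.new { index.search(query_vectors, k) }
}.map(&:value)  # Thread#value joins each thread and returns its search result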

@ankane (Owner) commented Nov 1, 2024

Hi @atesgoral, thanks for the PR! However, this doesn't seem safe to do without more locking (as add can now be called while search is running). I'm not sure I'd like to maintain the additional complexity, but happy to take another look if you decide to implement it.

https://github.com/facebookresearch/faiss/wiki/Threads-and-asynchronous-calls

@tenderlove

@ankane are you sure that's true? According to the link you sent:

Python interface releases the Global Interpreter Lock for all calls, so using python multithreading will effectively use several cores.

I'm not familiar with the Python extension, but it seems like we should be in the same boat? Am I missing something?

@ankane (Owner) commented Nov 1, 2024

Will check out the Python code, but this causes the test below to consume all available CPU and hang (which doesn't happen when the GVL is held).

--- a/test/index_test.rb
+++ b/test/index_test.rb
@@ -279,17 +279,18 @@ class IndexTest < Minitest::Test
       [1, 1, 2, 1],
       [5, 4, 6, 5],
       [1, 2, 1, 2]
-    ]
+    ] * 100
     index = Faiss::IndexFlatL2.new(4)
     index.add(objects)
 
     concurrency = 0
     max_concurrency = 0
 
-    threads = 2.times.map {
+    threads = 100.times.map {
       Thread.new {
         concurrency += 1
         max_concurrency = [max_concurrency, concurrency].max
+        index.add(objects)
         index.search(objects, 3)
         concurrency -= 1
       }

@atesgoral (Author) commented Nov 2, 2024

@ankane Thanks for having a look.

I didn't need more data or 100 iterations to get it to lock up. I haven't looked at faiss in detail yet, but I see the word "lock" all over the repo. This is enough to cause a deadlock(?):

diff --git a/test/index_test.rb b/test/index_test.rb
index 64e093b..095b8ea 100644
--- a/test/index_test.rb
+++ b/test/index_test.rb
@@ -290,7 +290,13 @@ class IndexTest < Minitest::Test
       Thread.new {
         concurrency += 1
         max_concurrency = [max_concurrency, concurrency].max
+        puts "adding"
+        index.add(objects)
+        puts "added"
+        sleep(10)
+        puts "searching"
         index.search(objects, 3)
+        puts "searched"
         concurrency -= 1
       }
     }

I say "deadlock(?)" because it doesn't act like a permanent deadlock; something gives eventually and the test ends with a success. 🤔

@obie commented Nov 5, 2024

keeping 👀 on this

@atesgoral (Author)

@ankane I can rejig this to add a read-only mode to an index so that the GVL can be released in read-only use cases.

Option 1

index.read_only! locks the instance into a read-only mode where read operations release the GVL and write operations raise (see the sketch after Option 3).

Option 2

Faiss::Index.load_read_only does the same as above, but at index load time.

Option 3

index.read_only or a Faiss::ReadOnly::Index facade that returns a read-only interface where write operations either don't exist or raise. This is only safe if the application ensures that all access to the index goes through a singleton returning the facade.
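
Roughly, Option 1 might look like this from Ruby (read_only! and the error class here are hypothetical, just to show the shape):

index = Faiss::IndexFlatL2.new(4)
index.add(objects)
index.read_only!           # hypothetical: locks the index into read-only mode

index.search(objects, 3)   # read path releases the GVL
index.add(objects)         # write path raises (e.g. a hypothetical Faiss::ReadOnlyError)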

Co-authored-by: Ufuk Kayserilioglu <[email protected]>
Co-authored-by: Aaron Patterson <[email protected]>

Move without_gvl to utils

Add parallelism test

Simplify config
@tavianator force-pushed the gvl-release branch 2 times, most recently from a016bd4 to 525069c on August 13, 2025
@tavianator

I updated this PR to hang the decision to drop the GVL off of index.frozen?. Once you freeze an index, you can no longer mutate it so it becomes safe to allow parallel search calls. Let me know if you like that approach!

@atesgoral changed the title from "[WIP] GVL release" to "Add support for .freeze for a read-only mode that releases the GVL" on Aug 13, 2025
@atesgoral marked this pull request as ready for review on August 13, 2025

self.search(n, objects.read_ptr(), k, distances.write_ptr(), labels.write_ptr());
if (wrapper.is_frozen()) {
without_gvl([&] {

This would be helpful to add to Rice. pybind11 does it like this:

https://pybind11.readthedocs.io/en/stable/advanced/misc.html#global-interpreter-lock-gil


I'd be happy to add it to Rice! I don't think it's easy to write an API similar to the pybind11 one, since the Ruby GVL functions all take a callback, but I can add something like this without_gvl easily.

Any suggestions for where to put it? I'm not familiar with the code structure of Rice.


Created a PR for Rice here: ruby-rice/rice#313

@ankane (Owner) commented Aug 15, 2025

Hi @tavianator, thanks for sharing. I like the simplicity of this approach. Can you share a benchmark of the performance improvement?

Also, I agree with @cfis that it'd be nice to add the GVL code to Rice.

Add a new check to all mutating methods that we're not operating on a
frozen Index.  After that, it should be safe to drop the GVL for search
on a frozen Index, since nothing can be mutating it in parallel.
@atesgoral (Author) commented Aug 16, 2025

@ankane It looks like a win.

tl;dr:

  • Single-threaded baseline: 979.61 queries/sec
  • Best multi-threaded (unfrozen): 1036.73 queries/sec
  • Best multi-threaded (frozen): 3584.14 queries/sec
  • Overall improvement from freezing: 245.7%
  • Maximum scaling achieved: 3.66x (45.8% parallel efficiency)

I got Claude Code to vibe-code these benchmark scripts: 2367f1c

Output of the more intense one:

GVL Release Intensive Benchmark for Faiss Index
==================================================
Configuration:
  Dimensions: 256
  Index vectors: 50000
  Queries per iteration: 100
  Iterations: 10
  K neighbors: 100
  Thread counts to test: [1, 2, 4, 8]

Generating random data... done!
Creating and training index... done! (50000 vectors in index)

Running benchmarks...
--------------------------------------------------

Testing with 1 thread(s):
  Unfrozen index: 1.021s (979.61 queries/sec)
  Frozen index:   1.012s (988.1 queries/sec)

Testing with 2 thread(s):
  Unfrozen index: 1.018s (982.68 queries/sec)
  Frozen index:   0.566s (1767.32 queries/sec)
  → Freezing improved performance by 79.8% (1.8x speedup)

Testing with 4 thread(s):
  Unfrozen index: 1.038s (963.32 queries/sec)
  Frozen index:   0.292s (3425.27 queries/sec)
  → Freezing improved performance by 255.6% (3.56x speedup)

Testing with 8 thread(s):
  Unfrozen index: 0.965s (1036.73 queries/sec)
  Frozen index:   0.279s (3584.14 queries/sec)
  → Freezing improved performance by 245.7% (3.46x speedup)

==================================================
Performance Summary:
--------------------------------------------------
Thread Count | Unfrozen QPS | Frozen QPS | Improvement
--------------------------------------------------
           1 |       979.61 |      988.1 | 0%
           2 |       982.68 |    1767.32 | +79.8%
             | (1.0x scaling) | (1.8x scaling) |
           4 |       963.32 |    3425.27 | +255.6%
             | (0.98x scaling) | (3.5x scaling) |
           8 |      1036.73 |    3584.14 | +245.7%
             | (1.06x scaling) | (3.66x scaling) |

==================================================
Key Findings:
  • Single-threaded baseline: 979.61 queries/sec
  • Best multi-threaded (unfrozen): 1036.73 queries/sec
  • Best multi-threaded (frozen): 3584.14 queries/sec
  • Overall improvement from freezing: 245.7%
  • Maximum scaling achieved: 3.66x (45.8% parallel efficiency)
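
For reference, a harness along these lines (a minimal sketch, not the actual script in 2367f1c) is what produces numbers like the above:

require "faiss"
require "benchmark"

DIM = 256
index = Faiss::IndexFlatL2.new(DIM)
index.add(Array.new(50_000) { Array.new(DIM) { rand } })
queries = Array.new(100) { Array.new(DIM) { rand } }

def run(index, queries, threads:)
  # Time the wall-clock duration of all threads searching the same index.
  Benchmark.realtime do
    Array.new(threads) {
      Thread.new { 10.times { index.search(queries, 100) } }
    }.each(&:join)
  end
end

puts "unfrozen (8 threads): %.3fs" % run(index, queries, threads: 8)
index.freeze
puts "frozen   (8 threads): %.3fs" % run(index, queries, threads: 8)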

@atesgoral (Author)

@cfis Is a new Rice release with the GVL-free function calls on the horizon?

@cfis commented Oct 13, 2025

Yes, will try to push out a release this week. Have been updating documentation first, but almost done with that.
