
cooperative yield in ThreadPool::install() causes unexpected behavior in nested pools #1105

@benkay86

Description


It appears that calls to ThreadPool::install(op) start running op inside the child thread pool and then cooperatively yield, potentially switching to a different task. This causes unexpected behavior with nested iterators that use separate thread pools. It would be nice if rayon didn't yield on ThreadPool::install(), or, if the authors believe it should yield, if some kind of warning were added to the documentation for ThreadPool::install().

Consider the following example. You have an outer loop that performs some operation requiring a lot of memory, and an inner loop that is CPU-intensive but memory-efficient. To avoid unbounded memory growth, you run the outer and inner loops in separate thread pools, where ncpus_outer is a small number and ncpus_inner is a big number.

let pool_outer = ThreadPoolBuilder::default().num_threads(ncpus_outer).build().unwrap();
let pool_inner = ThreadPoolBuilder::default().num_threads(ncpus_inner).build().unwrap();
pool_outer.install(|| {
    some_par_iter.for_each(|foo| {
        println!("Entered outer loop.");
        let bar = memory_intensive_op(foo);
        pool_inner.install(|| {
            println!("Entered inner pool.");
            thing.into_par_iter().for_each(|baz| {
                cpu_intensive_op(baz);
            });
        });
        println!("Left inner pool.");
    });
});

Suppose ncpus_outer is equal to one. You might reasonably expect that only one bar will be allocated at any given time, giving you control over memory use. However, the output of the program may actually look like:

Entered outer loop.
Entered inner pool.
Entered outer loop.
Entered inner pool.
Entered outer loop.
Entered outer loop.
Left inner pool.
...

This is possible because, rather than blocking until pool_inner.install() returns, the outer thread pool will yield and cooperatively schedule another iteration of the outer loop on the same thread. This gives you essentially zero control over how many instances of bar will be allocated at a given time.
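For reference, here is a self-contained version of the sketch above that can be compiled to observe the interleaving. The pool sizes, the large allocation standing in for memory_intensive_op, and the busy loop standing in for cpu_intensive_op are illustrative stand-ins, not part of the report; with a single outer thread you may still see several "Entered outer loop." lines printed before the first "Left inner pool." line.

// Minimal, compilable reproduction (rayon as a dependency). The allocation
// size, iteration counts, and pool sizes below are arbitrary stand-ins.
use rayon::prelude::*;
use rayon::ThreadPoolBuilder;

fn main() {
    let pool_outer = ThreadPoolBuilder::new().num_threads(1).build().unwrap();
    let pool_inner = ThreadPoolBuilder::new().num_threads(8).build().unwrap();

    pool_outer.install(|| {
        (0..4).into_par_iter().for_each(|i| {
            println!("Entered outer loop ({i}).");
            // Hypothetical stand-in for memory_intensive_op: a large allocation.
            let bar = vec![0u8; 100 * 1024 * 1024];
            pool_inner.install(|| {
                println!("Entered inner pool ({i}).");
                (0..64).into_par_iter().for_each(|_| {
                    // Hypothetical stand-in for cpu_intensive_op: burn some CPU.
                    let mut x: u64 = 0;
                    for n in 0..10_000_000u64 {
                        x = x.wrapping_add(n);
                    }
                    std::hint::black_box(x);
                });
            });
            println!("Left inner pool ({i}); bar still alive: {} bytes", bar.len());
        });
    });
}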

This is an even bigger problem if you have to do some kind of synchronization (yes, there are real use cases for this):

let pool_outer = ThreadPoolBuilder::default().num_threads(ncpus_outer).build().unwrap();
let pool_inner = ThreadPoolBuilder::default().num_threads(ncpus_inner).build().unwrap();
let mutex = std::sync::Mutex::new(());
pool_outer.install(|| {
    some_par_iter.for_each(|foo| {
        println!("Entered outer loop.");
        let bar = memory_intensive_op(foo);
        let _lock = mutex.lock().unwrap();
        pool_inner.install(|| {
            println!("Entered inner pool.");
            thing.into_par_iter().for_each(|baz| {
                cpu_intensive_op(baz);
            });
        });
        println!("Left inner pool.");
        // Mutex guard released here when it goes out of scope.
    });
});

The output may look like:

Entered outer loop.
Entered inner pool.
Entered outer loop.
[ deadlock ]

The outer thread pool yields to another task before the mutex is released. The new task then blocks trying to acquire the same mutex, which leads to a deadlock unless the scheduler happens to switch back to the task holding the lock before the outer pool runs out of threads.
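As a hedged sketch (not an official rayon recommendation), one possible way to avoid the cooperative yield, assuming the inner closure can be made 'static, is to submit the inner work with ThreadPool::spawn and block on a std channel. The helper name run_blocking and its signature are hypothetical, not part of rayon's API; the idea is simply that recv() parks the OS thread, so the outer pool cannot schedule another outer iteration onto it while the lock is held.

use rayon::ThreadPool;
use std::sync::mpsc;

// Hypothetical helper (not part of rayon): run `op` on `pool` and block the
// calling thread until it finishes, without yielding to the caller's pool.
fn run_blocking<T, F>(pool: &ThreadPool, op: F) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    pool.spawn(move || {
        // The send only fails if the receiver was dropped, which cannot
        // happen here because recv() below keeps it alive.
        let _ = tx.send(op());
    });
    // recv() parks this OS thread; no cooperative yield back into the
    // caller's thread pool takes place.
    rx.recv().expect("spawned task dropped the sender without sending")
}

In the examples above, one could then write run_blocking(&pool_inner, move || { ... }) in place of pool_inner.install(|| { ... }), at the cost of moving (rather than borrowing) whatever the inner closure needs.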
