
Unrestricted caching keyed by generated types causes memory leak in multi-threaded regimes #2672

@rona-sh

Description

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

In OpenAI.responses.parse, as part of validating the formatted response against the required text_format (hereinafter MyClass), the code constructs the type ParsedResponseOutputMessage[MyClass], which is then passed to an unbounded lru_cache wrapping pydantic.TypeAdapter construction.

In a multi-threaded setting, pydantic regenerates this parameterized type on each call, so its hash differs every time and the cache grows without bound. Concretely, any standard webserver that uses responses.parse is affected.

The issue reproduces with any model, regardless of user input or target class.
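The failure mode can be sketched independently of the SDK internals. In the minimal example below (hypothetical helper names; it does not call into openai itself), an unbounded functools.lru_cache keyed by the type object wraps pydantic.TypeAdapter construction. Because each iteration hands it a freshly created class object, every call adds a new entry that is never evicted:

import functools

from pydantic import BaseModel, TypeAdapter, create_model


@functools.lru_cache(maxsize=None)  # unbounded; keyed by the type object itself
def cached_adapter(tp: type) -> TypeAdapter:
    return TypeAdapter(tp)


class Fact(BaseModel):
    fact: str


# Simulate a parameterized type that is regenerated on every call:
# each create_model call returns a brand-new class object with a new hash,
# so the cache never gets a hit and grows by one entry per call.
for _ in range(5):
    fresh_type = create_model("ParsedMessage", content=(Fact, ...))
    cached_adapter(fresh_type)
    print(cached_adapter.cache_info().currsize)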

To Reproduce

Consider the following snippet:

from openai import OpenAI
from pydantic import BaseModel
import psutil


class Fact(BaseModel):
    fact: str


client = OpenAI()
model = "gpt-4.1-nano"

def f():
    # One structured-output request; the parsed result itself is discarded.
    _ = client.responses.parse(model=model, input="Give a fun fact", text_format=Fact)
    # Print the process's resident set size, in MiB.
    print(psutil.Process().memory_info().rss / 2**20)

f invokes a responses.parse call and prints the process's resident memory usage (in MiB).

When f is called repeatedly on a single thread, memory usage changes only minimally:

for _ in range(10):
    f()

However, when each call runs on its own thread (as in a typical webserver), memory usage grows with every request:

import time
import threading

# Spawn a new thread per call, mimicking a webserver handling concurrent requests.
for _ in range(10):
    time.sleep(0.1)
    threading.Thread(target=f).start()
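To check that the growth comes from accumulated adapter/type objects rather than from response payloads, a gc-based probe can be run between calls (this is an assumption about where the memory goes, offered as a diagnostic sketch rather than a proven measurement):

import gc

from pydantic import TypeAdapter


def count_type_adapters() -> int:
    # Count live TypeAdapter instances tracked by the garbage collector.
    return sum(1 for obj in gc.get_objects() if isinstance(obj, TypeAdapter))

If the diagnosis is correct, calling count_type_adapters() after each batch of threads shows the count climbing in step with the RSS figures printed by f.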


OS

macOS

Python version

Python v3.12.11

Library version

openai v1.107.0
