[Bug] Concurrent managedidentity.Client.AcquireToken() calls cause multiple token requests to identity provider before caching #569

**Which version of MSAL Go are you using?**
Latest in main

**Where is the issue?**
* Public client
    * [ ] Device code flow
    * [ ] Username/Password (ROPC grant)
    * [ ] Authorization code flow 
* Confidential client
    * [ ] Authorization code flow 
    * [ ] Client credentials:
        * [ ] client secret
        * [ ] client certificate
* Token cache serialization
     * [X] In-memory cache
* Other (please describe)
  * Managed identity client/token caching 

**Is this a new or an existing app?**

The app is in production, although the area most likely to hit this issue already has a mitigation in place.

**What version of Go are you using (`go version`)?**

<pre>
$ go version
go version go1.24.3 linux/amd64
</pre>

**What operating system and processor architecture are you using (`go env`)?**

<details><summary><code>go env</code> Output</summary><br><pre>
$ go env
</pre></details>

**Repro**
Add this unit test to the `managedidentity` package:
```go
package managedidentity

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"sync"
	"testing"
	"time"

	"github.com/AzureAD/microsoft-authentication-library-for-go/apps/internal/base/storage"
	"github.com/AzureAD/microsoft-authentication-library-for-go/apps/internal/mock"
)

func TestAcquireTokenConcurrency(t *testing.T) {
	resource := "https://management.azure.com"
	miType := SystemAssigned()
	setEnvVars(t, DefaultToIMDS)
	before := cacheManager
	defer func() { cacheManager = before }()
	cacheManager = storage.New(nil)

	// Track the number of HTTP requests made to IMDS and the tokens returned by AcquireToken.
	// Ideally there is 1 request and 1 unique token: the first request writes to the cache and later calls reuse it.
	var requestCount int32
	var requestCountMutex sync.Mutex
	var acquiredTokens []string
	var acquiredTokensMutex sync.Mutex

	tries := 10

	// Assume no caching on token provider server
	mockClient := mock.NewClient()
	for i := 0; i < tries; i++ {
		token := fmt.Sprintf("[%d]", i)
		responseBody, err := json.Marshal(SuccessfulResponse{
			AccessToken: token,
			ExpiresIn:   3600,
			ExpiresOn:   time.Now().Add(time.Hour).Unix(),
			Resource:    resource,
			TokenType:   "Bearer",
		})
		if err != nil {
			t.Fatal(err)
		}

		mockClient.AppendResponse(
			mock.WithHTTPStatusCode(http.StatusOK),
			mock.WithBody(responseBody),
			mock.WithCallback(func(r *http.Request) {
				requestCountMutex.Lock()
				t.Logf("token provider server providing token: %s", token)
				requestCount++
				requestCountMutex.Unlock()
			}),
		)
	}
	client, err := New(miType, WithHTTPClient(mockClient))
	if err != nil {
		t.Fatal(err)
	}

	// Launch multiple goroutines for AcquireToken() simultaneously
	numGoroutines := tries
	var wg sync.WaitGroup
	for i := 0; i < numGoroutines; i++ {
		wg.Add(1)
		go func(routineID int) {
			defer wg.Done()

			// Call AcquireToken in each goroutine
			result, err := client.AcquireToken(context.Background(), resource)
			if err != nil {
				t.Errorf("goroutine %d failed: %v", routineID, err)
				return
			}

			// Capture the token received
			acquiredTokensMutex.Lock()
			t.Logf("AcquireToken() caller receives token: %s", result.AccessToken)
			acquiredTokens = append(acquiredTokens, result.AccessToken)
			acquiredTokensMutex.Unlock()
		}(i)
	}
	wg.Wait()
	// With the current implementation, however, we expect:
	// - Multiple HTTP requests to IMDS until the cache is written and the subsequent call/goroutine sees the cached token
	// - Each new token received from the server overwrites the previous one in cache
	// - Different goroutines may receive different tokens

	uniqueTokens := make(map[string]bool)
	for _, token := range acquiredTokens {
		uniqueTokens[token] = true
	}

	// Log the results to show the race condition
	t.Logf("- Total HTTP requests made: %d", requestCount)
	t.Logf("- All tokens captured: %v", acquiredTokens)
	t.Logf("- Unique tokens received: %d", len(uniqueTokens))
}
```

**Expected behavior**
[AcquireToken(...)](https://github.com/AzureAD/microsoft-authentication-library-for-go/blob/b4b8bfc9569042572ccb82b648ea509075fadb74/apps/managedidentity/managedidentity.go#L318) [sends the request to the identity provider](https://github.com/AzureAD/microsoft-authentication-library-for-go/blob/b4b8bfc9569042572ccb82b648ea509075fadb74/apps/managedidentity/managedidentity.go#L379-L383) once and [caches](https://github.com/AzureAD/microsoft-authentication-library-for-go/blob/b4b8bfc9569042572ccb82b648ea509075fadb74/apps/managedidentity/managedidentity.go#L463) the result; subsequent (concurrent) calls to `AcquireToken(...)` then [reuse the cache](https://github.com/AzureAD/microsoft-authentication-library-for-go/blob/b4b8bfc9569042572ccb82b648ea509075fadb74/apps/managedidentity/managedidentity.go#L328-L340).
```
$ go test -v -run TestAcquireTokenConcurrency
=== RUN   TestAcquireTokenConcurrency
    race_condition_test.go:53: token provider server providing token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:98: - Total HTTP requests made: 1
    race_condition_test.go:99: - All tokens captured: [[0] [0] [0] [0] [0] [0] [0] [0] [0] [0]]
    race_condition_test.go:100: - Unique tokens received: 1
```

**Actual behavior**
The [cache read](https://github.com/AzureAD/microsoft-authentication-library-for-go/blob/b4b8bfc9569042572ccb82b648ea509075fadb74/apps/managedidentity/managedidentity.go#L328) of subsequent (concurrent) calls may occur before the [cache write](https://github.com/AzureAD/microsoft-authentication-library-for-go/blob/b4b8bfc9569042572ccb82b648ea509075fadb74/apps/managedidentity/managedidentity.go#L463) by the previous call(s). As a result, these calls also send requests to the identity provider when they could have reused the token from the earlier call.
Examples:
```
=== RUN   TestAcquireTokenConcurrency
    race_condition_test.go:55: token provider server providing token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:55: token provider server providing token: [1]
    race_condition_test.go:55: token provider server providing token: [2]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [0]
    race_condition_test.go:83: AcquireToken() caller receives token: [1]
    race_condition_test.go:83: AcquireToken() caller receives token: [2]
    race_condition_test.go:100: - Total HTTP requests made: 3
    race_condition_test.go:101: - All tokens captured: [[0] [0] [0] [0] [0] [0] [0] [0] [1] [2]]
    race_condition_test.go:102: - Unique tokens received: 3
```
```
=== RUN   TestAcquireTokenConcurrency
    race_condition_test.go:53: token provider server providing token: [0]
    race_condition_test.go:53: token provider server providing token: [1]
    race_condition_test.go:53: token provider server providing token: [2]
    race_condition_test.go:53: token provider server providing token: [3]
    race_condition_test.go:53: token provider server providing token: [4]
    race_condition_test.go:81: AcquireToken() caller receives token: [1]
    race_condition_test.go:81: AcquireToken() caller receives token: [2]
    race_condition_test.go:81: AcquireToken() caller receives token: [3]
    race_condition_test.go:81: AcquireToken() caller receives token: [3]
    race_condition_test.go:81: AcquireToken() caller receives token: [0]
    race_condition_test.go:81: AcquireToken() caller receives token: [4]
    race_condition_test.go:53: token provider server providing token: [5]
    race_condition_test.go:81: AcquireToken() caller receives token: [5]
    race_condition_test.go:53: token provider server providing token: [6]
    race_condition_test.go:81: AcquireToken() caller receives token: [6]
    race_condition_test.go:53: token provider server providing token: [7]
    race_condition_test.go:81: AcquireToken() caller receives token: [7]
    race_condition_test.go:53: token provider server providing token: [8]
    race_condition_test.go:81: AcquireToken() caller receives token: [8]
    race_condition_test.go:98: - Total HTTP requests made: 9
    race_condition_test.go:99: - All tokens captured: [[1] [2] [3] [3] [0] [4] [5] [6] [7] [8]]
    race_condition_test.go:100: - Unique tokens received: 9
```

**Possible solution**
The decision whether to fetch a new token should be made only after any in-flight fetch has finished writing to the cache. Maybe wrap the logic of `AcquireToken()` in a lock?
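
To illustrate the idea, here is a minimal, self-contained sketch. It is not MSAL code: `tokenSource`, `fetch`, and `run` are hypothetical names, the cache holds a single entry, and expiry is ignored. The only point is that the cache read, the fetch, and the cache write happen under one lock.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// tokenSource is a hypothetical stand-in for the managed identity client:
// a one-entry cache plus a fetch function that plays the role of the IMDS call.
type tokenSource struct {
	mu     sync.Mutex
	cached string
	fetch  func() string
}

// Token holds the lock across the cache read, the fetch, and the cache write,
// so a concurrent caller cannot see an empty cache while a fetch is in flight.
func (s *tokenSource) Token() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.cached == "" {
		s.cached = s.fetch()
	}
	return s.cached
}

// run starts n concurrent callers and reports how many fetches happened
// and how many distinct tokens the callers received.
func run(n int) (fetches int32, unique int) {
	var count atomic.Int32
	src := &tokenSource{fetch: func() string {
		id := count.Add(1)
		time.Sleep(10 * time.Millisecond) // simulated network latency
		return fmt.Sprintf("[%d]", id-1)
	}}

	tokens := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			tokens[i] = src.Token()
		}(i)
	}
	wg.Wait()

	seen := make(map[string]bool)
	for _, tok := range tokens {
		seen[tok] = true
	}
	return count.Load(), len(seen)
}

func main() {
	fetches, unique := run(10)
	fmt.Printf("fetches=%d unique=%d\n", fetches, unique) // prints "fetches=1 unique=1"
}
```

The obvious cost is that a single lock held across the network call also serializes callers asking for different resources. A lock per resource, or request coalescing in the style of `golang.org/x/sync/singleflight`, might avoid that, but I have not tried either against this code base.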

**Additional context / logs / screenshots**
[One of its uses](https://github.com/Azure/azure-sdk-for-go/blob/8af5755b00fd6847a7a89dbc1996fc03350d73d7/sdk/azidentity/managed_identity_client.go#L189) in azure-sdk-for-go seems to [lock this](https://github.com/Azure/azure-sdk-for-go/blob/8af5755b00fd6847a7a89dbc1996fc03350d73d7/sdk/internal/temporal/resource.go#L78-L140) anyway. However, I don't think this lock requirement is intentional(?), and we might just be lucky that this caller happens to hold a lock(?).
My use case does not go through azure-sdk-for-go, so for now we have had to add a similar lock ourselves.
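
For anyone in the same position, the caller-side mitigation looks roughly like the sketch below. The names are all hypothetical: `acquireFunc` stands for whatever closure the application wraps around `client.AcquireToken`, and `racyClient` only imitates the reported behavior (cache read and cache write are not tied together across the slow fetch) so the example runs without MSAL.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// acquireFunc is a hypothetical shape for the application's token call,
// for example a closure around client.AcquireToken.
type acquireFunc func(ctx context.Context, resource string) (string, error)

// serialize returns an acquireFunc that lets only one call run at a time,
// so each call sees the cache as left by the previous one.
func serialize(inner acquireFunc) acquireFunc {
	var mu sync.Mutex
	return func(ctx context.Context, resource string) (string, error) {
		mu.Lock()
		defer mu.Unlock()
		return inner(ctx, resource)
	}
}

// racyClient imitates the reported behavior: the cache read and the cache
// write are each safe, but nothing links them across the slow fetch.
type racyClient struct {
	mu      sync.Mutex
	token   string
	fetches int
}

func (c *racyClient) acquire(ctx context.Context, resource string) (string, error) {
	c.mu.Lock()
	cached := c.token
	c.mu.Unlock()
	if cached != "" {
		return cached, nil
	}
	time.Sleep(10 * time.Millisecond) // simulated round trip to the provider
	c.mu.Lock()
	defer c.mu.Unlock()
	c.token = fmt.Sprintf("[%d]", c.fetches)
	c.fetches++
	return c.token, nil
}

// fetchesBehindLock runs n concurrent callers through the serialized wrapper
// and returns how many provider fetches the racy client performed.
func fetchesBehindLock(n int) int {
	c := &racyClient{}
	acquire := serialize(c.acquire)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = acquire(context.Background(), "https://management.azure.com")
		}()
	}
	wg.Wait()
	return c.fetches
}

func main() {
	// prints "fetches behind caller-side lock: 1"
	fmt.Println("fetches behind caller-side lock:", fetchesBehindLock(10))
}
```

The wrapper keeps no cache of its own; it relies on the wrapped call populating the cache before the lock is released, which is why every caller has to go through the same wrapped function.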
