@gernest gernest commented Jun 11, 2018

The code that currently handles timeouts doesn't reflect what a request timeout means, and it is less than optimal. As a result, it is hard to reason about properly shutting down a node connection without affecting other parts of the client in async mode.

Here is the code:

	case <-tcc:
		nc.requests.Range(func(_, v interface{}) bool {
			req := v.(*networkRequest)
			if time.Now().After(req.submitted.Add(req.timeout)) {
				nc.queuedBytes -= req.numBytes
				nc.handleTimeout(req)
				nc.requests.Delete(req.handle)
			}
			return true
		})
		tcc = time.NewTimer(time.Duration(tci) * time.Nanosecond).C

The problem is this: in async mode, the requests map is guaranteed to become huge, because the rate of requests far exceeds the rate of responses from VoltDB. This means that at every interval we iterate over thousands of stored requests just to check manually whether they have timed out. This might seem to work fine in sync/sql mode, because the request/response ratio is small and the operation is fast enough.

I am considering a solution where each procedure invocation tracks its own timeout, so that every request accurately tracks when it expires.

By timeout I mean that the time between a request being sent to VoltDB and the response for that particular request being received exceeds the duration assigned to the request through the Conn.*Timeout API calls.

Why is this important?

I have researched many ways of properly handling dead nodes in async mode, and this inconsistency in handling timeouts has been disrupting all of the options. One case: many requests will have gone through the client, and when we mark the node connection as closed (dead), we want each procedure invocation request to clean up all the resources attached to it without affecting the rest of the requests tracked by the connection.

@gernest gernest changed the title from "[W.I.P] Fix procedure invocation request timeout" to "Fix procedure invocation request timeout" Jun 15, 2018
- set write deadline to 1 second
- close the connection on any tcp read error
@gernest gernest mentioned this pull request Jun 19, 2018