How to force the Akka system to run on its own thread pool, and mixed use of ClusterClient #4419
Comments
We managed to reproduce the problem, though not reliably, with a small test program. This should help diagnose the error; see #4432. |
Maybe this issue can be closed, since the reproduction sample could be enough to find the problem. |
FYI: it is possible to use dedicated dispatchers for the different system actor groups. I will give it a try with a dedicated ForkJoinDispatcher for the cluster stuff. Maybe there is some improvement. |
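A minimal sketch of what a dedicated dispatcher for the cluster actors might look like in HOCON. The dispatcher name `cluster-dispatcher` and the thread count are illustrative assumptions, not values taken from this thread; `akka.cluster.use-dispatcher` is the setting the cluster module reads to pick its dispatcher.

```hocon
# Hypothetical dispatcher name; any otherwise unused config path would do.
cluster-dispatcher {
  type = ForkJoinDispatcher
  throughput = 100
  dedicated-thread-pool {
    thread-count = 4          # illustrative value, tune for the host
    deadlock-timeout = 3s
    threadtype = background
  }
}

# Run the cluster system actors (heartbeats, gossip) on the dedicated pool
# instead of sharing threads with user actors.
akka.cluster.use-dispatcher = "cluster-dispatcher"
```

The idea is that heartbeat handling no longer competes with user code for threads. Whether this removes the delayed heartbeats described in this issue would still need to be verified under load.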
Sorry for not replying soon @Ralf1108 - but yes you should be able to customize the dispatcher for the |
#26816 would probably help |
I think it would. I tried an ad-hoc-ish implementation and found that while max performance seemed a little slower (possibly due to machine constraints, since it has a low core count), all sorts of stability issues went away in our stress tests, both around Cluster heartbeats and DData. |
@to11mtm @ismaelhamed I was thinking about this exact issue today actually - that we have a couple of outstanding issues in Akka.NET's internal actors:
We should port that PR and just put everything on the same dispatcher. And we should up the max internal concurrency setting to 64 just like on the JVM. |
Yep. The key point of that PR is to protect Akka internals from user code. This is especially true in Akka Cluster, where failure to respond to heartbeats due to thread starvation can cause a lot of unnecessary temporary unreachables. |
@Aaronontheweb was curious if you could clarify:
Is the thought here that Persistence should be running in the 'internal-dispatcher'? Also, on that same note, I notice many persistence plugins (e.g. the SQL ones) usually specify 'default-dispatcher' in their configs. While I know SQL Server is the worst culprit here, with its high risk of blocking during both reads and writes, this seems like something else that should be revisited, as DB access is almost always blocking to some level. |
That's correct.
I think we use |
@Ralf1108 I think what you're asking for can actually be done already via the That will allow the As for this:
This is poorly worded - what it means is: don't use ClusterClient within the cluster, for inter-node communication. Use it when you have an external system that itself isn't part of the cluster (e.g. a web UI) that needs to communicate with an Akka.NET cluster (e.g. some back-end nodes). Will changing the dispatcher for the receptionist help? |
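For readers wondering what changing the receptionist's dispatcher could look like, here is a hedged sketch. It assumes the `akka.cluster.client.receptionist.use-dispatcher` setting from the ClusterClient configuration; the dispatcher name `receptionist-dispatcher` and its sizing are hypothetical and are not the configuration posted in this thread.

```hocon
# Hypothetical dispatcher reserved for the ClusterClient receptionist.
receptionist-dispatcher {
  type = ForkJoinDispatcher
  throughput = 100
  dedicated-thread-pool {
    thread-count = 2          # illustrative value
    deadlock-timeout = 3s
    threadtype = background
  }
}

# Point the receptionist at the dedicated dispatcher so that ClusterClient
# traffic does not share threads with the default dispatcher.
akka.cluster.client.receptionist.use-dispatcher = "receptionist-dispatcher"
```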
Does this solve the issue for now?
|
For anyone who might be looking for a workaround, here is the config I am using for now:
|
close via #4511 |
@Aaronontheweb does this mean that we don't need any additional config to prevent starvation issues of this kind? |
@Swoorup nope, the newest nightly builds of Akka.NET should automatically handle this. |
Hi,
We are running Akka.NET v1.4.5 on Windows and experiencing random network partitions (sometimes twice an hour, sometimes only after 3 to 4 hours).
Occasionally we see missing heartbeats in the log:
Sometimes everything works but otherwise we see:
We then programmatically restart the disassociated Akka systems and they reconnect again.
Interestingly, this also happens when there is no load on the system.
Because of the log message, we think the issue could be that the Akka system itself gets no thread time to manage its heartbeats (a heartbeat delayed by 20 seconds is rather long).
What we did so far:
But this didn't fix the issue.
What we now want to do is run the whole actor system on a dedicated thread pool, not only the user actors. There is a section in the documentation on how to use dedicated threads for actors:
Dispatchers.
In the documentation there is a "Global dispatcher" named:
My question:
How can we change this "Global dispatcher" to use its own thread pool so we can figure out if this fixes our problem?
Our other theory concerns how we use ClusterClient. When we introduced Akka.NET into our system, we started with a separate Akka process and connected to it from our web server via ClusterClient. Recently we integrated SignalR into our web server. To be able to send messages from the Akka process to the SignalR component running on the web server, we started a second Akka system there and connected it to the Akka process. So the web server is currently part of the Akka cluster and also uses the cluster client in our legacy code to talk to the Akka process.
In the cluster client documentation it says:
It says this is not recommended, but not why, and there is no hint about what could go wrong.
Does anybody know more details about how this non-recommended approach could affect system stability?