Parallelization #5
Comments
Are we intending to redirect the stdout/stderr streams so they're viewable in real time via the main interface, or are we more interested in just logging the output of those streams and displaying it after the task is completed? Also something to note here: if we intend to run each individual task in its own process, a 'process pool' of some sort should be considered so we don't have to constantly start and tear down process objects (which is quite expensive).
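(A minimal sketch of the 'process pool' idea using the standard library's concurrent.futures; `run_task` is a placeholder, not part of our API.)

```python
from concurrent.futures import ProcessPoolExecutor

def run_task(task_id):
    # Placeholder for an experiment; in practice this runs the user's code.
    return f"task {task_id} done"

if __name__ == "__main__":
    # The pool keeps a fixed set of worker processes alive and reuses them,
    # so process start-up cost is paid once per worker, not once per task.
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(run_task, range(10)))
    print(results)
```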
I'd consider a live view (probably via a callback) into metrics/logs/etc. to be an add-on feature. Redirecting stdout/stderr to the main process has benefits without it, however: if you're running Python from a terminal then you won't get any output from child processes (in my experience), and if you're running it under a job supervisor (systemd, supervisorctl, etc.) then it likewise won't appear in their logs.
From memory, process startup on Linux takes ~100 microseconds (can't find a source right now), so it's not a big cost compared to the experiments themselves (which could easily take multiple hours).
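(A sketch of forwarding a child's output, assuming tasks are spawned with subprocess: pipe the child's stdout/stderr back and relay each line so it shows up in the parent's terminal or supervisor logs. The command here is a placeholder.)

```python
import subprocess
import sys

# Hypothetical child command; in practice this would be the task's entry point.
cmd = [sys.executable, "-c", "print('hello from the child process')"]

with subprocess.Popen(
    cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
) as proc:
    # Relay lines as they arrive so output is visible in real time
    # and lands in whatever is supervising the parent (systemd, etc.).
    for line in proc.stdout:
        sys.stdout.write(line)
```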
Also assigned: @NeedsSoySauce
@NeedsSoySauce So I did a bit of thinking about this and I think the following is a good set of goals for the 2 weeks.
Let me know if you think of anything that is missing / anything else we should aim to do. Also, if you want to work on a specific task, let me know so we don't both end up working on the same ones.
@Dewera Looks good. For an MVP I think this is a good set of goals. The priority queue part can probably be done later (I don't think it's needed for an MVP). Regarding "Configure the above method to take advantage of parallelism available on system", what do you mean by this? Are you referring to something like detecting what kind of hardware/environment the user is running things in and adjusting our 'parallelization strategy' based on that? As for which tasks I'd be keen on: any of them, to be honest. The IO redirection and process spawning parts sound cool to me, but I have little experience with either. Do we plan to use something like joblib to take care of some of these tasks for us as well?
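(If "take advantage of parallelism available on system" just means picking a sensible default worker count from the machine, a minimal sketch could look like the following; the `max_workers` option is hypothetical, not something agreed on here.)

```python
import os

def resolve_worker_count(max_workers=None):
    """Default to one worker per core, but respect a user-configured cap."""
    available = os.cpu_count() or 1
    if max_workers is None:
        return available
    return max(1, min(max_workers, available))

print(resolve_worker_count())   # e.g. 8 on an 8-core machine
print(resolve_worker_count(4))  # capped at 4 by the user's config
```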
The output issue is apparently an IPython-specific one. Up to you if you want to tackle it.
@NeedsSoySauce I'm not familiar with how parallel code works in Python and whether it sets this up automatically for you, so maybe that task is irrelevant. If possible, it would make sense to add this to the configuration options though, i.e. don't use more than x threads. Agree with the priority queue; we can just use a standard queue for now. As for your query about spawning processes, again my knowledge of the Python ecosystem is limited, but I would say we use subprocess to spawn external processes and either pass Python code directly or wrap it in a lambda and pass that around. Really up to whatever is possible/works best.
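(A rough sketch of the subprocess idea, assuming for illustration that a task reaches us as a string of Python source; the names here are made up.)

```python
import subprocess
import sys

task_source = "print(sum(range(10)))"  # hypothetical task handed to us as code

# Run the task in a fresh interpreter; check=True raises if it exits non-zero.
result = subprocess.run(
    [sys.executable, "-c", task_source],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout, end="")  # "45"
```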
@zacps Probably good to get your input here: are we intending to be working with lambdas or raw Python code when passing user intent around the different methods? From my understanding it's easy to serialise and deserialise between the two; I just want to keep it in mind when developing.
Not sure what you mean by this? Lambdas should be supported, but they may be significantly harder than plain functions depending on your implementation. I think:
Options:
Non-options:
I mean in terms of running the actual tasks, what is being given to us? Are we receiving a shell script, a Python lambda, Python code, etc. that we need to forward to the processes we are spawning? I don't currently know what we are working with on our end.
Assume an arbitrary
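(Whatever form the tasks take, they have to be serialised to cross a process boundary, which is why lambdas are the harder case mentioned above: the stdlib pickle handles named module-level functions but rejects lambdas, while cloudpickle (which, as far as I know, loky uses under the hood) can handle both. A quick sketch:)

```python
import pickle

def double(x):
    return x * 2

# A module-level function pickles by reference, so it round-trips fine.
restored = pickle.loads(pickle.dumps(double))
print(restored(21))  # 42

# A lambda has no importable name, so stdlib pickle rejects it.
try:
    pickle.dumps(lambda x: x * 2)
except Exception as exc:
    print(f"lambda not picklable with stdlib pickle: {exc}")
```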
@zacps @Dewera Do we want this to be synchronous (a user submits the tasks they want to run and we only return after everything has completed), or do we want this to be asynchronous? I took a look at joblib and it seems that it's purely synchronous, so is that all we want? Just wondering if this could get in the way of us e.g. adding progress reporting/notifications.
Sync for now at least. |
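(As a sketch of what a synchronous entry point might look like; the `run` name and signature here are invented for illustration.)

```python
from concurrent.futures import ProcessPoolExecutor

def run(tasks, max_workers=None):
    """Submit every task and block until all of them have finished."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() blocks, so run() only returns once every task is done,
        # which keeps the public API synchronous.
        return [future.result() for future in futures]

def example_task():
    return 42

if __name__ == "__main__":
    print(run([example_task, example_task]))  # [42, 42]
```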
Extra thing to consider: What happens when an exception is raised in a task? Options:
For now we probably want the first behavior but at some point we'll probably also want the third. |
@NeedsSoySauce and I briefly discussed this last Thursday. The thinking was to start off with the first option and then eventually go on to let the user choose how they want it to play out (essentially an option in the configuration).
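(A generic sketch of making that choice configurable, as a fail-fast versus collect-and-continue switch; the `on_error` name and behaviours are hypothetical, not something decided above.)

```python
def run_all(tasks, on_error="raise"):
    """Run tasks one by one; on_error is either 'raise' or 'collect'."""
    results, errors = [], []
    for task in tasks:
        try:
            results.append(task())
        except Exception as exc:
            if on_error == "raise":
                raise  # fail fast: propagate the first exception
            errors.append(exc)   # keep going and report failures at the end
            results.append(None)
    return results, errors
```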
I'm going to close this; follow-ups can be their own issues.
Each run should be run in parallel if possible.
The library I was using to do this is joblib with the `loky` backend.
Initially let's only care about a single machine running a single-threaded task on multiple cores. This is simple, but a good place to start.
The main complication here is that each task has to run in a separate process, otherwise we'll run afoul of contention due to the GIL.
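(For reference, a minimal example of the joblib/loky pattern described above; `run_experiment` is a stand-in, not part of any real interface.)

```python
from joblib import Parallel, delayed

def run_experiment(seed):
    # Stand-in for a real single-threaded experiment.
    return seed * seed

if __name__ == "__main__":
    # backend="loky" runs each call in a separate worker process,
    # so the GIL doesn't serialize the actual work.
    results = Parallel(n_jobs=4, backend="loky")(
        delayed(run_experiment)(seed) for seed in range(8)
    )
    print(results)
```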
Joblib's loky backend requires that all I/O is serialized to disk and then deserialized to pass it between processes. From memory I had to do some funky things to serialize `functools.partial`. We should try and hide this from the user as much as possible.
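(For context on the `functools.partial` point: a partial built from a plain module-level function survives a pickle round-trip, so it can cross the process boundary; partials over lambdas, closures or bound methods are where things get funkier and need cloudpickle-style serialization. A sketch of the easy case:)

```python
import functools
import pickle

def scale(x, factor):
    return x * factor

# A partial over a named module-level function round-trips through pickle.
doubled = pickle.loads(pickle.dumps(functools.partial(scale, factor=2)))
print(doubled(21))  # 42
```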
Additionally it doesn't forward stdout/stderr to the parent process, so all `print`/`println`/... calls get voided. Fixing this would be great; patching joblib/loky might be the best solution, but I haven't investigated in detail.
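(One possible workaround, not something joblib/loky provides out of the box: capture stdout inside each worker and hand it back with the result so the parent can replay it. This only shows output after a task finishes, so it's the 'log and display afterwards' flavour rather than a live view.)

```python
import contextlib
import io

from joblib import Parallel, delayed

def capture_output(func, *args, **kwargs):
    """Run func and return (result, captured_stdout) instead of losing the prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()

def noisy_task(i):
    print(f"working on {i}")
    return i * i

if __name__ == "__main__":
    outputs = Parallel(n_jobs=2, backend="loky")(
        delayed(capture_output)(noisy_task, i) for i in range(4)
    )
    for result, captured in outputs:
        print(captured, end="")   # replay the worker's prints in the parent
        print("result:", result)
```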