Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions desmod/component.py
Original file line number Diff line number Diff line change
Expand Up @@ -275,6 +275,23 @@ def elab_hook(self):
"""
pass

def post_fail(self):
"""Recursively run post-fail hooks."""
for child in self._children:
child.post_fail()
self.post_sim_fail_hook()

def post_sim_fail_hook(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For naming, I would advocate on_exception() and exception_hook().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, replace post_sim_fail_hook() with on_exception() and post_fail() with on_exception()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post_sim_fail_hook() --> exception_hook()
post_fail() --> on_exception()

But this is naming is a secondary issue. The big issue is whether this paradigm is actually viable/appropriate.

"""Hook called after simulation fails.

Component subclasses may override `post_sim_fail_hook()` to inject
behavior in case the simulation fails. Note that
`post_sim_fail_hook()` will not be called when simulation is
successful. Instead, :meth:`post_sim_hook` method will be called.

"""
pass

def post_simulate(self):
"""Recursively run post-simulation hooks."""
for child in self._children:
Expand Down
3 changes: 2 additions & 1 deletion desmod/simulation.py
Original file line number Diff line number Diff line change
Expand Up @@ -172,6 +172,7 @@ def simulate(config, top_type, env_type=SimEnvironment, reraise=True,
top_type.pre_init(env)
env.tracemgr.flush()
with progress_manager(env):
_dump_dict(config_file, config)
top = top_type(parent=None, env=env)
top.elaborate()
env.tracemgr.flush()
Expand All @@ -183,6 +184,7 @@ def simulate(config, top_type, env_type=SimEnvironment, reraise=True,
except BaseException as e:
env.tracemgr.trace_exception()
result['sim.exception'] = repr(e)
top.post_fail()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If either top_type.pre_init() or top_type() fail, then the top local will not exist when entering this exception handler. This would lead to a NameError here where we reference top.

To retain this strategy of calling top.post_fail(), yet another nested try/except would have to be added after top is initialized.

The new indentation level would be awkward, but it also means that the scope of error catching would exclude top_type.__init__() and top_type.pre_init(). Anyone writing a Component that implements a post_fail_hook() would need to comprehend this limitation. It also means that there would not be a solution to handling exceptions raised during top.pre_init() or top.__init__().

A further complication is that post_fail_hook() methods might be called either during or after post_simulate(). Thus post_fail_hook() implementations would have to take care to idempotently tear-down any resources that might have been affected by post_simulate().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the possibility of top being unavailable here.

Yes, you are right in the sense that anyone writing a Component with a post_fail_hook() should be aware that a valid Component exists for Component.post_fail_hook() to be called. I don't know if that's a big deal. Another possibility is to use have a Component.post_fail_hook() and SimEnvironment.post_fail_hook() to handle the case where Component is invalid. I'm not sure if that's necessary.

I don't understand what you mean by tear-down any resources that might have been affected by post_simulate(). Can you elaborate on what you mean by resources?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding resource teardown, the stated goal for this changeset was to allow components to clean-up in the event of an unhandled exception. I'm using "resource" to refer to anything that would be involved in this clean-up; i.e. in the sense of RAII.

The key issue I'm pointing out is that if a component has resources to finalize/destroy/close/deallocate, then it will ostensibly be doing that either in post_simulate() or get_result() in the non-exception paths. Since proposed post_fail_hook() is called from a try/except that covers elaborate, run, post_simulate, and get_result, that means post_fail_hook could end up being called before, during, or after post_simulate; or before or during get_result. Thus resources may already be deallocated prior to post_fail_hook().

The requirement for deallocation of resources to be idempotent is not particularly unusual, but it does impose some cognitive burden on people writing componets with a post_fail_hook(). We should be aware that adding post_fail_hook has this cost. And it's docstring should probably highlight the wide variety of states the top_type instance might be in when an exception triggers post_fail_hook.


Regarding the possibility of having a SimEnvironment.post_fail_hook(), adding hooks to SimEnvironment would be at odds with the existing paradigm for extending SimEnvironment's behavior by subclassing with overriding methods.

As an aside, I chose to provide hooks for Component because there is a recursive nature to elaborate, post_simulate, and get_result that would make it some what tricky for users of Component to place their super method calls correctly. Using hooks abstracts away that complexity of users needing to know where/when/whether to call the super methods.

SimEnvironment, on the other hand, just has a single instance, so it is not such a big deal for a subclass to override its methods. For the problem at hand, overriding SimEnvironment.run() would be a reasonable thing for a subclass to do if it wanted to handle simulation-time errors in a special manner.

I wonder if it is sufficient for most models to attach resources that need cleanup to the SimEnvironment instance and wrap SimEnvironment.run() in a try/except block in a SubEnvironment.run() method?

raise
else:
result['sim.exception'] = None
Expand All @@ -192,7 +194,6 @@ def simulate(config, top_type, env_type=SimEnvironment, reraise=True,
result['sim.now'] = env.now
result['sim.time'] = env.time()
result['sim.runtime'] = timeit.default_timer() - t0
_dump_dict(config_file, config)
_dump_dict(result_file, result)
except BaseException as e:
if reraise:
Expand Down