Skip to content

Conversation

itzbhushan
Copy link

We include a post_sim_fail_hook method in desmo and enable
components to inject behavior upon simulation failure. Components
can use this hook to clean up simulation state or artifacts.

We include a `post_sim_fail_hook` method in desmo and enable
components to inject behavior upon simulation failure. Components
can use this hook to clean up simulation state or artifacts.
@coveralls
Copy link

coveralls commented Apr 20, 2017

Coverage Status

Coverage increased (+0.02%) to 93.42% when pulling f71920f on itzbhushan:master into ec346a6 on SanDisk-Open-Source:master.

Copy link
Collaborator

@jpgrayson jpgrayson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of providing a means for a model to have top-level error handling is noble, but I'm not sure having a hook method on Component is ultimately going to be a workable paradigm.

except BaseException as e:
env.tracemgr.trace_exception()
result['sim.exception'] = repr(e)
top.post_fail()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If either top_type.pre_init() or top_type() fail, then the top local will not exist when entering this exception handler. This would lead to a NameError here where we reference top.

To retain this strategy of calling top.post_fail(), yet another nested try/except would have to be added after top is initialized.

The new indentation level would be awkward, but it also means that the scope of error catching would exclude top_type.__init__() and top_type.pre_init(). Anyone writing a Component that implements a post_fail_hook() would need to comprehend this limitation. It also means that there would not be a solution to handling exceptions raised during top.pre_init() or top.__init__().

A further complication is that post_fail_hook() methods might be called either during or after post_simulate(). Thus post_fail_hook() implementations would have to take care to idempotently tear-down any resources that might have been affected by post_simulate().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the possibility of top being unavailable here.

Yes, you are right in the sense that anyone writing a Component with a post_fail_hook() should be aware that a valid Component exists for Component.post_fail_hook() to be called. I don't know if that's a big deal. Another possibility is to use have a Component.post_fail_hook() and SimEnvironment.post_fail_hook() to handle the case where Component is invalid. I'm not sure if that's necessary.

I don't understand what you mean by tear-down any resources that might have been affected by post_simulate(). Can you elaborate on what you mean by resources?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding resource teardown, the stated goal for this changeset was to allow components to clean-up in the event of an unhandled exception. I'm using "resource" to refer to anything that would be involved in this clean-up; i.e. in the sense of RAII.

The key issue I'm pointing out is that if a component has resources to finalize/destroy/close/deallocate, then it will ostensibly be doing that either in post_simulate() or get_result() in the non-exception paths. Since proposed post_fail_hook() is called from a try/except that covers elaborate, run, post_simulate, and get_result, that means post_fail_hook could end up being called before, during, or after post_simulate; or before or during get_result. Thus resources may already be deallocated prior to post_fail_hook().

The requirement for deallocation of resources to be idempotent is not particularly unusual, but it does impose some cognitive burden on people writing componets with a post_fail_hook(). We should be aware that adding post_fail_hook has this cost. And it's docstring should probably highlight the wide variety of states the top_type instance might be in when an exception triggers post_fail_hook.


Regarding the possibility of having a SimEnvironment.post_fail_hook(), adding hooks to SimEnvironment would be at odds with the existing paradigm for extending SimEnvironment's behavior by subclassing with overriding methods.

As an aside, I chose to provide hooks for Component because there is a recursive nature to elaborate, post_simulate, and get_result that would make it some what tricky for users of Component to place their super method calls correctly. Using hooks abstracts away that complexity of users needing to know where/when/whether to call the super methods.

SimEnvironment, on the other hand, just has a single instance, so it is not such a big deal for a subclass to override its methods. For the problem at hand, overriding SimEnvironment.run() would be a reasonable thing for a subclass to do if it wanted to handle simulation-time errors in a special manner.

I wonder if it is sufficient for most models to attach resources that need cleanup to the SimEnvironment instance and wrap SimEnvironment.run() in a try/except block in a SubEnvironment.run() method?

child.post_fail()
self.post_sim_fail_hook()

def post_sim_fail_hook(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For naming, I would advocate on_exception() and exception_hook().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, replace post_sim_fail_hook() with on_exception() and post_fail() with on_exception()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post_sim_fail_hook() --> exception_hook()
post_fail() --> on_exception()

But this is naming is a secondary issue. The big issue is whether this paradigm is actually viable/appropriate.

@coveralls
Copy link

coveralls commented Jul 17, 2017

Coverage Status

Coverage increased (+2.5%) to 95.928% when pulling 3cac13b on itzbhushan:master into ec346a6 on SanDisk-Open-Source:master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants