@link2xt link2xt commented Jun 23, 2019

Otherwise, value_iteration is allowed to return a correct value function but an incorrect policy.

link2xt commented Jun 23, 2019

Note that I evaluate the returned policy and compare its value to the expected value function, rather than comparing the policy's actions directly, because there are multiple optimal policies.
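
To illustrate the idea, here is a minimal sketch of such a check on a toy MDP. It is not the test from this pull request: the MDP, `TRANSITIONS`, `GAMMA`, `evaluate_policy`, `policy_is_optimal`, and this stand-in `value_iteration` are all hypothetical names chosen for the example. In state 0 both actions are equally good, so comparing actions against one fixed expected policy would reject a valid answer, while comparing values does not.

```python
# Hypothetical toy MDP: TRANSITIONS[s][a] = list of (prob, next_state, reward).
# State 2 is absorbing; actions 0 and 1 in state 0 are equally good.
TRANSITIONS = {
    0: {0: [(1.0, 1, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 2, 1.0)], 1: [(1.0, 0, 0.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},
}
GAMMA = 0.9


def q_value(values, state, action):
    """Expected one-step return of taking `action` in `state`."""
    return sum(p * (r + GAMMA * values[s2])
               for p, s2, r in TRANSITIONS[state][action])


def value_iteration(tol=1e-10):
    """Return (values, policy); a stand-in for the function under test."""
    values = {s: 0.0 for s in TRANSITIONS}
    while True:
        new_values = {s: max(q_value(values, s, a) for a in TRANSITIONS[s])
                      for s in TRANSITIONS}
        delta = max(abs(new_values[s] - values[s]) for s in TRANSITIONS)
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(TRANSITIONS[s], key=lambda a: q_value(values, s, a))
              for s in TRANSITIONS}
    return values, policy


def evaluate_policy(policy, tol=1e-10):
    """Iteratively compute the value function of a fixed policy."""
    values = {s: 0.0 for s in TRANSITIONS}
    while True:
        new_values = {s: q_value(values, s, policy[s]) for s in TRANSITIONS}
        delta = max(abs(new_values[s] - values[s]) for s in TRANSITIONS)
        values = new_values
        if delta < tol:
            return values


def policy_is_optimal(values, policy, atol=1e-6):
    """Check the policy by its value, not by its action choices."""
    policy_values = evaluate_policy(policy)
    return all(abs(policy_values[s] - values[s]) <= atol for s in TRANSITIONS)
```

With this check, any policy that picks action 0 in state 1 passes regardless of what it does in states 0 and 2, while a policy that picks action 1 in state 1 fails even if the value function returned alongside it is correct.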
