Conversation

@nikos912000
Contributor

@nikos912000 nikos912000 commented Mar 17, 2022

What does this PR do?

  • Adds new functionality
  • Alters existing functionality
  • Fixes a bug
  • Improves documentation or testing

Makes the end-to-end tests more generic so they can be executed in non-minikube clusters.

Code Quality Checklist

  • The documentation is up to date.
  • My code is sufficiently commented and passes continuous integration checks.
  • I have signed my commit (see Contributing Docs).

Testing

  • I leveraged continuous integration testing
    • by depending on existing unit tests or end-to-end tests.

@nikos912000 nikos912000 requested a review from a team as a code owner March 17, 2022 21:59

instanceKey = types.NamespacedName{Name: "foo", Namespace: "default"}
// wait for the cache to sync
time.Sleep(10 * time.Second)
Contributor

What cache are we talking about here?

Contributor Author

I was running these tests against a remote cluster and was getting an error message along the lines of:

the cache is not started, can not read objects

Here we create the Kubernetes client. This will watch for and cache objects. I'm assuming this takes some time in remote clusters.

I can drop this tbh as it is not an issue on minikube and may have to do with our clusters. But it took me a while to debug!

Contributor

Do you recall which test was producing this error? I would prefer to have a check in an Eventually clause somewhere in the test setup, so that we do not blindly wait for 10s but start as soon as we can read objects (if that is possible, of course).
We will soon run those tests against both minikube and remote clusters, so we'll likely face the same issue.
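
A minimal sketch of the "poll instead of sleep" idea above, using only the Go standard library rather than Gomega's Eventually. The names fakeReader, errCacheNotStarted, and pollUntil are hypothetical stand-ins and do not come from the PR. In the real suite the check would presumably be a List call on the Kubernetes client.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errCacheNotStarted is a hypothetical stand-in for the error quoted in the thread.
var errCacheNotStarted = errors.New("the cache is not started, can not read objects")

// fakeReader is a hypothetical client whose first callsUntilReady calls fail,
// simulating a cache that needs some time before it can serve reads.
type fakeReader struct {
	callsUntilReady int
	calls           int
}

// List fails until the simulated cache is ready, then succeeds.
func (r *fakeReader) List() error {
	r.calls++
	if r.calls <= r.callsUntilReady {
		return errCacheNotStarted
	}
	return nil
}

// pollUntil calls check every interval until it returns nil. It gives up once
// another sleep would pass the timeout and returns the last error, wrapped.
func pollUntil(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("timed out after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	reader := &fakeReader{callsUntilReady: 3}
	// Proceed as soon as a read succeeds instead of sleeping a fixed 10s.
	if err := pollUntil(2*time.Second, 10*time.Millisecond, reader.List); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Printf("cache readable after %d attempts\n", reader.calls)
}
```

The wait ends as soon as the first read succeeds, so a fast local cluster pays almost nothing, while a slow remote cluster gets up to the full timeout.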

Contributor Author

I can't remember, sorry... It was at the very beginning though, either as part of the setup or when deploying the first CR.
I implemented this as a temporary workaround, but I imagine we could use the WaitForCacheSync method.
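
For illustration, the contract behind a WaitForCacheSync-style call can be modelled with the standard library alone: block until the cache signals its initial sync, or give up when the context ends. This is only a sketch. fakeCache, markSynced, waitForSync, and waitForSyncWithTimeout are hypothetical names, and the code does not import controller-runtime. It assumes the real method has the same shape, taking a context and returning a bool.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// fakeCache is a hypothetical stand-in for an informer-backed cache.
type fakeCache struct {
	synced chan struct{} // closed once the initial sync has completed
	once   sync.Once
}

func newFakeCache() *fakeCache {
	return &fakeCache{synced: make(chan struct{})}
}

// markSynced signals that the initial sync is done. Safe to call repeatedly.
func (c *fakeCache) markSynced() {
	c.once.Do(func() { close(c.synced) })
}

// waitForSync blocks until the cache has synced or ctx is done.
// It reports whether the cache synced.
func (c *fakeCache) waitForSync(ctx context.Context) bool {
	select {
	case <-c.synced:
		return true
	case <-ctx.Done():
		return false
	}
}

// waitForSyncWithTimeout bounds the wait with a deadline.
func waitForSyncWithTimeout(c *fakeCache, timeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return c.waitForSync(ctx)
}

func main() {
	c := newFakeCache()
	// Simulate a remote cluster where the initial sync takes a moment.
	go func() {
		time.Sleep(50 * time.Millisecond)
		c.markSynced()
	}()
	if !waitForSyncWithTimeout(c, 2*time.Second) {
		fmt.Println("cache did not sync in time")
		return
	}
	fmt.Println("cache synced, safe to list objects")
}
```

Unlike a fixed sleep, this returns the moment the sync signal arrives and fails explicitly if it never does.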

Contributor Author

Ok, I've reproduced this.

This is because we have more logic in our tests: we run tests for network Availability Zone failures. For this, we need to check which Node/Availability Zone the controller is deployed in so that we do not target it. We achieve this through affinity rules on the test Pods.

The error message is the following:

Unexpected error:
      <*fmt.wrapError | 0xc0008ec540>: {
          msg: "can't list controller pods: the cache is not started, can not read objects",
          err: <*cache.ErrCacheNotStarted | 0x323dbb0>{},
      }
      can't list controller pods: the cache is not started, can not read objects
  occurred

I have a feeling this was happening even before adding that logic, though. Ultimately, any call that lists resources needs the cache to be ready first.

I've pushed an update to the PR. I hope that makes more sense.

@nikos912000
Contributor Author

@Devatoria do you think these changes would be valuable? I'm happy to close it if not.

@ptnapoleon ptnapoleon requested a review from a team May 13, 2022 14:03
3 participants