Feature/dns skip wait and partial state #1052
Open
PatrickKoss wants to merge 28 commits into stackitcloud:main from PatrickKoss:feature/dns-skip-wait
+1,694
−35
Commits (28):
8712b73 add continue attribute for observability service alert config (PatrickKoss)
7ef4213 adjust the acceptance test observability (PatrickKoss)
702122b adjust docs (PatrickKoss)
56eb265 add continue in another case (PatrickKoss)
b405ce7 Merge branch 'main' into main (PatrickKoss)
188f0b7 Merge branch 'stackitcloud:main' into main (PatrickKoss)
13fdd53 remove continue attribute from root (PatrickKoss)
113bbb9 fix acc test (PatrickKoss)
e073ec2 Merge branch 'stackitcloud:main' into main (PatrickKoss)
8118f17 fix docs (PatrickKoss)
3e1a403 fix unit tests (PatrickKoss)
039719f remove route types (PatrickKoss)
6ffe516 Merge branch 'main' into main (rubenhoenle)
e74f9f8 Merge branch 'stackitcloud:main' into main (PatrickKoss)
4e99f0d Merge branch 'stackitcloud:main' into main (PatrickKoss)
55183c5 Merge branch 'stackitcloud:main' into main (PatrickKoss)
ee3a0c8 add skip wait and set partial model (PatrickKoss)
76fc503 fix linting errors (PatrickKoss)
de09817 revert formatting (PatrickKoss)
e7649c2 revert formatting (PatrickKoss)
037cece import state (PatrickKoss)
265836f downlint lint from releases + remove read id check (PatrickKoss)
ba8ecc8 Merge branch 'main' into feature/dns-skip-wait (PatrickKoss)
1196efb fix pipeline linting (PatrickKoss)
50f1f37 adjust SetModelFieldsToNull to handle complex objects and lists (PatrickKoss)
873f875 fix linting (PatrickKoss)
6e89bf9 fix linting (PatrickKoss)
b769ba1 add dns wait warn log for tf idempotency (PatrickKoss)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| #!/usr/bin/env bash | ||
| set -e | ||
| . $(dirname ${0})/utility.sh | ||
|
|
||
| BINARY_NAME=golangci-lint | ||
| INSTALL_TO=${BIN_DIR}/${BINARY_NAME} | ||
|
|
||
| install() { | ||
| echo " installing ${BINARY_NAME} ${GOLANGCI_LINT_VERSION}" | ||
|
|
||
| TYPE=windows | ||
| if [[ "${OSTYPE}" == linux* ]]; then | ||
| TYPE=linux | ||
| elif [[ "${OSTYPE}" == darwin* ]]; then | ||
| TYPE=darwin | ||
| fi | ||
|
|
||
| case $(uname -m) in | ||
| arm64|aarch64) | ||
| ARCH=arm64 | ||
| ;; | ||
| *) | ||
| ARCH=amd64 | ||
| ;; | ||
| esac | ||
|
|
||
| BASE_URL=https://github.com/golangci/golangci-lint/releases/download/v${GOLANGCI_LINT_VERSION} | ||
| URL=${BASE_URL}/golangci-lint-${GOLANGCI_LINT_VERSION}-${TYPE}-${ARCH}.tar.gz | ||
| echo " Downloading: ${URL}" | ||
| download ${URL} | tar --extract --gzip --strip-components 1 --preserve-permissions -C ${BIN_DIR} -f- | ||
|
|
||
| # Ensure the binary has the correct name | ||
| if [ -f "${BIN_DIR}/golangci-lint" ] && [ "${BIN_DIR}/golangci-lint" != "${INSTALL_TO}" ]; then | ||
| mv "${BIN_DIR}/golangci-lint" "${INSTALL_TO}" | ||
| fi | ||
| } | ||
|
|
||
| get_version() { | ||
| ${INSTALL_TO} version 2>/dev/null | awk '{print $4}' | ||
| } | ||
|
|
||
| update_if_necessary ${GOLANGCI_LINT_VERSION} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,18 +1,19 @@ | ||
| #!/usr/bin/env bash | ||
| # This script lints the SDK modules and the internal examples | ||
| # Pre-requisites: golangci-lint | ||
| # Pre-requisites: golangci-lint (provided by Makefile or system) | ||
| set -eo pipefail | ||
|
|
||
| ROOT_DIR=$(git rev-parse --show-toplevel) | ||
| GOLANG_CI_YAML_PATH="${ROOT_DIR}/golang-ci.yaml" | ||
| GOLANG_CI_ARGS="--allow-parallel-runners --timeout=5m --config=${GOLANG_CI_YAML_PATH}" | ||
|
|
||
| if type -p golangci-lint >/dev/null; then | ||
| : | ||
| else | ||
| echo "golangci-lint not installed, unable to proceed." | ||
| # Use provided golangci-lint binary or fallback to system installation | ||
| GOLANGCI_LINT_BIN="${1:-golangci-lint}" | ||
|
|
||
| if [ ! -x "${GOLANGCI_LINT_BIN}" ] && ! type -p "${GOLANGCI_LINT_BIN}" >/dev/null; then | ||
| echo "golangci-lint not found at ${GOLANGCI_LINT_BIN} and not installed in PATH, unable to proceed." | ||
| exit 1 | ||
| fi | ||
|
|
||
| cd ${ROOT_DIR} | ||
| golangci-lint run ${GOLANG_CI_ARGS} | ||
| ${GOLANGCI_LINT_BIN} run ${GOLANG_CI_ARGS} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| #!/usr/bin/env bash | ||
| # Common utility functions for tool installation scripts | ||
|
|
||
| ROOT_DIR=$(git rev-parse --show-toplevel) | ||
| BIN_DIR="${ROOT_DIR}/bin" | ||
|
|
||
| # Ensure bin directory exists | ||
| mkdir -p "${BIN_DIR}" | ||
|
|
||
| # Download function using curl | ||
| download() { | ||
| local URL=$1 | ||
| if command -v curl &> /dev/null; then | ||
| curl -sSfL "${URL}" | ||
| elif command -v wget &> /dev/null; then | ||
| wget -qO- "${URL}" | ||
| else | ||
| echo "Error: Neither curl nor wget found. Please install one of them." | ||
| exit 1 | ||
| fi | ||
| } | ||
|
|
||
| # Update tool if necessary | ||
| update_if_necessary() { | ||
| local EXPECTED_VERSION=$1 | ||
|
|
||
| if [ -x "${INSTALL_TO}" ]; then | ||
| CURRENT_VERSION=$(get_version 2>/dev/null || echo "") | ||
| if [ "${CURRENT_VERSION}" = "${EXPECTED_VERSION}" ]; then | ||
| echo " ${BINARY_NAME} ${EXPECTED_VERSION} already installed" | ||
| return 0 | ||
| else | ||
| echo " ${BINARY_NAME} version mismatch (current: ${CURRENT_VERSION}, expected: ${EXPECTED_VERSION})" | ||
| echo " updating to ${EXPECTED_VERSION}..." | ||
| fi | ||
| fi | ||
|
|
||
| install | ||
|
|
||
| INSTALLED_VERSION=$(get_version 2>/dev/null || echo "unknown") | ||
| if [ "${INSTALLED_VERSION}" = "${EXPECTED_VERSION}" ]; then | ||
| echo " ${BINARY_NAME} ${EXPECTED_VERSION} installed successfully" | ||
| else | ||
| echo " Warning: installed version (${INSTALLED_VERSION}) does not match expected version (${EXPECTED_VERSION})" | ||
| fi | ||
| } |
Sorry, but I just don't get why one would want to have this. What's the point of this?
It might be because of weird clients. Currently we only set project_id and zone_id in the state before waiting. This leads to the following error if the waiting is skipped, which some clients want:
So I have set the id as well in the helper function. I think I observed an error in the past where the client failed because some fields in the state were "unknown", but I can no longer find that error message. So currently I get:
Because of this the client wants to destroy the resource:
For some reason, if you set the fields to null instead of unknown, the client accepts it and proceeds correctly. Maybe we need to look into the topic together. If you have better ways to handle this case, feel free to suggest them :)
Just to make sure I don't mess things up here, what do you mean by client?
crossplane+upjet that then executes terraform cli commands
Terraform emits messages like this for a reason. You would break this behavior with this change.
I suspect that is a problem with complex objects, lists of lists, and lists of complex objects in the utils function SetModelFieldsToNull.
I also tried adding the same logic as in zone to iaas network and added a lot of unit tests to provoke the error, but couldn't reproduce it. You can check it here if you want.
Can you provide the input parameters so I can add unit tests for this case and verify whether it happens in the implementation or not?
Additionally, you can check in your setup whether the added functionality resolves the issue.
Or do you imply that it is perfectly fine to have errors? Because if we want to use upjet to generate a crossplane provider, we cannot accept such errors, since it simply does not work :D
Clearly no.
Well, I guess it doesn't work because you modified the code of the terraform provider and didn't understand the impacts of your changes.
I have to start from scratch here: Unknown values are a core concept of Terraform (see https://developer.hashicorp.com/terraform/plugin/framework/handling-data/terraform-concepts#unknown-values). Unknown values are important for Terraform to apply resources in the correct order, ...
But what does this mean for us? After a terraform apply run which creates a new resource, all fields of the resource must be set by the Terraform provider to a value or to null explicitly. If this isn't done for a field of the resource, you will get a message like this:
Whenever you get a message like this it's clear that this is a bug in the Terraform provider. And I'm going to go out on a limb here and say this doesn't happen for the stackit_dns_record_set resource on the main branch of our STACKIT Terraform provider repository. 😄
Let me explain why
We create the resource on API side and then use the wait handler.
terraform-provider-stackit/stackit/internal/services/dns/recordset/resource.go
Lines 215 to 235 in b5f82e7
After the wait handler we use the mapFields function to map the API response to the Terraform state model.
terraform-provider-stackit/stackit/internal/services/dns/recordset/resource.go
Lines 237 to 248 in b5f82e7
Now comes the important part: Here is the section in the mapFields function, which makes sure all fields of the resource get set to a value or null. [1]
terraform-provider-stackit/stackit/internal/services/dns/recordset/resource.go
Lines 432 to 445 in b5f82e7
Well, and after that the model struct must be persisted in the Terraform state (this doesn't happen automatically):
terraform-provider-stackit/stackit/internal/services/dns/recordset/resource.go
Lines 243 to 248 in b5f82e7
To sum it up, here's what happens in the main branch implementation of this resource:
mapFields)
Now to your changes
Now to your changes and why it's not working (without setting all fields to null using your new reflection-powered util func):
In your func (r *recordSetResource) Create(...) ... implementation, you exit the Create implementation of the Terraform resource prematurely with the code below.
The problem is: This doesn't only skip the wait handler (no. 3 above), but also the mapFields func call (no. 4 above) which (as said) explicitly sets all fields to a value or null.
Again, you just skip this. This is a core part of the resource implementation. You don't call it. That's why Terraform complains about unknown values. Terraform says this is a bug in the provider implementation, and it's correct.
But it's sadly not a bug in our implementation on the main branch, but in your implementation.
You circumvent this problem by setting all fields of the Terraform resource state model explicitly to null by using your new util func. This circumvents the problem (Terraform doesn't complain anymore about unknown values), but it doesn't really fix the problem (at least not in a clean way).
In fact setting all fields of the Terraform resource model struct to null circumvents existing checks of Terraform which we want to take advantage of during our resource implementations (at least for pure Terraform usage, without thinking of crossplane here).
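To make the argument above concrete, here is a minimal, self-contained sketch of why an early return from Create leaves unknown values behind. It does not use the Terraform plugin framework; the State, Attr and Model types, the fqdn field, and the create and unknownFields helpers are all hypothetical stand-ins invented for this illustration. Only the role of mapFields (setting every field to a value or null) is taken from the discussion.

```go
package main

import "fmt"

// State is a simplified stand-in for the three states a Terraform attribute can have.
type State int

const (
	Unknown State = iota // "known after apply": must not remain after Create
	Null                 // explicitly empty
	Known                // concrete value
)

// Attr and Model are hypothetical types for this sketch, not the provider's real model.
type Attr struct {
	State State
	Value string
}

type Model struct {
	Name, ID, FQDN Attr
}

// mapFields plays the role described above: every computed field becomes Known or Null.
func mapFields(apiID, apiFQDN string, m *Model) {
	m.ID = Attr{State: Known, Value: apiID}
	m.FQDN = Attr{State: Null}
	if apiFQDN != "" {
		m.FQDN = Attr{State: Known, Value: apiFQDN}
	}
}

// create simulates Create: API call, partial state, then (unless skipped) wait and mapFields.
func create(skipWait bool) Model {
	m := Model{Name: Attr{State: Known, Value: "www"}} // ID and FQDN start as Unknown
	m.ID = Attr{State: Known, Value: "rs-1"}           // partial state written before waiting
	if skipWait {
		return m // early exit: mapFields never runs
	}
	mapFields("rs-1", "www.example.com.", &m)
	return m
}

// unknownFields lists the attributes that are still Unknown.
func unknownFields(m Model) []string {
	fields := []struct {
		name string
		attr Attr
	}{{"name", m.Name}, {"id", m.ID}, {"fqdn", m.FQDN}}
	var out []string
	for _, f := range fields {
		if f.attr.State == Unknown {
			out = append(out, f.name)
		}
	}
	return out
}

func main() {
	fmt.Println("full create:", unknownFields(create(false)))
	fmt.Println("early exit: ", unknownFields(create(true)))
}
```

In this toy model the full path leaves no unknown fields, while the early exit leaves the computed field untouched, which corresponds to the situation Terraform reports as a provider bug.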
[1] Btw, if you forget to set one field of the Terraform resource model struct to a value or null here during the implementation of the Terraform resource, you will also get exactly the error "After the apply operation, the provider still indicated an unknown value..." from above. This is what I consider a Terraform feature. As said, unknown values are a concept of Terraform.
Thanks for the detailed explanation. It matches my observations well. I think we are actually on two sides of the same coin.
Let's take a step back and start with the requirements for the create again. Then I'll share my observations during testing and go through different alternatives.
Requirements
Code Walkthrough, Testing and Observations
We already recognized that we need to set partial states in the Terraform state. That's why the following code already exists in the main branch:
https://github.com/PatrickKoss/terraform-provider-stackit/blob/main/stackit/internal/services/dns/recordset/resource.go#L222-L229
In my tests I set up a Terraform resource. I used mariadb with the same function, because mariadb takes much longer to create while dns is super fast. So please don't be confused by the resource; we are still talking about the same code.
Then I applied, and once the wait handler started and I saw mariadb in creating state in the portal, I canceled the apply to simulate random failures as mentioned above. Then I reapplied and got the error:
stackit_mariadb_instance.example_maria_db is tainted, so must be replaced.
That's when I recognized that setting ids is not enough and we need to include the fields of the resource as well (name, plan_name, version). So I changed the code to:
And that almost worked. We also should not log an error in the wait handler, as it messes up Terraform and results in non-idempotent behaviour:
And that works perfectly fine in the case of create/cancel/reapply. Now there is no state drift and the resource stays as it is.
Now I went a step further and wrote unit tests for the behaviour, so we can really verify that it works how we think it works.
https://github.com/PatrickKoss/terraform-provider-stackit/blob/feature/cp-enhancements/stackit/internal/services/mariadb/instance/resource_create_test.go#L16-L157
The unit test covers the manual test create/cancel/read. Note that setting the partial state actually leads to null fields when reading the state again. Then I inserted utils.SetModelFieldsToNull instead of utils.SetAndLogStateFields and the test(s) were equally successful. This led me to the assumption that we are actually on two different sides of the same coin (different code but same outcome). Not setting fields in the state leads to null values, while setting them to null explicitly also results in reading out null values. So we have probably found multiple ways to solve the idempotency problem. More in the alternatives.
Second, the early exit is this code:
Note this function is only executed if an environment variable is set to "true". If the variable is not set, or is set to any other value than "true", we would continue with the wait handler. Not pretty, but we somehow need to cover the requirement since the tool works as it works.
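The gate described here could look roughly like the sketch below. The variable name EXAMPLE_SKIP_WAIT and both function names are assumptions made up for this illustration, since the actual name used in the PR is not shown in this thread. The only behaviour taken from the description is that the exact value "true" enables the early exit and everything else falls through to the wait handler.

```go
package main

import (
	"fmt"
	"os"
)

// skipWaitEnv is a hypothetical variable name chosen for this sketch.
const skipWaitEnv = "EXAMPLE_SKIP_WAIT"

// isSkipValue reports whether a raw environment value enables the early exit.
// Only the exact string "true" counts; unset, empty, "1" or "TRUE" do not.
func isSkipValue(v string) bool {
	return v == "true"
}

// skipWait reads the environment and applies the strict comparison above.
func skipWait() bool {
	return isSkipValue(os.Getenv(skipWaitEnv))
}

func main() {
	if skipWait() {
		fmt.Println("skipping wait handler, returning after partial state")
		return
	}
	fmt.Println("continuing with wait handler")
}
```

Keeping the comparison in a pure helper such as isSkipValue makes the opt-in rule easy to unit test without touching the process environment.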
Alternatives/Conclusion
I think there is no real discussion about the early return but if there is feel free to suggest something.
The more interesting point is the idempotency part.
utils.SetModelFieldsToNull we can iterate over the model's attributes with reflection magic, check for non-null/unknown fields, and use the tags (tfsdk) of the model as keys for the map and the values of the model's attributes as values. This should result in the map we want to store as partial state in the Terraform state.
So, what do you think? Do you have other testing experiences? Which direction should we go?
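The reflection idea mentioned above can be sketched with the standard library alone. This is only an illustration under simplifying assumptions: recordSetModel and partialState are hypothetical names, and nil pointers stand in for null or unknown values, whereas a real model would use the plugin framework's value types and would need its own null/unknown checks.

```go
package main

import (
	"fmt"
	"reflect"
)

// recordSetModel is a hypothetical model; nil pointers stand in for null/unknown values.
type recordSetModel struct {
	ProjectID *string `tfsdk:"project_id"`
	ZoneID    *string `tfsdk:"zone_id"`
	Name      *string `tfsdk:"name"`
	Comment   *string `tfsdk:"comment"`
}

// partialState collects every set (non-nil) pointer field, keyed by its tfsdk tag.
func partialState(model any) map[string]any {
	out := map[string]any{}
	v := reflect.ValueOf(model)
	if v.Kind() == reflect.Ptr {
		v = v.Elem()
	}
	if v.Kind() != reflect.Struct {
		return out // nil pointers and non-struct inputs yield an empty map
	}
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		field := t.Field(i)
		tag := field.Tag.Get("tfsdk")
		value := v.Field(i)
		// Skip unexported fields, untagged fields, and anything that is not a set pointer.
		if field.PkgPath != "" || tag == "" || tag == "-" {
			continue
		}
		if value.Kind() != reflect.Ptr || value.IsNil() {
			continue
		}
		out[tag] = value.Elem().Interface()
	}
	return out
}

func main() {
	projectID, name := "my-project", "www"
	state := partialState(recordSetModel{ProjectID: &projectID, Name: &name})
	fmt.Println(len(state), "fields would be written as partial state:", state)
}
```

Compared with listing the fields by hand, this keeps the partial state in sync with the model struct, at the cost of the reflection complexity that was questioned earlier in the thread.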