Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stale resource due to terraform provider upgrade #332

Open
mihdih opened this issue Nov 21, 2023 · 4 comments
Open

Stale resource due to terraform provider upgrade #332

mihdih opened this issue Nov 21, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@mihdih
Copy link

mihdih commented Nov 21, 2023

Hello!

First off, for context:

We've been using terraform in adding the ACLs to our Confluent cloud cluster. This has been working well until just recently when we hit some frequent rate limiting issues, similar to the one described in issue 148. Up until then, we were using the confluentinc/confluent provider version 1.36.0. As suggested, to resolve the issue we upgraded the provider to version 1.47.0 or higher. And so we did, we upgraded to latest version which is 1.55.0

The issue:

After the upgrade, it seems that records or resources that were added using the old version are not removed properly. I could perform some more tests, but in particular, the resource below are having this problem.

# Added using confluentinc/confluent version 1.36.0

resource "confluent_kafka_acl" "acl" {
  kafka_cluster {
    id = confluent_kafka_cluster.dedicated-cluster.id
  }

  # Set to '*' as this is for Confluent Cloud.
  host = "*"

  resource_type = "TOPIC"
  resource_name = "*"
  pattern_type  = "LITERAL"
  principal     = "User:<service_account_id>"
  operation     = "ALL"
  permission    = "ALLOW"
  rest_endpoint = confluent_kafka_cluster.dedicated-cluster.rest_endpoint

  credentials {
    key    = confluent_api_key.app-manager-kafka-api-key.id
    secret = confluent_api_key.app-manager-kafka-api-key.secret
  }

  depends_on = [confluent_api_key.app-manager-kafka-api-key]
}
# resource_name is changed in confluentinc/confluent version 1.55.0 to 'foobar' from '*'

resource "confluent_kafka_acl" "acl" {
  kafka_cluster {
    id = confluent_kafka_cluster.dedicated-cluster.id
  }

  # Set to '*' as this is for Confluent Cloud.
  host = "*"

  resource_type = "TOPIC"
  resource_name = "foobar"
  pattern_type  = "LITERAL"
  principal     = "User:<service_account_id>"
  operation     = "ALL"
  permission    = "ALLOW"
  rest_endpoint = confluent_kafka_cluster.dedicated-cluster.rest_endpoint

  credentials {
    key    = confluent_api_key.app-manager-kafka-api-key.id
    secret = confluent_api_key.app-manager-kafka-api-key.secret
  }

  depends_on = [confluent_api_key.app-manager-kafka-api-key]
}

Result

The ACL with resource_name foobar has been added but the old one (resource_name *) is still present, despite the terraform stating that it had been destroyed.

Am I missing something obvious here? Should I not upgrade straight to latest version right away after 1.36.0?

@linouk23
Copy link
Collaborator

linouk23 commented Nov 21, 2023

Thanks for creating this issue @mihdih!

Should I not upgrade straight to latest version right away after 1.36.0?

That definitely should be fine.

We didn't change much on client but there was a non-breaking API update.

Just to confirm my understanding, are you saying that after updating the value for resource_name, Terraform attempted to recreate the resource (by deleting the old one and creating a new one), and while the creation was successful, the deletion was a no-op (Terraform did not show any errors, but the ACL was not removed from the backend)?

@mihdih
Copy link
Author

mihdih commented Nov 21, 2023

Thank you for a swift response @linouk23 .

Just to confirm my understanding, are you saying that after updating the value for resource_name, Terraform attempted to recreate the resource (by deleting the old one and creating a new one), and while the creation was successful, the deletion was a no-op (Terraform did not show any errors, but the ACL was not removed from the backend)?

That is correct. The delete + create is expected. There was no error upon the deletion, in fact terraform reported back that it was destroyed. I can confirm there after that the ACL (old) still exists. I have checked in both Confluent Cloud UI and via CLI.

@mihdih
Copy link
Author

mihdih commented Nov 22, 2023

Hi @linouk23

I did a little bit of investigation and looks like the problem is when passing the User or principal.

In the old version the account id (integer) is used, while in the new one it's the Service Account resource Id. Please check the sample curl queries below.

Working (replicating the call used in version 1.36.0)

curl -u {api_key} --request DELETE --url https://{confluent_kafka_cluster.dedicated-cluster.rest_endpoint}/kafka/v3/clusters/{confluent_kafka_cluster.dedicated-cluster.id}/acls?host=%2A&operation=ALL&pattern_type=LITERAL&permission=ALLOW&principal=User%3A{service_account_id}resource_name=%2A&resource_type=TOPIC

# Where service_account_id is something like 1978961

NOT Working (replicating the call used in version 1.55.0)

curl -u {api_key} --request DELETE --url https://{confluent_kafka_cluster.dedicated-cluster.rest_endpoint}/kafka/v3/clusters/{confluent_kafka_cluster.dedicated-cluster.id}/acls?host=%2A&operation=ALL&pattern_type=LITERAL&permission=ALLOW&principal=User%3A{service_account_resource_id}resource_name=%2A&resource_type=TOPIC

# Where service_account_resource_id is something like sa-xxxxx

I appreciate that the change is deliberate basing on this commit change, which makes me wonder if the bug is in the REST Api?

Another theory is perhaps the new api call won't work on old provisioned clusters?

@linouk23 linouk23 added the bug Something isn't working label Nov 23, 2023
@linouk23
Copy link
Collaborator

Thank you for performing this investigation!

That was my thinking as well and I’ll reach out to the backend team 🫡

Lucatronlk pushed a commit to Lucatronlk/terraform-provider-confluent that referenced this issue Dec 28, 2023
* Scaffolding

* Implement Read

* Implement Create

* Implement Delete

* Refactor Read to support Import

* Implement Import

* Fix typos

* Tests scaffolding

* Azure tests

* Add force new

* Drop Required for ID

* Set Id as computed

* Drop ValidateFunc for ID

* Enforce Required for key

* Add Optional argument to AWS and Azure keys

* Add Elem for roles set

* Add key block to byok resource

* Add resource name

* Remove reference to bucket service constants

* Formatting

* Update azure parameter names

* Disable BYOK test

* Add Azure tests

* Initialise BYOK client

* Address issues regarding set conversion

* Fix azure tests

* Enforce ForceNew

* terraform fix test

* Add AWS tests and Refactor Azure ones

* Address PR comments

* Remove ForceNew on AWS roles

* Add reference to external sdk

* Revert module version updates

* Use BYOK auth

* add data source

* Create data_source_byok_key_azure_test.go

* Address PR comments

* Address PR comments

* added docs

* Update resource_byok_key.go

* Update data_source_byok_key.go

* Update confluent_byok_key.md

* update sdk to the publish one

* Update go.sum

---------

Co-authored-by: Sarguru Mohan <[email protected]>
Co-authored-by: Zhen Li <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants