[engine/import] Guess references to properties between dependant resources during import #16234

Zaid-Ajaj · 2024-05-20T20:31:55Z

Description

Following up on and extending #16208: when running an import, we now guess references to properties between dependent resources based on the literal data retrieved from providers. See the added unit test, which showcases inferred dependencies when referencing deeply nested data from maps and arrays.

Checklist

I have run make tidy to update any new dependencies
I have run make lint to verify my code passes the lint check
- I have formatted my code using gofumpt

I have added tests that prove my fix is effective or that my feature works

I have run make changelog and committed the changelog/pending/<file> documenting my change

Yes, there are changes in this PR that warrant bumping the Pulumi Cloud API version

pulumi-bot · 2024-05-20T20:32:36Z

Changelog

[uncommitted] (2024-06-25)

Features

[engine] Guess references to (deeply nested) properties between dependant resources during import
#16234

Frassle · 2024-05-22T15:55:58Z

pkg/importer/language.go

 	}

 	for _, pathedValue := range pathedValues {
-		if occurrences[pathedValue.Value] > 1 {
+		if pathedValue.Identity && occurrences[pathedValue.Value] > 1 {


Why is this just being checked for Identity values? Isn't this a problem for any value? If two resources both have an output "hello" and another resource has an input "hello", you can't decide which of the two resources to draw from.

The reason for this is that if you have, for example, these pathed values computed:

resourceGroup.id = "group123"
resourceA.resourceGroup = "group123"
resourceB.resourceGroup = "group123"

Then I wanted to maintain the path resourceGroup.id even if that value is duplicated in paths resourceA.resourceGroup and resourceB.resourceGroup.

However, I think when building the pathed values, I should only be using state.Outputs instead of state.Inputs and only replace literal inputs with paths of resource outputs. Even then, resourceA.resourceGroup and resourceB.resourceGroup will occur both on state.Inputs and state.Outputs AFAIK

It would be great if I could actually test this with the CLI, but I am getting weird behaviour with pulumi import.

Maybe I can actually do a second pass where we remove duplicate paths without taking identities into account and then merge non-duped identities and non-duped values 🤔

> Maybe I can actually do a second pass where we remove duplicate paths without taking identities into account and then merge non-duped identities and non-duped values 🤔

Implemented ☝️
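The second pass described above could look roughly like the following Go sketch. `PathedValue`, `uniqueByValue`, and `guessableReferences` are hypothetical names chosen for illustration, not the actual code in pkg/importer/language.go, and letting identities win when both groups contain the same value is an assumption.

```go
package main

import "fmt"

// PathedValue is a hypothetical stand-in for a literal value and the path it was found at.
type PathedValue struct {
	Path     string
	Value    string
	Identity bool // true when the value identifies a resource (for example its ID)
}

// uniqueByValue keeps only the entries whose Value occurs exactly once in values.
func uniqueByValue(values []PathedValue) []PathedValue {
	occurrences := make(map[string]int)
	for _, v := range values {
		occurrences[v.Value]++
	}
	var unique []PathedValue
	for _, v := range values {
		if occurrences[v.Value] == 1 {
			unique = append(unique, v)
		}
	}
	return unique
}

// guessableReferences de-duplicates identities and non-identity values
// separately, then merges them into a value -> path lookup table.
func guessableReferences(values []PathedValue) map[string]string {
	var identities, others []PathedValue
	for _, v := range values {
		if v.Identity {
			identities = append(identities, v)
		} else {
			others = append(others, v)
		}
	}
	refs := make(map[string]string)
	for _, v := range uniqueByValue(others) {
		refs[v.Value] = v.Path
	}
	// Assumption: identities are written last so they take precedence.
	for _, v := range uniqueByValue(identities) {
		refs[v.Value] = v.Path
	}
	return refs
}

func main() {
	refs := guessableReferences([]PathedValue{
		{Path: "resourceGroup.id", Value: "group123", Identity: true},
		{Path: "resourceA.resourceGroup", Value: "group123"},
		{Path: "resourceB.resourceGroup", Value: "group123"},
	})
	fmt.Println(refs["group123"]) // prints resourceGroup.id
}
```

With this split, the duplicated non-identity paths on resourceA and resourceB are dropped while resourceGroup.id survives, and two resources that share the same identity value produce no guess at all.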

Also found an edge case of "overguessing" that could result in a circular reference:

const bucket = new aws.s3.Bucket("my-bucket", {
    website: {
        indexDocument: "index.html",
    },
});
const bucketObject = new aws.s3.BucketObject("index.html", {
    bucket: bucket.id
});

Implemented a fallback such that if we encounter this error, we retry generating the program without guessing references.

This has been resolved with the ancestorTypes feature where you can specify that BucketObject can guess values from outputs of Bucket but not the other way around ✅

justinvp

Some initial comments

justinvp · 2024-06-05T07:07:30Z

pkg/importer/language.go

+			//    const bucketObject = new aws.s3.BucketObject("index.html", {
+			//        bucket: bucket.id
+			//    });
+			// fallback to the old code path where we don't guess references


Is there a test case that covers this?

I will add one!

justinvp · 2024-06-05T07:10:26Z

pkg/importer/language.go

+		return pathedLiteralValues
+	}
+
+	if property.IsString() {


What about non-string values, such as IsNumber()?

Also, what about more complex values, like arrays and objects that are deeply equal? e.g. a resource has an output that's an array of strings and another resource has the same array of strings as an input?

> What about non-string values, such as IsNumber()?

I didn't account for numeric or boolean values because I am afraid of overguessing dependencies between values that have nothing to do with each other. For example, instanceSize: 2 on resource A and retryCount: 2 on resource B would infer a dependency that is not necessarily what users want. Starting with strings made the most sense since, AFAIK, all IDs are strings and references between resources use them a lot (e.g. resourceGroup.name in azure-native, vpc.id in aws, etc.).
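To make the string-only rule concrete, here is a small Go sketch that walks nested maps and arrays and records a path for every string leaf while skipping numbers and booleans. It uses plain `map[string]any` and `[]any` values rather than the engine's property types, and `collectStringPaths` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// collectStringPaths records path -> value for every string leaf reachable
// from value. Non-string leaves are deliberately ignored.
func collectStringPaths(path string, value any, out map[string]string) {
	switch v := value.(type) {
	case string:
		out[path] = v
	case map[string]any:
		for key, child := range v {
			collectStringPaths(path+"."+key, child, out)
		}
	case []any:
		for i, child := range v {
			collectStringPaths(fmt.Sprintf("%s[%d]", path, i), child, out)
		}
	default:
		// Numbers, booleans and anything else are skipped to avoid overguessing.
	}
}

func main() {
	outputs := map[string]any{
		"id":           "vpc-123",
		"instanceSize": 2, // skipped: not a string
		"tags":         map[string]any{"env": "prod"},
		"subnets":      []any{"subnet-a", "subnet-b"},
	}
	found := map[string]string{}
	collectStringPaths("resourceA", outputs, found)

	// Sort the paths so the output order is deterministic.
	paths := make([]string, 0, len(found))
	for p := range found {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s = %q\n", p, found[p])
	}
}
```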

> Also, what about more complex values, like arrays and objects that are deeply equal? e.g. a resource has an output that's an array of strings and another resource has the same array of strings as an input?

First, objects: we do not need to check these for deep equality because their elements will be references. For example, consider this data:

resourceA {
    // outputs
    config {
        first = "A"
        second = "B"
    }
}
resourceB {
    // inputs
    config {
        first = "A"
        second = "B"
    }
}

We could infer that resourceB.config = resourceA.config, but chances are these are not the same type in the SDK, so this might not work. That said, it is not needed in the first place because the current logic infers the dependency value by value:

resourceB {
    config {
        first = resourceA.config.first
        second = resourceA.config.second
    }
}

This version is better handled by program-gen implementations.

As for arrays, I am not sure. The current logic would index each element where it is referenced:

resourceA {
    // outputs
    data = ["first", "second"]
}
resourceB {
    // inputs
    data = ["first", "second"]
}

Inferred:

resourceB {
    // inputs
    data = [resourceA.data[0], resourceA.data[1]]
}

so it will work even though it looks a bit ugly
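The element-wise rewriting of the array above can be sketched in Go as follows. This assumes a lookup table from literal value to output path has already been built; the `reference` type and `replaceLiterals` function are hypothetical and only illustrate the idea.

```go
package main

import "fmt"

// reference is a hypothetical marker type for a path into another resource.
type reference string

// replaceLiterals returns a copy of value in which every string literal found
// in refs is replaced by a reference to the corresponding output path.
func replaceLiterals(value any, refs map[string]string) any {
	switch v := value.(type) {
	case string:
		if path, ok := refs[v]; ok {
			return reference(path)
		}
		return v
	case map[string]any:
		result := make(map[string]any, len(v))
		for key, child := range v {
			result[key] = replaceLiterals(child, refs)
		}
		return result
	case []any:
		result := make([]any, len(v))
		for i, child := range v {
			result[i] = replaceLiterals(child, refs)
		}
		return result
	default:
		// Non-string leaves are left untouched.
		return v
	}
}

func main() {
	// Lookup table derived from resourceA's outputs.
	refs := map[string]string{
		"first":  "resourceA.data[0]",
		"second": "resourceA.data[1]",
	}
	inputs := map[string]any{"data": []any{"first", "second"}}
	rewritten := replaceLiterals(inputs, refs).(map[string]any)
	fmt.Println(rewritten["data"]) // prints [resourceA.data[0] resourceA.data[1]]
}
```

Because each element is looked up on its own, an array that only partially overlaps with another resource's output keeps its unmatched elements as literals.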

justinvp · 2024-06-05T07:16:22Z

pkg/importer/hcl2_test.go

@@ -506,6 +506,133 @@ func TestGenerateHCL2DefinitionsDoesNotMakeSelfReferences(t *testing.T) {
 	assert.Equal(t, expectedCode, hcl2Text.String(), "Generated HCL2 code does not match expected code")
 }

+func TestGenerateHCL2DefinitionsReplacesDeeplyNestedReferencesToLiterals(t *testing.T) {


Can we have some test cases where there are multiple resources with the same output value, and therefore shouldn't be used as an input reference anywhere?

TestGenerateHCL2DefinitionsWithAmbiguousReferencesMaintainsLiteralValue asserts just that but with IDs which are treated like outputs.

Zaid-Ajaj added the area/import label May 20, 2024

Zaid-Ajaj requested a review from a team as a code owner May 20, 2024 20:31

Frassle reviewed May 22, 2024

Zaid-Ajaj force-pushed the zaid/guess-deep-property-references-import branch from c5e143b to 85930f7 Compare May 24, 2024 21:00

Zaid-Ajaj requested a review from Frassle May 24, 2024 21:03

Frassle reviewed May 29, 2024

Zaid-Ajaj force-pushed the zaid/guess-deep-property-references-import branch from 2b710a5 to b89b7e1 Compare May 29, 2024 14:49

justinvp mentioned this pull request Jun 3, 2024

[Epic] Import improvements #15938

justinvp reviewed Jun 5, 2024

Guess references to (deeply nested) properties between dependant resources during import

866654c

Zaid-Ajaj force-pushed the zaid/guess-deep-property-references-import branch from 2a915b4 to 866654c Compare June 5, 2024 15:05

Zaid-Ajaj requested a review from justinvp June 5, 2024 15:06

Zaid-Ajaj added 3 commits June 15, 2024 23:30

Implement AncestorTypes hint in import file specification

84e3e09

lint and remove redundant code

3ce03f0

ancestorTypes docs in pulumi import --help

5866e46


