Wrong Merge Key Not Throwing Error #1463

Open
zem360 opened this issue Jun 14, 2024 · 1 comment · May be fixed by #1504
Labels
bug Something isn't working

zem360 commented Jun 14, 2024

dlt version

0.4.12

Describe the problem

While working on a community support request about snake_case and camel_case naming, @dat-a-man discovered this bug.

A typo in merge_key, or a merge_key that names a column not present in the data, does not raise an error.

Expected behavior

If there is a typo in the merge_key, or the provided key is not in the data, an error should be thrown. Instead, the code runs normally.

In the code snippet under Steps to reproduce, merge_key = 'mana' should throw an error because 'mana' is not present in the data, but it does not.

Steps to reproduce

Run the following code:

import dlt

# "mana" is not a key in the yielded rows, yet no error is raised
@dlt.resource(name="table_name", merge_key="mana", write_disposition={"disposition": "merge"})
def func():
    data = [
        {"id": 1, "NAme": "abcaaaa", "status": "bronze"},
        {"id": 2, "NAme": "deaf", "status": "bronze"},
    ]
    yield data

Operating system

macOS

Runtime environment

Local

Python version

3.10

dlt data source

No response

dlt destination

No response

Other deployment details

No response

Additional information

No response

zem360 added the bug label Jun 14, 2024
zem360 changed the title from "Wrong Merge Key not Throwing Error" to "Wrong Merge Key Not Throwing Error" Jun 14, 2024
rudolfix (Collaborator) commented

The reason is that we use merge_key only during loading, so no checks are done before that. The same thing happens with any other hint, including primary_key (unless it is part of incremental, which actually checks the data).

To really fix this issue (before loading starts), we'd need to track which columns received data; currently we track this only at the table level. Maybe we can do that in a separate ticket.
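
As a rough illustration of what column-level tracking could mean, here is a plain-Python sketch. It is not dlt code: `ColumnTracker` and its methods are hypothetical names, and the sketch ignores identifier normalization by the naming convention. It records every key that carried a non-None value per table and reports hinted columns that never showed up.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Set


class ColumnTracker:
    """Hypothetical helper: remembers which columns received data, per table."""

    def __init__(self) -> None:
        self._seen: Dict[str, Set[str]] = defaultdict(set)

    def observe(self, table: str, rows: Iterable[Dict[str, Any]]) -> None:
        for row in rows:
            # Only count keys that carry an actual value.
            self._seen[table].update(k for k, v in row.items() if v is not None)

    def missing(self, table: str, hinted_columns: Iterable[str]) -> List[str]:
        """Return hinted columns that never received data, in the given order."""
        seen = self._seen.get(table, set())
        return [c for c in hinted_columns if c not in seen]


tracker = ColumnTracker()
tracker.observe(
    "table_name",
    [
        {"id": 1, "NAme": "abcaaaa", "status": "bronze"},
        {"id": 2, "NAme": "deaf", "status": "bronze"},
    ],
)
# The merge_key from the report is not among the observed columns.
missing_hints = tracker.missing("table_name", ["mana"])
```

With the data from the report, `missing_hints` contains `"mana"`, which is the condition a pre-load check would need to detect.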

what we can do now:

  • there's a warning somewhere in the code that tells the user when columns have no data type (which means they have never seen data). Maybe it is gone? Maybe we should warn the user at the end of the normalization stage?
  • let's verify why the code runs normally. Merge keys are NOT NULL, so the columns are probably skipped when syncing to the database. How to fix: escalate the warning to an exception for all incomplete columns that are NOT NULL, so that we fail at normalize.
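
The second suggestion could look roughly like the following. This is a sketch under assumptions, not dlt's implementation: columns are modeled as plain dicts with `name`, `nullable`, and an optional `data_type`; a column without `data_type` is treated as incomplete (it never saw data); `IncompleteColumnError` and `check_incomplete_columns` are hypothetical names.

```python
from typing import Any, Dict, List


class IncompleteColumnError(Exception):
    """Hypothetical exception for NOT NULL columns that never received data."""


def check_incomplete_columns(table: str, columns: List[Dict[str, Any]]) -> List[str]:
    """Raise for incomplete NOT NULL columns; return nullable incomplete ones."""
    incomplete = [c for c in columns if c.get("data_type") is None]
    not_null = [c["name"] for c in incomplete if not c.get("nullable", True)]
    if not_null:
        raise IncompleteColumnError(
            f"Table {table!r}: NOT NULL column(s) {not_null} never received data"
        )
    # Nullable incomplete columns only deserve a warning.
    return [c["name"] for c in incomplete]


columns = [
    {"name": "id", "data_type": "bigint", "nullable": True},
    {"name": "status", "data_type": "text", "nullable": True},
    # merge_key hint from the report: declared NOT NULL, absent from the data
    {"name": "mana", "nullable": False, "merge_key": True},
]

try:
    check_incomplete_columns("table_name", columns)
    failed = False
except IncompleteColumnError as exc:
    failed = True
    message = str(exc)
```

Running such a check at the end of normalization would make the example from the report fail before any load job starts, while nullable columns without data would still only produce a warning.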

Projects
Status: Planned
3 participants