Wrong JSON output from SchemaErrors.message #1669

Open
2 of 3 tasks
Twomasz opened this issue Jun 1, 2024 · 0 comments
Labels
bug Something isn't working

Twomasz commented Jun 1, 2024

Description

The column field does not show the column name itself. Instead, it duplicates the schema name.

So far I have only seen this problem in errors under the SCHEMA key of the JSON output message. All errors under the DATA key display the column names correctly.

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandera.
  • (optional) I have confirmed this bug exists on the main branch of pandera.

Code Sample, a copy-pastable example

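import pandas as pd
import pandera as pa

# One column is declared in the schema, the other is not.
df = pd.DataFrame({
    'defined_in_schema': [1, 2],
    'undefined_in_schema': [3, 4],
})

# strict=True makes any undeclared column a schema-level error.
schema = pa.DataFrameSchema(
    columns={
        "defined_in_schema": pa.Column(dtype="int64", name="defined_in_schema"),
    },
    name='MySchemaName',
    strict=True,
)

try:
    # lazy=True collects all errors and raises them together as SchemaErrors.
    df_val = schema.validate(df, lazy=True)
except pa.errors.SchemaErrors as exc:
    print(exc)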

Output in the console

{
  "SCHEMA": {
    "COLUMN_NOT_IN_SCHEMA": [
      {
        "schema": "MySchemaName",
        "column": "MySchemaName",
        "check": "column_in_schema",
        "error": "column 'undefined_in_schema' not in DataFrameSchema {'defined_in_schema': <Schema Column(name=defined_in_schema, type=DataType(int64))>}"
      }
    ]
  }
}

Expected behavior

The output of this code should contain "column": "undefined_in_schema" instead of "column": "MySchemaName".
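
Until the field is corrected, a possible workaround is to recover the column name from the error text. This is only a minimal sketch, and it assumes the message always has the shape shown in the console output above ("column '<name>' not in ..."). The helper `column_from_error` is hypothetical and not part of pandera.

```python
import re

# Error entry copied from the console output above.
entry = {
    "schema": "MySchemaName",
    "column": "MySchemaName",
    "check": "column_in_schema",
    "error": (
        "column 'undefined_in_schema' not in DataFrameSchema "
        "{'defined_in_schema': <Schema Column(name=defined_in_schema, type=DataType(int64))>}"
    ),
}

# Assumption: the error text starts with "column '<name>' not in ".
_COLUMN_PATTERN = re.compile(r"^column '(?P<name>.+?)' not in ")


def column_from_error(error_entry):
    """Hypothetical helper: return the column named in the error text,
    or None if the text does not have the assumed shape."""
    match = _COLUMN_PATTERN.match(error_entry["error"])
    return match.group("name") if match else None


print(column_from_error(entry))
```

Parsing the message is fragile because it depends on the exact wording of the error string, so it is a stopgap rather than a replacement for a correct "column" field.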

Desktop (please complete the following information):

  • OS: Windows 10
  • Version of Pandera: 0.19.3
  • Version of Python: 3.12.3
@Twomasz Twomasz added the bug Something isn't working label Jun 1, 2024