Custom Check Methods don't support custom error (any more) #1652

Open
2 of 3 tasks
harryjordancma opened this issue May 21, 2024 · 1 comment
Labels
bug Something isn't working

Comments


harryjordancma commented May 21, 2024

Describe the bug
I'm trying to pass a custom error message to a custom check, but I get a TypeError saying that the keyword argument 'error' received multiple values. This was possible before the change in #1574.

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandera.
  • (optional) I have confirmed this bug exists on the main branch of pandera.

Code Sample, a copy-pastable example

import pandera as pa
import pandera.extensions as extensions
import pandas as pd



@extensions.register_check_method(statistics=["min_value", "max_value"])
def is_between(pandas_obj, *, min_value, max_value):
    return (min_value <= pandas_obj) & (pandas_obj <= max_value)


schema = pa.DataFrameSchema({
    "col": pa.Column(int, pa.Check.is_between(min_value=1, max_value=10, error="Value not in between", raise_warning=True))
})

data = pd.DataFrame({"col": [1, 5, 11]})
schema.validate(data)

This produces the following error message:

/home/ubuntu/miniconda/envs/py311/lib/python3.11/site-packages/pandera/backends/pandas/base.py:140: SchemaWarning: Column 'col' failed element-wise validator number 0: This is small. failure cases: 11
  warnings.warn(
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 14
      7 schema = pa.DataFrameSchema({
      8     "col": pa.Column(int, pa.Check.less_than(10, error="This is small.", raise_warning = True))
      9 })
     11 schema.validate(data)
     13 schema2 = pa.DataFrameSchema({
---> 14     "col": pa.Column(int, pa.Check.is_between(min_value=1, max_value=10, error="This is small, but positive.", raise_warning = True))
     15 })
     18 schema2.validate(data)

File ~/miniconda/envs/py311/lib/python3.11/site-packages/pandera/api/extensions.py:130, in register_check_statistics.<locals>.register_check_statistics_decorator.<locals>._wrapper(cls, *args, **kwargs)
    128     arg_names = statistics_args
    129 args_dict = {**dict(zip(arg_names, args)), **kwargs}
--> 130 check = class_method(cls, *args, **kwargs)
    131 check.statistics = {
    132     stat: args_dict.get(stat) for stat in statistics_args
    133 }
    134 check.statistics_args = statistics_args

File ~/miniconda/envs/py311/lib/python3.11/site-packages/pandera/api/extensions.py:294, in register_check_method.<locals>.register_check_wrapper.<locals>.check_method(cls, *args, **kwargs)
    291 error_stats = ", ".join(f"{k}={v}" for k, v in stats.items())
    292 error = f"{check_fn.__name__}({error_stats})" if stats else None
--> 294 return cls(
    295     partial(check_fn_wrapper, **stats),
    296     name=check_fn.__name__,
    297     error=error,
    298     **validate_check_kwargs(check_kwargs),
    299 )

TypeError: pandera.api.checks.Check() got multiple values for keyword argument 'error'
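
The TypeError above is ordinary Python behavior when the same keyword is supplied both explicitly and through an unpacked dict. The following is a minimal sketch that mirrors the shape of the failing call without using pandera; FakeCheck and build_check are hypothetical stand-ins, not pandera code.

```python
class FakeCheck:
    """Hypothetical stand-in for a class that accepts an `error` keyword."""

    def __init__(self, fn, name=None, error=None, **kwargs):
        self.fn = fn
        self.name = name
        self.error = error


def build_check(**check_kwargs):
    # Mirrors the shape of the failing call in the traceback: `error` is
    # passed explicitly and may also arrive through the unpacked kwargs.
    computed_error = "is_between(min_value=1, max_value=10)"
    return FakeCheck(
        lambda x: x,
        name="is_between",
        error=computed_error,
        **check_kwargs,
    )


# Without a user-supplied `error`, the call succeeds.
ok_check = build_check(raise_warning=True)

# With a user-supplied `error`, the keyword is given twice and Python raises.
message = None
try:
    build_check(error="Value not in between", raise_warning=True)
except TypeError as exc:
    message = str(exc)
    print(message)
```

The collision happens at call time, before the constructor body runs, so nothing inside the class can resolve it; the caller has to avoid passing the keyword twice.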

Expected behavior

I expected the check to report the custom error message, as this is the standard behavior of the built-in checks: https://pandera.readthedocs.io/en/stable/reference/generated/pandera.api.checks.Check.html

I'd really like custom checks to behave the same way, since custom messages make failed checks easier to understand.

Desktop (please complete the following information):

  • OS: Ubuntu
  • Browser: N/A
  • Version: 0.19.3

verwindle commented Jun 16, 2024

Reproduced the bug on every version since 0.19.0b2. The example works on 0.19.0b1 and earlier without any issues.
Note that 0.19.0b2 requires pyspark to be installed.

The explicit error=error argument at line 297 of pandera/api/extensions.py is what leads to this error: the keyword argument error overlaps with the contents of check_kwargs. The error value is computed inside check_method as if it were hardcoded. As I see it, the computed value should only be used as a fallback when error is not present in kwargs. Otherwise, it can't be changed.
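
The fallback described above could look roughly like the sketch below. It is not the pandera patch; merge_check_kwargs is a hypothetical helper that only illustrates the idea of letting a user-supplied error take precedence over the computed default.

```python
def merge_check_kwargs(computed_error, check_kwargs):
    """Return kwargs in which a user-supplied `error` wins over the computed one.

    Hypothetical helper for illustration; it is not part of pandera.
    """
    merged = dict(check_kwargs)  # copy so the caller's dict is not mutated
    # Fall back to the computed message only when the caller gave no `error`.
    merged.setdefault("error", computed_error)
    return merged


computed = "is_between(min_value=1, max_value=10)"

# No custom message: the computed default is used.
default_kwargs = merge_check_kwargs(computed, {"raise_warning": True})

# Custom message: it is kept, and `error` appears only once in the result.
custom_kwargs = merge_check_kwargs(
    computed, {"error": "Value not in between", "raise_warning": True}
)

print(default_kwargs["error"])
print(custom_kwargs["error"])
```

Because the merged dict holds a single error entry, it can be unpacked into the constructor call without passing error a second time. One design choice to be aware of: with setdefault, an explicit error=None from the caller is preserved rather than replaced by the computed message.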
