fix: OnDemandFeatureView type inference for array types #4310

alexmirrington · 2024-06-24T09:41:15Z

What and Why

OnDemandFeatureView.feature_transformation.infer_features should be able to infer features from Python types for all supported Feast data types, for all transformation backends. This PR passes the UDF output values to python_type_to_feast_value_type so that list types can be inferred correctly.
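
To illustrate why the value matters, here is a minimal sketch that does not use Feast at all. The lookup table and the `infer_kind` helper are hypothetical stand-ins, not Feast's actual implementation. They only show that the type name "list" alone cannot tell a list of floats from a list of strings, so the value has to be inspected.

```python
from typing import Any, Optional

# Hypothetical stand-in for a name-based type lookup; not Feast's actual table.
SCALAR_KINDS = {"bool": "BOOL", "int": "INT64", "float": "DOUBLE", "str": "STRING"}


def infer_kind(type_name: str, value: Optional[Any] = None) -> str:
    """Infer a type label from a type name, using the value for lists."""
    if type_name in SCALAR_KINDS:
        return SCALAR_KINDS[type_name]
    if type_name == "list":
        # The name "list" says nothing about the element type, so the
        # value itself is needed to look inside the list.
        if not value:
            raise ValueError("cannot infer a list type without a non-empty value")
        element_name = type(value[0]).__name__
        return SCALAR_KINDS[element_name] + "_LIST"
    raise ValueError(f"unsupported type name: {type_name}")


sample = [1.5, 2.5]
print(infer_kind(type(sample).__name__, value=sample))  # DOUBLE_LIST
```

Calling `infer_kind("list")` without a value raises a ValueError in this sketch, which mirrors the situation before this change: the type name was available but the value was not.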

Tests

Unit tests passing in CI for all Python versions on my fork.
Tested against a local project by patching PandasTransformation.infer_features, e.g.

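from typing import Any

import pandas as pd

from feast import Field
from feast.transformation.pandas_transformation import PandasTransformation
from feast.type_map import python_type_to_feast_value_type
from feast.types import from_value_type


def __patched_infer_features(
    self, random_input: dict[str, list[Any]]
) -> list[Field]:
    df = pd.DataFrame.from_dict(random_input)
    output_df: pd.DataFrame = self.transform(df)
    return [
        Field(
            name=f,
            dtype=from_value_type(
                python_type_to_feast_value_type(
                    # Pass the first output value so list element types can be inferred.
                    f, value=output_df[f].tolist()[0], type_name=str(dt)
                )
            ),
        )
        for f, dt in zip(output_df.columns, output_df.dtypes)
    ]


# Monkeypatch the transformation for local testing.
PandasTransformation.infer_features = __patched_infer_features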

Note to reviewers
If you can point me to where I could add some feature repository tests to catch any regressions in the future that would be great 🙏

Fixes

Fixes #4308

tokoko · 2024-06-24T13:26:47Z

I think the best way for this to be tested would be in respective unit tests: test_on_demand_pandas_transformation.py, test_on_demand_python_transformation.py and test_substrait_transformation.py

Signed-off-by: Alex Mirrington <[email protected]>

alexmirrington · 2024-06-25T06:19:11Z

@tokoko Added some tests for lists of bools, strings, floats and ints for both python and pandas ODFVs, as well as a request source in there. LMK what you think!

HaoXuAI · 2024-06-26T22:18:58Z

sdk/python/feast/type_map.py

@@ -155,6 +155,7 @@ def python_type_to_feast_value_type(
        "uint16": ValueType.INT32,
        "uint8": ValueType.INT32,
        "int8": ValueType.INT32,
+        "bool_": ValueType.BOOL,


What is this type, bool_?

It's a numpy bool type, but harder to identify since we use type(value).__name__, e.g.

>>> import pandas as pd
>>> data = pd.DataFrame({"a": [True, False]})
>>> type(data["a"].iloc[0])
<class 'numpy.bool_'>
>>> type(data["a"].iloc[0]).__name__
'bool_'

We need this mapping for boolean features in pandas transformations.

Gotcha. Adding a short comment to document this would be better.

HaoXuAI · 2024-06-26T22:30:43Z

sdk/python/feast/transformation/python_transformation.py

@@ -44,7 +44,9 @@ def infer_features(self, random_input: dict[str, list[Any]]) -> list[Field]:
            Field(
                name=f,
                dtype=from_value_type(
-                    python_type_to_feast_value_type(f, type_name=type(dt[0]).__name__)
+                    python_type_to_feast_value_type(
+                        f, value=dt[0], type_name=type(dt[0]).__name__


Could the list be empty and index out of range?

It's certainly possible if the transformation written by the user doesn't return a list of length 1 - but I think under the assumption that the transformation is written correctly, then we wouldn't expect index errors. Perhaps we could add some validation checks in here with some nice error messages for users, instead of just bubbling up IndexErrors during UDF development?

validation sounds good to me
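
The validation discussed in this thread could look roughly like the sketch below. The helper name `first_output_value` and the error wording are assumptions made for illustration; they are not part of Feast or of this PR. The idea is only to replace a bare IndexError with a message that names the offending feature.

```python
from typing import Any


def first_output_value(feature_name: str, values: list[Any]) -> Any:
    """Return the first value of a UDF output column, or fail with a clear message.

    Hypothetical helper: the name and messages are illustrative only.
    """
    if not isinstance(values, list):
        raise TypeError(
            f"Transformation output for '{feature_name}' must be a list, "
            f"got {type(values).__name__}."
        )
    if len(values) == 0:
        raise ValueError(
            f"Transformation output for '{feature_name}' is empty; "
            "expected at least one value to infer the feature type."
        )
    return values[0]


# A list-valued feature: the first row is itself a list.
print(first_output_value("scores", [[0.1, 0.2]]))
```

With a check like this, an empty output for a feature raises a ValueError that mentions the feature name instead of surfacing an IndexError from `dt[0]`.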

alexmirrington · 2024-06-27T23:14:49Z

@HaoXuAI FYI seeing a few rate limits for bigtable and redshift in integration tests, doesn't seem related to my changes but maybe you know who best could take a look?

tokoko · 2024-06-28T06:44:11Z

sdk/python/tests/unit/test_on_demand_pandas_transformation.py

+        assert type(result["acc_rate"]) == float
+        assert type(result["avg_daily_trips"]) == int
+        # On-demand view
+        assert type(result["avg_daily_trips_plus_one"]) == int


Do we need these type asserts? The fact that you provided a schema above should be enough, as feast itself checks that inferred and provided schemas match? wdyt?

Happy to remove them if you think they are cluttering tests

No, I suppose checking return values can also matter in some cases
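
As a small illustration of why exact type checks on returned values can catch things a looser check would miss, consider the sketch below. It is plain Python, independent of Feast, and the `is_exact_type` helper is hypothetical. Because bool is a subclass of int in Python, an isinstance check accepts True where an int is expected, while an exact type comparison does not.

```python
def is_exact_type(value, expected_type) -> bool:
    """Return True only if value's type is exactly expected_type (no subclasses)."""
    return type(value) is expected_type


# bool is a subclass of int, so isinstance is the looser of the two checks.
print(isinstance(True, int))     # True
print(is_exact_type(True, int))  # False
print(is_exact_type(3, int))     # True
```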

tokoko · 2024-06-28T08:53:21Z

@alexmirrington Can you make a dummy commit to force a rerun?
@HaoXuAI Some tests still need First-time contributor approval

tokoko

LGTM

alexmirrington mentioned this pull request Jun 24, 2024

OnDemandFeatureView.feature_transformation.infer_features does pass UDF outputs to python_type_to_feast_value_type #4308

Open

alexmirrington changed the title ~~Fix OnDemandFeatureView type inference for array types~~ fix: OnDemandFeatureView type inference for array types Jun 24, 2024

alexmirrington force-pushed the fix-odfv-type-inference branch from feac39c to 6b7afb9 Compare June 24, 2024 10:45

alexmirrington marked this pull request as ready for review June 24, 2024 10:49

alexmirrington force-pushed the fix-odfv-type-inference branch 2 times, most recently from 3defb2a to 9594997 Compare June 25, 2024 05:19

Fix OnDemandFeatureView type inference for array types

62208ea

Signed-off-by: Alex Mirrington <[email protected]>

alexmirrington force-pushed the fix-odfv-type-inference branch from 9594997 to 62208ea Compare June 25, 2024 06:18

HaoXuAI reviewed Jun 26, 2024

HaoXuAI added the ok-to-test label Jun 26, 2024

HaoXuAI reviewed Jun 26, 2024

tokoko reviewed Jun 28, 2024

tokoko approved these changes Jun 28, 2024
