Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add static typing support #1302

Open
wants to merge 58 commits into
base: master
Choose a base branch
from
Open

Add static typing support #1302

wants to merge 58 commits into from

Conversation

headtr1ck
Copy link

@headtr1ck headtr1ck commented Dec 15, 2023

Closes: #1298

Lots of this PR was stolen from #1279

Notice: I intentionally reversed the import behavior of __init__.py and _netCDF4.pyx in the stubs. This should be more in line in how people are using netCDF4. And while doing so I also removed the "._netCDF4" part of the classname in the reprs.

The current typing is working with python 3.11.
TODO:

  • Make typing pass with python 3.8 (aka. minimum supported version) Won't do

@CLAassistant
Copy link

CLAassistant commented Dec 15, 2023

CLA assistant check
All committers have signed the CLA.

@jswhit
Copy link
Collaborator

jswhit commented Dec 29, 2023

Thanks! Looks good - but why did you add a new workflow to run stubtest, instead of adding the extra test to the existing workflows?

@headtr1ck
Copy link
Author

why did you add a new workflow to run stubtest, instead of adding the extra test to the existing workflows?

Good question. I just took the changes from the previous PR #1279

After the holidays I can continue to work on this and use the existing workflows.

@jswhit
Copy link
Collaborator

jswhit commented Dec 29, 2023

Looks you like need to add git submodule update --init --recursive after the git clone in stubtest.yml.

@Woefie
Copy link

Woefie commented Jun 6, 2024

I was the author of #1279 , something went wrong with my pr and when I noticed this one was already made. Which is perfectly fine but now this pr has been dead for 5 months.
The only problem I see now with this pr is that mypy was not installed in the the workflow. Could this just be added and move on?
and merge latest version in this branch

@headtr1ck
Copy link
Author

Hey, I was just very busy the last months.
I can try to get it to work the next days.

And thanks for your initial work, it was very helpful to get started.

@headtr1ck
Copy link
Author

headtr1ck commented Jun 24, 2024

I think having the docstrings in two places sounds like a bad idea. This would require some CI checks to test if they are the same etc and would make future PRs complicated.

If the current docstrings are not shown in IDEs (I haven't checked that) then one could argue to simply move them over to the stub files all together, but probably in a future PR.

I guess it is up to the maintainers to decide such things.

@RandallPittmanOrSt
Copy link

RandallPittmanOrSt commented Jun 24, 2024

I think having the docstrings in two places sounds like a bad idea. This would require some CI checks to test if they are the same etc and would make future PRs complicated.

I agree completely that they should not be in two places in the netCDF4-python repo. That's why I settled on the idea that the script should be run by users post-install, and only if desired. Even that does present the risk that a user might not re-run the script on updating netCDF4 and have outdated docstrings in their IDE, but that seems fairly low-risk for the benefit (IMO).

If the current docstrings are not shown in IDEs (I haven't checked that) then one could argue to simply move them over to the stub files all together, but probably in a future PR.

I guess it is up to the maintainers to decide such things.

No, I don't think so, because then the docstrings would be lost from the runtime and, therefore, the REPL. As far as I can tell, the docstrings do belong in the source.

If, down the road, the API and Cython implementation were ever split, then all the doctrings and annotations could live in the API (.py file).

@headtr1ck
Copy link
Author

I would say, we focus on typing first in this PR and then open a separate issue for the docstrings.
Maybe there is a way to copy over the docstrings when building wheels, so at least there is only a single source?

@RandallPittmanOrSt
Copy link

RandallPittmanOrSt commented Jun 24, 2024

Variable type

I'd like to address the question of Variable types and making Variable generic in its
own comment.

I think there are some questions we need to try to answer:

  1. What is the benefit of making Variable generic?
    • We probably don't want to use the type in __getitem__() because we don't know if the
      return type will be an array or a scalar value--it depends on masking configuration
      of the Dataset and dimensions of the array and the key.
    • We probably don't want to use the type in getValue() because it could return either
      a scalar or a masked array.
    • It doesn't make sense to use the type in __setitem__ because __setitem__ is very
      generous with what types it accepts.
    • If we know the type then we can (at least it some/most cases?) annotate the
      .datatype and .dtype properties... with the caveat that propertly annotating
      @overload on properties is a bit of an unknown (see
      here and
      here)
  2. What are allowable types for Variable (What bound for VarT = TypeVar("VarT", bound=???)
    used for class Variable(Generic[VarT]): ..., if anything)?
    • CompoundType, VLType, and EnumType are not valid because only an instance of
      any of those types can be the type of a Variable.
    • Does it make more sense to use str or np.str_? Or neither, since it's stored as a
      VLType in the file? str is the type of returned scalars and class the VLType
      uses.
    • To me, (if we are specifying bound=) it does not make sense to allow any type
      literals ("f4", "S1", etc.) as those are not types in and of themselves.

I did some research and got this table of results:

createVariable datatype parameter Variable.datatype Variable.dtype Notes
Literal for a scalar numeric ("i4") except complex types np.dtype of that type np.dtype of that type Cannot infer statically without @overload for every single literal of that type
Literal "c8" or "c16" CompoundType instance np.dtype('complex64') or np.dtype('complex128') Cannot infer statically without @overload for every single literal of that type
numpy scalar numeric type (np.int8) or scalar numeric dtype (np.dtype(np.uint16)) except complex types np.dtype of that type np.dtype of that type Can infer statically with a generic @overload
numpy complex scalar numeric type (np.complex64) or scalar numeric dtype (np.dtype(np.complex64)) CompoundType instance np.dtype('complex64') or np.dtype('complex128') Can infer statically with a generic @overload. datatype overload must be separate
int np.dtype(int64) np.dtype('int64') Could add an @overload for this. Is np.dtype(int) == np.dtype('int64') on all platforms?
float np.dtype(float64) np.dtype('float64') Could add an @overload for this. Is np.dtype(float) == np.dtype('float64') on all platforms?
str or np.str_ or dtype(str) or dtype(np.str_) VLType instance with .name is None and .dtype == str str Could be provided by a specific @overload. I think the Variable type should be str.
"S1" or "c" np.dtype('S1') np.dtype('S1') Could maybe use np.bytes_ for the type??
"S2" or any other number, "U" or "U1" or U with any number, or np.dtype with those literals VLType instance with .name is None and .dtype == str str Undocumented. Don't include in stubs except allowing for in an @overload with Any
CompoundType instance CompoundType instance structured array dtype I don't think this can be annotated
EnumType instance EnumType instance dtype of the instance I don't think this can be annotated
VLType instance VLType instance dtype of the instance I don't think this can be annotated

The following cannot be used as datatype parameters:

  • np.bytes_
  • Other np.float or np.complex precisions.
  • bool or np.bool_
  • Strutured np.dtype -- e.g. st1 = np.dtype([("x", np.uint8), ("y", np.float32)]). A
    CompoundType must be created first.

Based on all of the above, I propose the following truncated implementation:
https://gist.github.com/RandallPittmanOrSt/ce695f0b7d7717493649909d1926314a

Edited to add: For better or worse, it turns out that the various Literals I proposed are of no use, unless we specifically annotate the @overload for each and every one. Nevermind, I think I can work around this.

Copy link

@RandallPittmanOrSt RandallPittmanOrSt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recognize that I have included a ton of stuff in this review. I've only reviewed the stubs so I don't know if/how any changes will affect tests etc. I'm going to be offline for the next few weeks so feel free to use or ignore whatever you want from this review.

src/netCDF4/__init__.pyi Outdated Show resolved Hide resolved
src/netCDF4/__init__.pyi Show resolved Hide resolved
src/netCDF4/__init__.pyi Outdated Show resolved Hide resolved
src/netCDF4/__init__.pyi Outdated Show resolved Hide resolved
src/netCDF4/__init__.pyi Show resolved Hide resolved
src/netCDF4/__init__.pyi Outdated Show resolved Hide resolved
src/netCDF4/__init__.pyi Outdated Show resolved Hide resolved
src/netCDF4/_netCDF4.pyi Show resolved Hide resolved
src/netCDF4/__init__.pyi Outdated Show resolved Hide resolved
src/netCDF4/__init__.pyi Outdated Show resolved Hide resolved
@RandallPittmanOrSt
Copy link

RandallPittmanOrSt commented Jun 24, 2024

I will add, I didn't have any problems with my stub file working under Python 3.8, in fact I developed it under 3.8. I did have to get Buffer, Self, and TypeAlias from typing_extensions as you did, but otherwise followed the docs and had no issues: https://typing.readthedocs.io/en/latest/source/stubs.html. I think yours should work fine under 3.8.

@jswhit
Copy link
Collaborator

jswhit commented Jun 25, 2024

@jswhit any reason, __has_parallel_support__ and __has_ncfilter__ were not imported at netCDF4 level (only available in netCDF4._netCDF4? I imported them now, but can undo that.

No reason, just an oversight on my part. OK to fix.

@headtr1ck
Copy link
Author

I would say I leave it like this. The Variable proposal with all the Descriptors looks quite hacky and is beyond my knowledge.
I propose to merge as is, and @RandallPittmanOrSt can open a follow-up PR to improve things.

@jswhit
Copy link
Collaborator

jswhit commented Jun 27, 2024

stubtest is failing

@headtr1ck
Copy link
Author

stubtest is failing

Can you try again please?

@jswhit
Copy link
Collaborator

jswhit commented Jun 27, 2024

stubtest is failing

Can you try again please?

still failing

src/netCDF4/_netCDF4.pyx Show resolved Hide resolved
docs/index.html Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add static type hints
6 participants