Add support for Numpy 2.x #429
By default, the extension is compiled with support for both numpy 1 and numpy 2 (with runtime checks to pick the right binary offset where needed). Features or fields that are specific to one version are hidden by default. Users can opt out of the combined numpy 1 + numpy 2 build by disabling default features and selecting a single version. If only one version is selected, the library panics when the runtime version does not match the compilation version.
Cargo.toml
@@ -31,3 +31,8 @@ nalgebra = { version = "0.32", default-features = false, features = ["std"] }

[package.metadata.docs.rs]
all-features = true

[features]
default = ["numpy-1", "numpy-2"]
What is the cost of enabling a "support level" if it is never used? Since we have to do the check in any case, do the extra features really carry their weight?
src/npyffi/mod.rs
@@ -10,13 +10,21 @@
)]

use std::mem::forget;
use std::os::raw::c_uint;
Please merge this into the line below.
src/npyffi/objects.rs
#[cfg(feature = "numpy-2")]
#[allow(non_snake_case)]
#[inline(always)]
pub fn PyDataType_ISLEGACY(dtype: *const PyArray_Descr) -> bool {
This function and all the similar ones below must be unsafe fn, as they have no way of validating the pointer argument, i.e. the caller has to ensure a valid pointer is passed.
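A minimal sketch of what this review comment asks for, not the code from the PR. The `PyArray_Descr` struct here is a hypothetical one-field stand-in for the real descriptor, and `LEGACY_TYPE_NUM_LIMIT` is an assumed, illustrative cut-off. The point is only that a function dereferencing a raw pointer is declared `unsafe fn` and documents the caller's obligation.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for NumPy's descriptor struct (only one field shown).
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct PyArray_Descr {
    pub type_num: c_int,
}

/// Assumed, illustrative upper bound for legacy type numbers.
const LEGACY_TYPE_NUM_LIMIT: c_int = 2056;

/// Returns whether the descriptor has a legacy (non-negative, below the limit) type number.
///
/// # Safety
/// `dtype` must be non-null, properly aligned, and point to a live `PyArray_Descr`.
#[allow(non_snake_case)]
#[inline(always)]
pub unsafe fn PyDataType_ISLEGACY(dtype: *const PyArray_Descr) -> bool {
    // The function cannot validate the pointer; it relies on the caller's guarantee.
    let type_num = unsafe { (*dtype).type_num };
    (0..LEGACY_TYPE_NUM_LIMIT).contains(&type_num)
}

fn main() {
    let legacy = PyArray_Descr { type_num: 12 };
    let other = PyArray_Descr { type_num: -1 };
    // SAFETY: both pointers are derived from references to live values.
    unsafe {
        assert!(PyDataType_ISLEGACY(&legacy));
        assert!(!PyDataType_ISLEGACY(&other));
    }
}
```

With this signature every call site needs an `unsafe` block, which makes the pointer-validity requirement visible to reviewers.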
src/npyffi/mod.rs
pub const NPY_2_0_API_VERSION: c_uint = 0x00000012;

pub static ABI_API_VERSIONS: std::sync::OnceLock<(c_uint, c_uint)> = std::sync::OnceLock::new();
I think GILOnceCell is applicable here, as we can just require that the accessor functions like PyDataType_FLAGS take a py: Python token. (We define them here, so there is no need to conform exactly to the C signature, just as there is no guarantee that they will stay in sync with the definitions in NumPy's C headers.)
Besides accessor functions, the version is also needed in the API functions that have a different offset in Numpy 1 and 2 (for instance PyArray_CopyInto) or only exist in Numpy 1 or 2 (using them with the wrong runtime version would otherwise result in memory corruption or a segfault).
I'm happy to switch to GILOnceCell but just wanted to check that changing higher-level function signatures was ok. For instance, flags (line 259 in 2170e16: pub fn flags(&self) -> c_char {) would need to take py: Python (so that it can call PyDataType_FLAGS).
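The runtime selection described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the two slot constants and both helper functions are hypothetical, and only `NPY_2_0_API_VERSION` is taken from the diff above.

```rust
use std::os::raw::c_uint;

/// API version threshold for NumPy 2.0 (value from the diff in this PR).
pub const NPY_2_0_API_VERSION: c_uint = 0x0000_0012;

/// Hypothetical slot numbers of one function in the two function-table layouts.
const COPY_INTO_SLOT_NPY1: usize = 82;
const COPY_INTO_SLOT_NPY2: usize = 50;

/// Picks the function-table slot that matches the NumPy version found at runtime.
pub fn copy_into_slot(api_version: c_uint) -> usize {
    if api_version < NPY_2_0_API_VERSION {
        COPY_INTO_SLOT_NPY1
    } else {
        COPY_INTO_SLOT_NPY2
    }
}

/// Guards a function that exists only in NumPy 1.x, so that a wrong slot is never called.
pub fn require_numpy1(api_version: c_uint, name: &str) -> Result<(), String> {
    if api_version < NPY_2_0_API_VERSION {
        Ok(())
    } else {
        Err(format!("{name} is only available in NumPy 1.x"))
    }
}

fn main() {
    // 0x11 stands for a NumPy 1.x runtime, 0x12 for NumPy 2.0.
    assert_eq!(copy_into_slot(0x11), 82);
    assert_eq!(copy_into_slot(0x12), 50);
    assert!(require_numpy1(0x11, "PyArray_CastTo").is_ok());
    assert!(require_numpy1(0x12, "PyArray_CastTo").is_err());
}
```

Returning an error (or panicking) before the call is what prevents the memory corruption mentioned above, since a wrong slot would otherwise be invoked with the wrong signature.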
If you have a GIL ref &'py PyArray<..> or a bound ref Bound<'py, PyArray<..>>, this implies access to a GIL token via e.g. Bound::py. So I don't think this is an issue.
In any case, this will be a breaking release so we can change the API where necessary.
src/npyffi/objects.rs
}

#[cfg(all(feature = "numpy-1", feature = "numpy-2"))]
macro_rules! DESCR_ACCESSOR {
Suggested change:
- macro_rules! DESCR_ACCESSOR {
+ macro_rules! define_descr_accessor {
Thank you for working on this! From looking at the code, I think we can take a simpler approach and always build with support for both NumPy 1.x and 2.x.
I also think GILOnceCell is appropriate to handle the version information.
src/npyffi/array.rs
#[cfg(all(feature = "numpy-1", not(feature = "numpy-2")))]
impl_api![50; PyArray_CastTo(out: *mut PyArrayObject, mp: *mut PyArrayObject) -> c_int];
#[cfg(all(not(feature = "numpy-1"), feature = "numpy-2"))]
impl_api![50; PyArray_CopyInto(dst: *mut PyArrayObject, src: *mut PyArrayObject) -> c_int];
Since builds with support for both versions will be common, I think this does not really add that much compile-time safety. I would rather suggest we extend the impl_api macro to do a runtime version check for functions which are only available in one or the other version, e.g.
impl_api![@npy1 50; PyArray_CastTo(out: *mut PyArrayObject, mp: *mut PyArrayObject) -> c_int];
impl_api![@npy2 50; PyArray_CopyInto(dst: *mut PyArrayObject, src: *mut PyArrayObject) -> c_int];
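One way such a macro extension could look, as a self-contained sketch only. The real impl_api macro generates FFI wrappers; this hypothetical `impl_api_sketch` merely returns the slot number and takes the API version as an explicit argument, so that the `@npy1` / `@npy2` runtime check can be shown in isolation.

```rust
use std::os::raw::c_uint;

const NPY_2_0_API_VERSION: c_uint = 0x0000_0012;

// Hypothetical, simplified stand-in for `impl_api!`. The `@npy1` / `@npy2` arms come
// first so that the plain `$offset:expr` arm never tries to parse the `@` marker.
macro_rules! impl_api_sketch {
    (@npy1 $offset:expr; $fname:ident) => {
        #[allow(non_snake_case)]
        pub fn $fname(api_version: c_uint) -> usize {
            // Runtime check: refuse to use a NumPy 1.x-only slot on NumPy 2.x.
            assert!(
                api_version < NPY_2_0_API_VERSION,
                "{} is only available in NumPy 1.x",
                stringify!($fname)
            );
            $offset
        }
    };
    (@npy2 $offset:expr; $fname:ident) => {
        #[allow(non_snake_case)]
        pub fn $fname(api_version: c_uint) -> usize {
            // Runtime check: refuse to use a NumPy 2.x-only slot on NumPy 1.x.
            assert!(
                api_version >= NPY_2_0_API_VERSION,
                "{} is only available in NumPy 2.x",
                stringify!($fname)
            );
            $offset
        }
    };
    ($offset:expr; $fname:ident) => {
        #[allow(non_snake_case)]
        pub fn $fname(_api_version: c_uint) -> usize {
            $offset
        }
    };
}

impl_api_sketch![@npy1 50; PyArray_CastTo];
impl_api_sketch![@npy2 50; PyArray_CopyInto];

fn main() {
    assert_eq!(PyArray_CastTo(0x11), 50);
    assert_eq!(PyArray_CopyInto(0x12), 50);
    // Calling a version-specific function on the wrong runtime panics instead of
    // jumping through the wrong table slot.
    assert!(std::panic::catch_unwind(|| PyArray_CastTo(0x12)).is_err());
}
```

Both functions can share slot 50 because only one of them is ever valid for a given runtime, which is what the runtime check enforces.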
That sounds good! I'll make these changes.
@adamreichold I just pushed a new version. Here's a brief overview of breaking changes.
I don't think it is necessary to change the API here. Since these are Python types we already have the proof that the GIL is held. You can just get the token via
@Icxolu Good point! I made the changes that you suggested. @adamreichold Let me know if you would like me to squash the commits that modified src/dtype, since I ended up rolling back many of the edits.
Sorry for not getting to this yet. We are in crunch mode at $DAYJOB until next week. Will look into it as soon as I can.
The changes are based on recommendations from https://numpy.org/devdocs/numpy_2_0_migration_guide.html#c-api-changes.
The most visible user-facing change is the addition of two feature flags (numpy-1 and numpy-2). By default, both features are enabled and the code is compiled with support for both ABI versions (with runtime checks to select the right function offsets). Functions that are only available in numpy 1 or 2 are not exposed in this case. Disabling default features (for instance numpy = {version = "0.21.0", default-features = false, features = ["numpy-1"]}) exposes version-specific functions and fields, but the library will panic if the runtime numpy version does not match.
I have not done much testing; this should be tried on different code bases (ideally ones that use low-level field access) before merging.
This currently uses std::sync::OnceLock to cache the runtime version. I realised too late that this is not compatible with the Minimum Supported Rust Version (OnceLock was introduced in 1.70.0). Using pyo3::sync::GILOnceCell isn't straightforward since py is not always available in functions that need to check the version to pick an implementation.
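For readers unfamiliar with the caching approach mentioned here, a minimal std-only sketch follows. `query_runtime_versions` is a hypothetical placeholder for asking the loaded NumPy module for its ABI and API versions, and the values it returns are illustrative. The `QUERIES` counter exists only to show that the lookup runs once.

```rust
use std::os::raw::c_uint;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Cached (ABI version, API version) pair; filled on first use. Requires Rust 1.70+.
static ABI_API_VERSIONS: OnceLock<(c_uint, c_uint)> = OnceLock::new();

/// Counts how often the (hypothetical) expensive lookup actually runs.
static QUERIES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical placeholder for querying the loaded NumPy; values are illustrative.
fn query_runtime_versions() -> (c_uint, c_uint) {
    QUERIES.fetch_add(1, Ordering::SeqCst);
    (0x0200_0000, 0x0000_0012)
}

/// Returns the cached versions, performing the lookup only on the first call.
pub fn abi_api_versions() -> (c_uint, c_uint) {
    *ABI_API_VERSIONS.get_or_init(query_runtime_versions)
}

fn main() {
    let first = abi_api_versions();
    let second = abi_api_versions();
    assert_eq!(first, second);
    // The lookup ran exactly once even though the accessor was called twice.
    assert_eq!(QUERIES.load(Ordering::SeqCst), 1);
}
```

Unlike this sketch, a GILOnceCell-based version would need a py: Python token at every call site, which is the signature question raised in the review.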