Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 667: correct/clarify handling of PyEval_GetLocals #245

Open
ncoghlan opened this issue Jun 2, 2024 · 27 comments
Open

PEP 667: correct/clarify handling of PyEval_GetLocals #245

ncoghlan opened this issue Jun 2, 2024 · 27 comments

Comments

@ncoghlan
Copy link
Contributor

ncoghlan commented Jun 2, 2024

While writing the docs for python/cpython#74929 and reviewing python/cpython#118934, I discovered a discrepancy in the way PEP 667 described the updates affecting PyEval_GetLocals (what the text described didn't match the proposed implementation given).

After discussing it with @gaogaotiantian in python/cpython#118934 (comment) I'm happy that the discrepancy should be resolved in favour of the proposed implementation, but I'd also like the SC to ratify that since we're arguably changing a user-facing detail of an accepted PEP.

python/peps#3809 updates the PEP text to resolve the inconsistency (since it's referenced from the Python 3.13 What's New and leaving it as is might be confusing for users of PyEval_GetLocals)

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Jun 3, 2024

See python/cpython#119769 for further discussion (it may be possible to have PyEval_GetLocals() replicate the Python 3.13 locals() semantics rather than having to settle for it being the one place that retains 3.12'ish behaviour)

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Jun 5, 2024

We've now identified a few potential resolutions for the discrepancy, and could benefit from a release-manager-or-SC casting vote to resolve a disagreement between @gaogaotiantian and I as to which of them is the least bad option for Python 3.13 (we wouldn't categorise any of them as good options - there are multiple reasons the PEP deprecated this API!).

  • update PyEval_GetLocals() to Python 3.13 locals() semantics, accumulate multiple borrowed references in a list cache (all snapshots taken this way remain alive as long as the frame is alive). It technically eliminates the reference leak, but it doesn't feel like it is really eliminating it (just limiting how bad it can get).
  • update PyEval_GetLocals() to Python 3.13 locals() semantics, use Py_XSETREF and invalidate any previously borrowed references for that frame on each new call to PyEval_GetLocals()
  • leave PyEval_GetLocals() on the Python 3.12 locals() semantics, document that subsequent calls may affect the contents of references returned by previous calls on that frame

The current state of main and the 3.13 branch is that the documentation reflects the third option (as when I wrote it, I hadn't even considered the possibility of making the Python 3.13 semantics work when returning a borrowed reference), while the API itself leaks strong references to fresh snapshots in optimised frames (as reported in python/cpython#118934 ). The PR for the reference leak issue at python/cpython#119769 currently implements the Py_XSETREF approach (but there are code suggestions for the other two options posted in a pair of comments ).

Where @gaogaotiantian and I are disagreeing is in how we weigh the value of achieving internal semantic consistency between APIs in Python 3.13 vs potentially introducing a risk of segfaults in unlikely-but-historically-valid code that calls PyEval_GetLocals() a second time on a frame (presumably indirectly), but then keeps using the borrowed reference returned by the earlier call.

Neither of us are fans of the first option listed (although we do accept it's a valid option), but I favour tolerating the legacy Python 3.12 semantics in PyEval_GetLocals() until we get SC approval to hard deprecate and eventually remove it, while Tian would like us to aim higher and seek internal semantic consistency in 3.13 for all locals() access, even via the legacy PyEval_GetLocals() API.

@markshannon
Copy link
Member

I think we should to stick to the PEP as written. In other words:

  • PyEval_GetLocals() should return a borrowed reference to a FrameLocalsProxy. If called multiple times for the same frame it will return the same proxy. https://peps.python.org/pep-0667/#id3

There are a few reasons why:

  • The implementation is simple and doesn't have the sharp edges of the other proposals. I.e. no risk of segfaults.
  • I don't think there is any strong reason for keeping PyEval_GetLocals() consistent with locals(). There is nothing in the docs about them being the same.
  • It's deprecated anyway, so we might as well pick the simplest solution (which is the one in the PEP).

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Jun 7, 2024

That resolution sounds like a good option to me. I thought it had been ruled out during in-person discussions in Pittsburgh in favour of the sample code defining it in terms of PyEval_GetFrameLocals() (hence it not being in the list of options above), but if that isn't the case, it's straightforward to amend the PEP's example code to use PyFrame_GetLocals() on the result of PyEval_GetFrame() and bring the PEP in sync with itself that way.

The docs for that approach also mostly exist already, so I can largely pull them from the commit history and have them ready again for beta 3.

@gaogaotiantian
Copy link
Member

gaogaotiantian commented Jun 7, 2024

There is a serious problem to return a FrameLocalsProxy from PyEval_GetLocals. Users currently using PyEval_GetLocals expects a dict, and returning a FrameLocalsProxy will not decrease the chance that they crash. Actually I'm quite sure it will increase the possibility. The solution seems simple and beautiful, but it's against the "least surprise principle".

The fundamantal issue here is - it's a C API, not a Python one. Even in Python, we kept the behavior of locals() to return a dict to minimum the impact on users and only changed frame.f_locals. If we return a FrameLocalsProxy instead of a dict in C API, we will crash a lot of libraries that assert the return value is a dict.

Because they would use PyDict_* APIs to the return value which is not a dict. For example, PyDict_GetItemString assumes the input to be a dict, so it calls dict_getitem without any checking, which casts the input to PyDictObject and accesses it's ->ma_keys. That's just a simple segfault.

And this is a much much more common pattern in the market than the "possible segfault" when the user calls PyEval_GetLocals multiple times and try to use the first return value.

For example, my project VizTracer (4k+ stars) uses this and the change will just crash my code.

And it's not that I'm stupid, yappi (1.4k stars) has the similar pattern

That's just some quick search and I believe there are more than a few popular libraries that would suffer from this.

It's also possible that they want to Py_INCREF the returned dict and use that as the local scope for exec, that would be another disaster, because then the other frame would be writing random stuff to the current frame.

Because of the reasons above, I don't believe the proposal to make PyEval_GetLocals return a FrameLocalsProxy is an acceptable solution.

@gaogaotiantian
Copy link
Member

Some quick summary of why I favored the second option:

  1. It worked in the majority of cases.
  2. For the rare cases, it failed explicitly, which may drive the user to check the documentation (most user won't follow the changes of the Python APIs) and update their code with the modern API. Hiding them secretly won't help in that case.
  3. The users will suffer from the changes of locals(), the changes are subtle but critical. They will need to learn how to live with the new semantics. I think it would be better to contain the suffering in one version update, instead of spreading it over the next few until the API is fully removed. If we keep the semantics of 3.12, we will have a C API doing things that none of Python code would do, and the inconsistency will last for a couple of versions.

@markshannon
Copy link
Member

returning a FrameLocalsProxy will not decrease the chance that they crash.

Hopefully, it will increase the chance of a crash. Crashes are easy to diagnose and fix. I'm more worried about hard to spot changes in semantics.

That PyDict_GetItemString takes a PyObject, and doesn't check it, seems a separate issue worth fixing.

The yappi code is already full of #ifdefs for versions. C extensions that interact with CPython low level features need to be updated for new versions.

For a profiler, I would think that considerably reduced overhead of the new API would be well worth the cost of moving to the new API anyway.

You seem to think it is possible to avoid breaking code. It isn't, because locals() and f_locals were already broken.
What we want is for the breakages to be easy to spot and easy-ish to fix.

Trying to hide the breakages will make thing worse in the long term, IMO.
Consistency is more important.

It's also possible that they want to Py_INCREF the returned dict and use that as the local scope for exec, that would be another disaster, because then the other frame would be writing random stuff to the current frame.

This is already quite broken in 3.11 as any call to a trace function could have updates the locals anyway via PyFrame_LocalsToFast.

@gaogaotiantian
Copy link
Member

I picked those two repos as an example, just to illustrate the impact of the change. There are a lot of others that have similar issues.

Hopefully, it will increase the chance of a crash. Crashes are easy to diagnose and fix. I'm more worried about hard to spot changes in semantics.

If that's the case, we should remove the API directly so that everyone just crashes and they are required to find a new solution. Because anyone that really uses the return value of this API will be crashed with the proposed change (assuming the return value is a dict simply will not work).

We kept this API and allow a deprecation phase because we want our users to be less disturbed. I agree that

Consistency is more important.

That's why I don't like the idea of making it behave like 3.12. Then we will have 3 behaviors - 3.12, locals() and f_locals. But I also think the obvious correspondance between

PyEval_GetLocals
PyEval_GetGlobals
PyEval_GetBuiltins

and

PyEval_GetFrameLocals
PyEval_GetFrameGlobals
PyEval_GetFrameBuiltins

should give user the consistence as well - the latter ones are the strong reference version of the former ones. The PyEval_GetFrameLocals() is a new API so we obviously approve of its behavior. Why not make it consistent? It does not make sense to me to make PyEval_GetLocals() deviate to the behavior of PyFrame_GetLocals(PyEval_GetFrame()).

I also agree that we should fail fast when the user is doing something that's not supported well. That's why I chose 2 over 1, but I think we should try to make user not crash.

I don't believe we can, with some efforts, save the users so all of their code will work. But I believe in many simple cases, it can work just as before, with the semantic changes of locals().

@encukou
Copy link
Member

encukou commented Jun 11, 2024

Hopefully, it will increase the chance of a crash. Crashes are easy to diagnose and fix. I'm more worried about hard to spot changes in semantics.

I am baffled by this statement.
PyEval_GetLocals is deprecated; the reason it remains is to make existing code work.
If you want to fix code, heed the deprecation warning and don't call the function.

You seem to think it is possible to avoid breaking code. It isn't, because locals() and f_locals were already broken.
What we want is for the breakages to be easy to spot and easy-ish to fix.

However quirky, underdocumented and undersupported the previous API was, it was possible to use it correctly -- for example, by avoiding tracing, or by figuring out what the lifetime of some borrowed reference was in practice.
For applications that relied on the old behaviour and avoided the ugly cases, it is this PEP's changes that will break them. Not past design mistakes.

@ncoghlan
Copy link
Contributor Author

If you want to fix code, heed the deprecation warning and don't call the function.

I'll note that PyEval_GetLocals is the one API where I was disappointed we weren't able to get a hard deprecation approved by the SC. At the moment, there's nothing in extension module C builds that says "Here be venomous snakes" where PyEval_GetLocals is concerned, it's only a soft deprecation in the docs. Since none of the design options available for this API are guaranteed to be the right thing to do for all possible usage scenarios (hence the difficulty in selecting the least bad option), that lack of a programmatic warning at build time is genuinely concerning.

As far as the risk of segfaults arising from passing proxy instances to PyDict_ APIs goes, I'm not sure PyEval_GetLocals is at substantially more risk of that than PyFrame_GetLocals is. It's also a concern that existed in the PEP text as approved, so it's presumably an outcome that the SC were already prepared to tolerate.

One idea I did have that would make me more in favour of @gaogaotiantian's approach (i.e. making previously borrowed references expire on each call to PyEval_GetLocals()) is if we could at least emit a hard deprecation notice or a runtime warning when PyEval_GetLocals() is called on a frame and clears a previously borrowed reference. That way the segfault from accessing the borrowed reference after the strong reference has been lost should at least be easier to debug than it otherwise would be.

@markshannon
Copy link
Member

Having thought about this some more. Here's an approach that will hopefully keep everyone happy, with almost no change for users if they keep using locals() and PyEval_GetLocals, but still in the spirit of PEP 667.

General idea

  • The old API, both C and Python, always return dicts
  • The new API, both C and Python, should return proxies for the locals of (optimized) functions, dicts otherwise.
  • locals() and PyEval_GetLocals() will return the same dict (per frame) that is updated from f_locals

The old and new APIs

New API

Python

  • frame.f_locals is treated as part of the new API, even though it pre-dates PEP 667.

C

  • PyEval_GetFrameLocals(void)
  • PyEval_GetFrameGlobals(void)
  • PyEval_GetFrameBuiltins(void)

The old API

Python

  • locals()

C

  • PyEval_GetLocals()
  • PyEval_GetGlobals()
  • PyEval_GetBuiltins()

Implementation of locals() and PyEval_GetLocals()

Both locals() and PyEval_GetLocals() should behave the same as each other, and PyEval_GetLocals() should still return a borrowed reference.

Rather than returning an instantaneous snapshot of frame.f_locals,
locals() should return the same dict every time, and it should be updated from frame.f_locals on every call to locals() or PyEval_GetLocals().

In other words locals() should be defined as:

def locals():
    frame = sys._getframe(1)
    f_locals = frame.f_locals
    if not isinstance(f_locals, FrameLocalsProxy):
        return f_locals
    if frame.locals_cache is NULL:
        frame.locals_cache = {}
    frame.locals_cache.update(f_locals)
    return frame.locals_cache

The only change in behavior from 3.12 is that stores to frame.f_locals will overwrite priors stores to locals() when locals() is called again.

@markshannon
Copy link
Member

markshannon commented Jun 12, 2024

The downside of this is that it creates more cycles, but no more than 3.12.

We shouldn't worry about the old API creating cycles.
We should, as @ncoghlan says, make the deprecation of the old API louder.

@carljm
Copy link
Member

carljm commented Jun 12, 2024

locals() is not deprecated (in fact in a sense it's the most "public" of all of these APIs), so I don't understand why we would want to treat it as part of the "old API" or give it these awkward legacy semantics. I thought the whole point of PEP 667 was largely to give locals() better semantics. I think locals() should be consistent with the new recommended C API, not the old deprecated C API.

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Jun 12, 2024

As @carljm suggests, my own preferred option is to have PyEval_GetLocals be the only API that perpetuates the legacy Python 3.12 semantics, and explicitly deprecate it with folks being pointed towards the replacements that return strong references and have unambiguous semantics.

No segfault risks, no increase in potential memory usage relative to Python 3.12, and it confines the legacy behaviour to a now inherently ambiguous API which has clear replacements that unambiguously implement the new semantics (PyEval_GetFrameLocals() to mimic locals(), or PyEval_GetFrame() followed by PyFrame_GetLocals() to mimic frame.f_locals).

We would still eventually get to the fully consistent state with only the new semantics being available, it would just take a couple more releases than @gaogaotiantian hoped it would (presumably 3.15 if we keep PyEval_GetLocals for the minimum two releases after a hard deprecation).

Losing the clarification of locals() semantics provided by the PEP just because a C API we want to get rid of anyway returns a borrowed reference would be a bigger step backwards than we need to make. The parallel between PyEval_GetLocals() and locals() is inherently weak, as historically both of those and frame.f_locals implemented the same semantics, so it was never necessary to articulate which of the two Python level operations the C API was equivalent to (I checked when I started working on the PEP 667 C API docs changes, and its docs have always just referred to returning "a dictionary of the local variables in the current execution frame" without citing any equivalence with particular Python code).

Simply saying "PyEval_GetLocals() is equivalent to locals()" isn't correct in the general case, since it could mean either that or acessing frame.f_locals for the running frame in existing C code, and the only way to know the intent now that frame.f_locals and locals() are going to mean different things is to look at the actual context of use. That means every use of PyEval_GetLocals() needs to be inspected and a choice made as to whether it means locals() or frame.f_locals. A hard deprecation while retaining the existing API behaviour is the cleanest option we have available to allow time for that code inspection to take place across affected projects, while still giving unchanged code a reasonable chance of continuing to work (especially if the value returned is never mutated).

This ambiguity in intent doesn't apply to locals(), since locals() and frame.f_locals have always been spelled differently in Python code (hence both PEP 667 and PEP 558 before it choosing that difference as the way to distinguish between requests for read-only access and requests for read/write access)

@gaogaotiantian
Copy link
Member

I agree that we should not revert the behavior for locals() - that's a good change and we should keep that. I would rather PyEval_GetLocals() to have a legacy and confusing behavior than making locals() confusing again.

I think from the discussion above, even though we can't fully agree on which is the best way for PyEval_GetLocals(), we seem to be okay with the "less favored" option. We do have something in common - we want a hard deprecation on PyEval_GetLocals(), which we still don't have from the SC. Is it still possible to get that for 3.13 so we can remove the API in 3.15 instead of 3.16? Also if we get that, the behavior for the API that's being "deprecated" is not that important anymore. Otherwise it would be an officially supporting API and users can rely on that without hesitation.

@markshannon
Copy link
Member

OK, so it sounds like the consensus is to keep the new behavior for locals().

That leaves PyEval_GetLocals() and PyEval_GetFrameLocals() to fix.

PyEval_GetFrameLocals() should return a FrameLocalsProxy, as I intended in the PEP (although that isn't clear now that I reread it).

Any objections to my suggestion for making PyEval_GetLocals() behave more like it does in 3.12?

@gaogaotiantian
Copy link
Member

gaogaotiantian commented Jun 13, 2024

Any objections to my suggestion for making PyEval_GetLocals() behave more like it does in 3.12?

It's not my first choice, but I find it acceptable, as long as we can mark it deprecated in 3.13. If not, well it's still not unacceptable but I think that would cause more chaos in the future.

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Jun 15, 2024

I would still like to have a C API that's a direct mirror of locals() (as it makes the deprecation notice for PyEval_GetLocals much easier to write), but I also really didn't like documenting that PyEval_GetFrameLocals() didn't mean the same thing as calling PyFrame_GetLocals() on the result of PyEval_GetFrame() (as is currently the case).

Given the naming pattern used in other parts of the API where an API returning a borrowed reference (e.g PyList_GetItem) is supplemented with one returning a strong reference (e.g. PyList_GetItemRef), how would folks feel about the following?:

  • PyEval_GetLocals(): keeps its Python 3.12 behaviour (but is hard deprecated for future removal in 3.15)
  • PyEval_GetLocalsRef(): new API, C API equivalent of calling locals()
  • PyEval_GetFrameLocals(): new API, C API equivalent of acessing frame.f_locals (shorthand for calling PyFrame_GetLocals() on the result of PyEval_GetFrame(), aligning with both the other shorthand functions for the globals and builtins and the most obvious interpretation of its name)

The deprecation notice on PyEval_GetLocals in the documentation would provide pointers to the two potential replacements, with the choice in any given situation being based on which set of API semantics is most appropriate for the use case.

@gaogaotiantian
Copy link
Member

I also believe a C API that behaves like locals() should be there. I don't believe PyEval_GetFrameLocals should be frame.f_locals, even though I can understand why Alyssa thinks that's more consistent.

At this point, I'm a bit tired arguing about PyEval_GetLocals(), so I'm fine with any solution that gives it a hard deprecation on 3.13. However, I have some doubts to have a PyEval_GetLocalsRef() that behaves differently than PyEval_GetLocals() for two versions. I find that very confusing.

@ncoghlan
Copy link
Contributor Author

However, I have some doubts to have a PyEval_GetLocalsRef() that behaves differently than PyEval_GetLocals() for two versions. I find that very confusing.

It isn't ideal, but it's important to remember that:

  1. Existing users of PyEval_GetLocals() are necessarily already going to be at least somewhat familiar with its quirks
  2. New API users will be pushed to one of the better defined alternatives, and encouraged to ignore the old API and its quirky behaviour
  3. The behaviour isn't that different. You can only observe a difference if another piece of code calls PyEval_GetLocals() on the same frame, and either mutates the result directly, or else calls it after the frame state and the cached dict have diverged (such that dict gets overwritten with the updated frame state).

The only other potential name for the C API locals() equivalent that has occurred to me is PyEval_GetScopeLocals(). That idea is based on the updated documentation referring to optimized scopes rather than optimized frames when defining the behaviour of locals(), since variable scopes are a required language feature, while execution frames and their related APIs are technically an implementation detail.

@gaogaotiantian
Copy link
Member

I'm not super opposed to the idea. But to me, it feels like this:

locals() is a single call from the current frame, corresponding to PyEval_GetFrameLocals(). sys._getframe().f_locals is two calls, corresponding to PyEval_GetFrame then PyFrame_GetLocals.

The "inconsistency" in C APIs reflects the "inconsistency" in Python. It's not the same to call locals() directly vs get the frame and access f_locals - they used to be the same. Thus, it's not the same to call PyEval_GetFrameLocals() directly vs get the frame by PyEval_GetFrame then do PyFrame_GetLocals on the frame. Personally I think the inconsistency here is consistent with the Python behavior.

For most of our C API users, that probably makes more sense - most of them are looking for the Python equivalent.

@gaogaotiantian
Copy link
Member

From the discussion in the discord, the C API might have implicit impacts on exec() as well. exec() uses the C API and we need to figure out what to feed it. (now it's basically locals() and I think it makes sense)

@ncoghlan
Copy link
Contributor Author

Note that the beta2 docs cover the impact on exec() and other code execution APIs: https://docs.python.org/dev/whatsnew/3.13.html#whatsnew313-locals-semantics

It's also called out explicitly in the porting notes: https://docs.python.org/dev/whatsnew/3.13.html#pep667-porting-notes-py

Affected functions further have this note added to them:

Changed in version 3.13: The semantics of the default locals namespace have been adjusted as described for the locals() builtin.

This change was something PEP 667 inherited from PEP 558, so the primary notification of this change in the PEPs themselves is actually in PEP 558: https://peps.python.org/pep-0558/#what-happens-with-the-default-args-for-eval-and-exec

@gaogaotiantian has started a PR to make this impact of the PEP 667 changes clearly in PEP 667 itself: python/peps#3845

@pablogsal
Copy link
Member

pablogsal commented Jun 19, 2024

Some clarifications from the Steering Council regarding this discussion: we have been discussing the PEP and the issue and reviewing the conversation so far and here are some points we want to make or clarify:

  • We think that as PyEval_GetLocals() it's part of the stable ABI it should continue returning dictionaries. The semantics of what happens when mutating the dict can change as per the PEP.
  • locals() should follow the new semantics written in the PEP as currently stands.
  • For the rest of the functions, we think the PEP should be updated to explicitly mentioned that the APIs are backwards incompatible. Notice that there is no mention of the backwards compatibility concerns in the "Backwards compatibility" section and references to this are indirect at best.
  • As the PEP says that many of these functions functions are deprecated, they should emit a deprecation warning at compile time as per PEP 387.

Please, comment here or reach out to us if something in these points its not clear or if you want us to consider something else.

In representation of the SC,
Pablo

@gaogaotiantian
Copy link
Member

Thank you for the comments from the SC, it seems like at least we got the deprecation - that's great. If we can get a deprecation warning at compile time, I'm okay with either option. If we keep the same semantics for PyEval_GetLocals() as 3.12, then we don't have backwards incompatible C APIs right?

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Jun 21, 2024

Thanks folks. I think that means the beta2 docs are correct as written, but I'll double check them to make sure.

@gaogaotiantian As far as the backwards compatibility section in the PEP goes, PyEval_GetLocals() ends up in a slightly weird spot, since it is technically unchanged, but because it becomes the sole user of the per-frame shared cache, it also doesn't quite have the same capabilities that it used to have (primarily the fact you can't use it to make additional keys visible to locals() or frame.f_locals anymore, you have to switch to the new frame proxy API to do that).

My PR at https://github.com/python/peps/pull/3809/files covers that (as well as noting that the PEP was ambiguous about this when it was first accepted), so if you approve that as a PEP co-author, we can merge it to cover that part of the SC's request.

@ncoghlan
Copy link
Contributor Author

ncoghlan commented Jun 21, 2024

@pablogsal Just confirming this point, since the SC comment didn't specifically address it, although it had come up earlier in the thread as something to consider changing:

  • the C API equivalent of locals() will be PyEval_GetFrameLocals()
  • the C API equivalent of accessing frame.f_locals will be calling PyFrame_GetLocals() on the result of PyEval_GetFrame()

(This is what the accepted PEP specified, and is what is currently implemented and documented, so leaving it the way it is would be entirely reasonable. @markshannon just noted during this discussion that his intention had been for PyEval_GetFrameLocals() to be a shorthand for calling PyFrame_GetLocals() on the result of PyEval_GetFrame(), but he missed that wasn't what the final PEP text actually specified when it was submitted for approval)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants