Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gazelle: Sanitize distribution names in modules mapping / manifest, not at runtime #1939

Open
wingsofovnia opened this issue Jun 3, 2024 · 6 comments
Labels
gazelle Gazelle plugin related issues

Comments

@wingsofovnia
Copy link

馃殌 feature request

Relevant Rules

  • modules_mapping
  • gazelle_python_manifest
  • gazelle

Description

Gazelle plugin uses gazelle_python.yaml (gazelle:python_manifest_file_name) to derive deps for py targets. One might expect that a dependency label will be ${manifest.pip_repository}//${manifest.modules_mapping.some.import}, however, Gazelle does extra sanitisation right before using the mapping value here.

Although it is not a big deal when used with rules_python, it makes harder to reuse Gazelle plugin with other python rules, more specifically rules_pycross, which has a different target naming convention, which is aligned with Python normalised names. In this ticket, there is an approach (or hack really) described for patching gazelle_python.yaml but further sanitisation that happens inside the plugin itself makes it impossible to use.

I do understand that compatibility with other rules most likely isn't of highest priority, here are some other reasons:

  • It would be slightly less confusing when target name derivation (normalisation / alignment) happens in once place, rather than in modules_mapping and/or gazelle_python_manifest and/or plugin itself.
  • Given that there are ongoing efforts to Pull rules_pycross into rules_python聽#1360, it might be a good improvement going forward.

Describe the solution you'd like

  • Do not SanitizeDistribution in pythonconfig.Config#FindThirdPartyDependency, take the value as is.
  • Do SanitizeDistribution in modules_mapping, or gazelle_python_manifest. I'd say modules_mapping is best but please advice.
@aignas aignas added the gazelle Gazelle plugin related issues label Jun 4, 2024
@aignas
Copy link
Collaborator

aignas commented Jun 10, 2024

A few thoughts here before I forget :)

I think that returning the mapping of import to python package from modules_mapping is still a good idea, because given a whl file the returned values will be the same, no matter what hub repo the file is used in (this is useful for bigger monorepos where there could be many requirements files).

Being able to have a gazelle_python.yaml file (or a directive) that allows us to configure the gazelle plugin to not rely on any naming convention would be ideal, but not sure if it is possible here. Having target_convention: @pypi//{distribution}.replace("-", "_") in the YAML could be a great way for the user to control that, but I am not sure it is feasible. Having @pypi//:{distribution} is another alternative that could work. That would at least fix the @pypi//:{distribution} vs @pypi//{distribution} differences.

Regarding - vs _ it does sound like it would be best to converge to a single solution, but I am not sure what that is. There has been very little work done on #1360 front and I am not sure how it will work out in the future.

Being devil's advocate, you could also argue that changing the gazelle plugin to ensure that rules_pycross integrates well with the rules_python gazelle plugin is a code smell. Creating a wrapper around rules_pycross hub repo, that would externally look like rules_python one is not too difficult (but tedious) if there is a known naming scheme. So doing that could be another workaround to consider.

@wingsofovnia
Copy link
Author

Creating a wrapper around rules_pycross hub repo, that would externally look like rules_python one is not too difficult (but tedious) if there is a known naming scheme. So doing that could be another workaround to consider.

There was an attempt here jvolkman/rules_pycross#89. It gives some clues why the author decided not to accept it. Basically, it is inverse to your point that pycross ideas shouldn't be bend for the sake of satisfying another plugin.

You could also argue that changing the gazelle plugin to ensure that rules_pycross integrates well with the rules_python gazelle plugin is a code smell.

I somewhat agree, but someone should make a first step unless the goal is to keep Gazelle plugin a closed garden I guess :) I'd say it should be the side where it gives a decoupling effect (Gazelle plugin) rather than an ugly bridge/adapter (rules_pycross).

Perhaps I see it is incorrectly, but gazelle_python.yaml to me is/should be just a lookup dict for Gazelle plugin. While I agree it is better to keep modules_mapping as is, I'd see any target transformation/normalization happening on gazelle_python.yaml generation which the plugin should afterwards take as a 1:1 mapping and refrain from adding any configuration to alter runtime behaviour there. Gazelle directives sounds like a more common/appropriate place to alter Gazelle (incl. plugins) behaviour to me.

How about having a target_convention attr to gazelle_python_manifest? Something like:

gazelle_python_manifest(
    name = "requirements_mapping",
    modules_mapping = ":requirements_metadata",
    pip_repository_name = "pip",
    target_convention = "{{ printf "%s//:%s" $pip_repository_name (regexReplaceAll "[-._]+" (lower $target) "-") }}",
    requirements = ":requirements_txt",
)

with default value of {{ printf "%s//%s" $pip_repository_name (regexReplaceAll "[_\\.]" (lower $target) $target "_") }}. pip_repository_name attr might also be deprecated here to fully use target_convention while keeping backwards compatibility. target_convention can use text/template with extra functions like regexReplaceAll from sprig, or it could be some custom simplified version of it.

@aignas
Copy link
Collaborator

aignas commented Jun 10, 2024 via email

@wingsofovnia
Copy link
Author

wingsofovnia commented Jun 11, 2024

What I was trying to say w.r.t. gazelle_python.yaml is to keep it a simple mapping, rather than a using gazelle_python.yaml as a plugin configuration. I've suggested to put the config into gazelle_python_manifest rule, but in fact most plugins I've seen would probably use directives, including Gazelle itself. This plugin has also a long track of using directives with #1819 being the latest addition.

I like the idea of normalization enum. It could be:

Alternatively, it could be convention [used by the plugin I (Gazelle) generate code for].

However, I would put it into a directive, rather than gazelle_python.yaml:

### rules_python
# gazelle:label_format $repo//$distribution
# gazelle:label_distribution_normalization snake_case

### rules_pycross
# gazelle:label_format $repo//:$distribution
# gazelle:label_distribution_normalization pep-503

No changes to gazelle_python.yaml needed. Backwards compatible and looks equally non-invasive. What do you think?

@aignas
Copy link
Collaborator

aignas commented Jun 12, 2024

Sounds great! Keeping gazelle_python.yaml the same is even better.

Bike shedding a little here:

  • I am wondering if pep503 or pep-503 is better. I see it written without a dash most often, so that would be my preference.

@wingsofovnia
Copy link
Author

wingsofovnia commented Jun 18, 2024

pep503 it is, @aignas 馃檪
I filed a PR #1976, would be great to get your feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gazelle Gazelle plugin related issues
Projects
None yet
Development

No branches or pull requests

2 participants