Depend on OpenJDK>8.0.0 for PySpark support #1701

Open
wants to merge 2 commits into
base: main

billyvinning
Contributor

Attempt to fix #1678.

A Java implementation (8, 11, or 17) is required to run the PySpark code snippets in the documentation; see #1678 for an example of a failing snippet. This change adds OpenJDK as a Conda dependency and adds it to the list of packages ignored when building the list of pip dependencies.

Add openjdk>8.0.0 to conda environment to fix failing PySpark code snippet
Ignore openjdk in requirements.txt generation script
Fix unstripped package names in requirements.txt builder script

Signed-off-by: Billy Vinning <[email protected]>
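
For readers unfamiliar with this kind of script, the two script changes in the commit above (ignoring `openjdk` and stripping package names) can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function name `conda_to_pip`, the `EXCLUDED` set, and the regular expression are all hypothetical, and the sketch does not translate conda-only pin syntax such as `python=3.11`.

```python
import re

# Hypothetical exclusion set: conda-only packages with no pip equivalent.
EXCLUDED = {"python", "pip", "openjdk"}

# Splits a spec such as "openjdk>8.0.0" into its name and version constraint,
# ignoring surrounding whitespace.
_SPEC = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*(.*?)\s*$")


def conda_to_pip(conda_deps, excluded=EXCLUDED):
    """Return pip requirement strings for the given conda dependency specs."""
    pip_deps = []
    for dep in conda_deps:
        match = _SPEC.match(dep)
        if match is None:
            continue  # skip blank or malformed entries
        name, constraint = match.group(1), match.group(2)
        # Compare the stripped, lowercased name so " openjdk " is still ignored.
        if name.lower() in excluded:
            continue
        pip_deps.append(f"{name}{constraint}")
    return pip_deps
```

Without the stripping step, an entry such as `" openjdk>8.0.0 "` would not match the exclusion set and would leak into `requirements.txt`, where pip cannot install it.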

codecov bot commented Jun 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.60%. Comparing base (812b2a8) to head (04f4d88).
Report is 113 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1701       +/-   ##
===========================================
- Coverage   94.28%   82.60%   -11.68%     
===========================================
  Files          91      117       +26     
  Lines        7013     8715     +1702     
===========================================
+ Hits         6612     7199      +587     
- Misses        401     1516     +1115     


@cosmicBboy
Collaborator

Thanks, @billyvinning!

This change is purely for the docs, right? If so, I think installing openjdk (conda) or install-jdk (PyPI) in the readthedocs config and in the CI would be the better approach: https://github.com/unionai-oss/pandera/blob/main/.github/workflows/ci-tests.yml#L145

Use `install-jdk` to install OpenJDK 11 Temurin as part of CI extra tests
and readthedocs env set up.
Remove `openjdk` from environment.yaml
Remove `openjdk` from exclusion list of pip deps generation script.

Signed-off-by: Billy Vinning <[email protected]>
@billyvinning
Contributor Author

> Thanks, @billyvinning!
>
> This change is purely for the docs, right? If so, I think installing openjdk (conda) or install-jdk (PyPI) in the readthedocs config and in the CI would be the better approach: https://github.com/unionai-oss/pandera/blob/main/.github/workflows/ci-tests.yml#L145

Yes, you're right. In that case, let's try using install-jdk as part of the readthedocs post-install script.

Since you mentioned it, I did the same with the extras-tests job within the CI. A Java implementation ships with each workflow runner image but I noticed the version varies between platforms. So, let me know whether it is preferable to pin the Java version for the tests.
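
To make the version question concrete, here is a small sketch of how a test setup could check which Java major version a runner provides before deciding whether pinning is needed. It is an illustration only: the helper names `java_major_version` and `is_supported` are hypothetical, and the supported set simply mirrors the 8/11/17 versions named in the PR description.

```python
import re

# Major versions named in the PR description as usable with PySpark.
SUPPORTED_MAJORS = {8, 11, 17}


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_412").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def is_supported(version_output, supported=SUPPORTED_MAJORS):
    """Return True if the reported Java major version is in the supported set."""
    return java_major_version(version_output) in supported
```

In practice the input would come from something like `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`, since `java -version` writes its output to standard error rather than standard output.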

Successfully merging this pull request may close these issues.

pyspark_sql docs run time error