feature: on demand docker environments #1796

Open · wants to merge 41 commits into master

Conversation

@saikonen (Collaborator) commented Apr 10, 2024

Adds support for on-demand Docker images. Open for discussion.

Open TODOs:

  • add fallback to conda environment when executing locally with --environment docker
  • add support for imagebakery endpoint auth
  • cache bakery image tags locally to avoid environment drift with consecutive deployments when using loosely pinned versions

@saikonen marked this pull request as ready for review April 29, 2024 11:19
@savingoyal (Collaborator):

python for.py --environment=docker run without @kubernetes immediately returns an error that Image Bakery is not configured. Perhaps it shouldn't throw the error?

get_pinned_conda_libs,
)

BAKERY_METAFILE = ".imagebakery-cache"

Collaborator:

the imagebakery-cache can also be made an optimization detail within docker_environment.py

Collaborator:

also, can you help me understand the contents of this cache?

Collaborator Author (saikonen):

moved caching to the environment side. What the cache contains at the moment (see the sketch below):

  • a unique hash derived from the requested environment (python, packages, image-type etc.)
  • the image tag that was returned from the bakery endpoint for the request
  • we also persist the request payload that was sent to the bakery in order to be able to list details of cached images via docker cache list
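
For illustration, a rough sketch of what one cache entry could look like, assuming a JSON file keyed by the environment hash; the field names and layout here are assumptions, not the actual on-disk format:

# Hypothetical shape of one .imagebakery-cache entry; keys are assumptions.
{
    "3f2a9c...": {  # unique hash derived from the requested environment
        "image": "registry.example.com/bakery:py311-abc123",  # tag returned by the bakery
        "request": {  # payload sent to the bakery, persisted for `docker cache list`
            "pythonVersion": "3.11",
            "imageKind": "esgz-zstd",
            "pipRequirements": {"pandas": ">=2.0"},
        },
    }
}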

import hashlib
import json
import requests

Collaborator:

Would something of this sort work here?

Suggested change
import json
import requests


class FastBakery(object):
    def __init__(self, url):
        self.url = url

    def bake(
        self,
        python_ver=None,  # TODO - Ensure the keys are the same in the API spec and client
        pypi_packages=None,
        conda_packages=None,
        base_image=None,
        image_kind="esgz-zstd",
    ):
        # Recursively drop None values and empty dicts from the request payload.
        deep_clean = (lambda f: lambda d: f(f, d))(
            lambda self, item: {
                k: v
                for k, v in (
                    (k, self(self, v)) if isinstance(v, dict) else (k, v)
                    for k, v in item.items()
                )
                if v is not None and v != {}
            }
        )
        payload = deep_clean(
            {
                "pythonVersion": python_ver,
                "imageKind": image_kind,
                "pipRequirements": pypi_packages,
                "condaMatchspecs": conda_packages,
                "baseImage": {"imageReference": base_image},
            }
        )
        return self.aws_iam_invoker(payload)

    def aws_iam_invoker(self, payload):
        try:
            # AWS_IAM requires a signed request to be made
            headers = {"Content-Type": "application/json"}
            payload = json.dumps(payload)

            import boto3
            from botocore.auth import SigV4Auth
            from botocore.awsrequest import AWSRequest
            from botocore.exceptions import BotoCoreError, NoCredentialsError

            session = boto3.Session()
            credentials = session.get_credentials().get_frozen_credentials()
            # credits to https://github.com/boto/botocore/issues/1784#issuecomment-659132830,
            # We need to jump through some hoops when calling the endpoint with IAM auth
            # as botocore does not offer a direct utility for signing arbitrary requests
            req = AWSRequest("POST", self.url, headers, payload)
            SigV4Auth(
                credentials, service_name="lambda", region_name=session.region_name
            ).add_auth(req)

            response = requests.post(req.url, data=req.body, headers=req.headers)
            # Manually check the status code and raise an exception if it's an error
            if response.status_code >= 400:
                raise requests.exceptions.HTTPError(
                    f"HTTP {response.status_code} Error: {response.reason}\nResponse Body: {response.text}",
                    response=response,
                )
            return response
        except requests.exceptions.HTTPError:
            raise
        except (requests.exceptions.RequestException, BotoCoreError, NoCredentialsError) as e:
            # Handle specific errors related to requests and AWS services
            raise Exception(f"Failed to invoke AWS service due to network or AWS error: {e}")
        except Exception as e:
            # Handle other possible exceptions
            raise Exception(f"An unexpected error occurred: {e}")

self.bake_image_for_step(step)
echo("Environments are ready!")
if self.steps_to_delegate:
    # TODO: add debug echo to output steps that required a conda environment.
Collaborator:

these comments too

self.delegate.validate_environment(echo, self.datastore_type)
self.delegate.init_environment(echo, self.steps_to_delegate)

def bake_image_for_step(self, step):
Collaborator:

multiple steps might share the same image - maybe we can check if that is the case and only bake unique images?

Collaborator Author (saikonen):

the images are baked in sequence, so consecutive steps with identical packages should hit the cache at this point, skipping the bake. Previously the caching happened inside fast_bakery, but it has moved to the environment side now, so it should be clearer.

Collaborator:

rather than baking in sequence, we can perhaps bake them in parallel - that should save precious seconds
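
For example, a minimal sketch of what parallel baking could look like with a thread pool; bake_image_for_step comes from this PR, while steps_to_bake and environment_hash are hypothetical helpers:

from concurrent.futures import ThreadPoolExecutor

def bake_images(self, echo):
    # Deduplicate steps that resolve to the same environment hash so each
    # unique image is baked only once, then bake the unique images concurrently.
    unique = {self.environment_hash(step): step for step in self.steps_to_bake}
    with ThreadPoolExecutor() as executor:
        # list() drains the iterator so worker exceptions surface here.
        list(executor.map(self.bake_image_for_step, unique.values()))
    echo("Environments are ready!")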

@@ -23,6 +24,9 @@
DEFAULT_SECRETS_BACKEND_TYPE,
AWS_SECRETS_MANAGER_DEFAULT_REGION,
S3_SERVER_SIDE_ENCRYPTION,
FAST_BAKERY_URL,
FAST_BAKERY_AUTH,
FAST_BAKERY_TYPE,
Collaborator:

we can avoid introducing these as env vars here and in @kubernetes, with the expectation that using fast bakery may involve using @secrets or @environment to configure the URL and auth. The client can try to read from env vars unless explicitly configured with the URL and auth?
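
Something along those lines in the client could look like the sketch below; the env var names mirror the config keys in this diff, but the exact resolution order is an assumption:

import os

class FastBakery(object):
    def __init__(self, url=None, auth=None):
        # Prefer explicit configuration; otherwise fall back to env vars that
        # @secrets or @environment could populate at task runtime (a sketch,
        # not the final API).
        self.url = url or os.environ.get("FAST_BAKERY_URL")
        self.auth = auth or os.environ.get("FAST_BAKERY_AUTH")
        if not self.url:
            raise RuntimeError("Image Bakery is not configured: set FAST_BAKERY_URL")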

# a conda decorator always exists alongside pypi so this needs to be accounted for
dependency_deco = pypi_deco if pypi_deco is not None else conda_deco
if dependency_deco is not None:
    python = dependency_deco.attributes["python"]
Collaborator:

python comes from conda_deco

@@ -65,7 +65,7 @@ def validate_environment(self, echo, datastore_type):
micromamba = Micromamba()
self.solvers = {"conda": micromamba, "pypi": Pip(micromamba)}

def init_environment(self, echo):
def init_environment(self, echo, only_steps=[]):
Collaborator Author (saikonen):

@romain-intel I'd like your thoughts on this mechanism.

The reason we need something akin to this is to support the environment fallback logic in docker_environment.py.

  1. the docker env gathers a list of steps that it cannot act on (conda/pypi but not executing remotely)
  2. for the incompatible steps we want to fall back to the conda implementation, and initialize accordingly
  3. only_steps is meant to skip the steps that we were able to bake a docker image for

Would this be a good addition to the environment init flow? Are there better alternatives to this?
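
For reference, a minimal sketch of how only_steps could gate the existing init flow; the surrounding internals are assumptions:

def init_environment(self, echo, only_steps=[]):
    # Restrict environment resolution to the delegated steps; when only_steps
    # is empty, behave exactly as before and resolve every step.
    steps = [
        step
        for step in self.flow
        if not only_steps or step.name in only_steps
    ]
    for step in steps:
        self.solve_environment_for_step(step)  # hypothetical helper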

@@ -64,7 +64,7 @@ def step_init(self, flow, graph, step, decos, environment, flow_datastore, logge
# one here. We can consider introducing a UX where @pypi is able to consume
# poetry.lock files in the future.

_supported_virtual_envs = ["conda"] # , "venv"]
_supported_virtual_envs = ["conda", "docker"] # , "venv"]
Collaborator Author (saikonen):

@romain-intel this relates to discussions on decorator improvements. Should this be reliably extendable through extensions?

Main motivation is that this whole imagebakery feature could be a standalone extension, except for the few additions it needs to make to the OSS PyPI plugin.

Collaborator Author (saikonen):

_supported_virtual_envs is just an implementation detail here, but what I'm pitching is adding a mechanism where a newly introduced environment could add itself to a whitelist for known decorators.

@saikonen (Collaborator Author):

the changes allow an extension to implement

def conda_supported_virtual_envs():
    return ["docker"]

in org_ext/config/mfextinit_org_ext.py to support --environment docker for @pypi and @conda.
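
On the consuming side, the decorator could then merge whatever environments extensions contribute, roughly like this; the hook-loading call is hypothetical, not the PR's actual mechanism:

# Sketch of step_init merging extension-provided environments into the whitelist.
_supported_virtual_envs = ["conda", "docker"]  # , "venv"]
for hook in load_extension_hooks("conda_supported_virtual_envs"):  # hypothetical loader
    _supported_virtual_envs.extend(hook())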


@saikonen (Collaborator Author):

also possible to fall back to a Conda environment for steps that cannot be handled by another env implementation:

def init_environment(self, echo):
    self.delegate = CondaEnvironment(self.flow)
    ...
    if self.steps_to_delegate:
        self.delegate.validate_environment(echo, self.datastore_type)
        self.delegate.init_environment(echo, self.steps_to_delegate) # skips steps that were already handled.
