Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

async tracking #970

Merged
merged 6 commits into from
Jun 27, 2024
Merged

async tracking #970

merged 6 commits into from
Jun 27, 2024

Conversation

elijahbenizzy
Copy link
Collaborator

@elijahbenizzy elijahbenizzy commented Jun 20, 2024

Async tracking client:

from hamilton_sdk import tracker
        tracker_async = adapters.AsyncHamiltonTracker(
            project_id=1,
            username="elijah",
            dag_name="async_tracker",
        )
        dr = (
            await h_async
            .Builder()
            .with_modules(async_module)
            .with_adapters(tracking_async)
            .build()
        )

Changes

Async hooks + tracking client

How I tested this

Unit tests, manually (will be adding more.

Notes

New async builder -- have to call await h_async.Builder()....build()

TODO:

  • Add better description here
  • Clean up code
    • Multiple commits
  • Batch mode
    • Transfer over batch mode impl from sync version
    • Figure out how to not have init, or make it simpler
  • Get it working with builder
    • API -- decide on an easy integration (.as_async() in driver.Builder is current pref)
  • Add any relevant documentation
    • How to use it
    • Caveats
    • Docstrings
  • Determine whether I want .ainit() or not...
  • Add example into the async example
  • Add more unit tests

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

@elijahbenizzy elijahbenizzy force-pushed the async-tracking branch 7 times, most recently from 6a4a6b1 to 362f645 Compare June 25, 2024 04:45
@elijahbenizzy elijahbenizzy force-pushed the async-tracking branch 3 times, most recently from b894880 to 24bfb6d Compare June 25, 2024 05:23
Fixes some issues around asynchonrous hooks.

This also adds an API for whether we have a synchronous *or* an
asynchronous hook. Adds an async API for calling all lifecycle hooks whether
they are sync or async.
@elijahbenizzy elijahbenizzy changed the title WIP for async tracking async tracking Jun 25, 2024
@elijahbenizzy elijahbenizzy marked this pull request as ready for review June 25, 2024 19:03
Copy link
Collaborator

@skrawcz skrawcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably schedule time to walkthrough this -- e.g. how to fix not flushing properly...

@elijahbenizzy
Copy link
Collaborator Author

elijahbenizzy commented Jun 26, 2024

we should probably schedule time to walkthrough this -- e.g. how to fix not flushing properly...

Sure. Flushing does work properly. It occasionally flushes after the end of a request rather than before it, but that's expected behavior.

@elijahbenizzy elijahbenizzy force-pushed the async-tracking branch 3 times, most recently from 1372f82 to becae6a Compare June 26, 2024 14:30


@asynccontextmanager
async def lifespan(app: fastapi.FastAPI):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a second endpoint to compare latency/penalty for the tracker.

1. Adds a builder API specifically for the asynchonous driver
2. Fixes up the ASyncGraphAdapter to call hooks in a clean way

This does something clever -- the graph adapter takes in the hooks,
which then run async while the function is running. Synchronous hooks
will work, but they won't be called at quite the right time.
This has a basic queuing system -- we laucnh a periodic task that runs
and flushes out a queue -- it times out at the flush interval and
flushes everything it has. This will often flush after a request is done
but it will always flush.
"Please pass in a single result builder"
)
elif len(result_builders) == 0:
result_builders = [base.DictResult()]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO remove this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the next line None if there are none

But only if the server is running
@elijahbenizzy elijahbenizzy force-pushed the async-tracking branch 3 times, most recently from 9ebd205 to 687a5f0 Compare June 27, 2024 00:01
Its time we support this.

Decisions:
1. Did not want to keep it in hamilton.driver, that's getting too
   bloated and there's name collision
2. Moved over docs to point to the right place (as much as I could)
3. Left it backwards compatible with a warning to update the import
@elijahbenizzy elijahbenizzy force-pushed the async-tracking branch 3 times, most recently from 58a3936 to cdc56cf Compare June 27, 2024 13:12
@elijahbenizzy elijahbenizzy merged commit 2bf6aa1 into main Jun 27, 2024
22 checks passed
@elijahbenizzy elijahbenizzy deleted the async-tracking branch June 27, 2024 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants