-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: push events data from processor to unique users data collector #4827
base: mihir/pipe-1200
Are you sure you want to change the base?
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## mihir/pipe-1200 #4827 +/- ##
===================================================
- Coverage 74.39% 74.36% -0.03%
===================================================
Files 419 421 +2
Lines 49208 49227 +19
===================================================
Hits 36607 36607
- Misses 10211 10225 +14
- Partials 2390 2395 +5 ☔ View full report in Codecov by Sentry. |
processor/processor.go
Outdated
@@ -2154,6 +2165,7 @@ func (proc *Handle) transformations(partition string, in *transformationMessage) | |||
} | |||
|
|||
type storeMessage struct { | |||
gatewayJobs []*jobsdb.JobT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern. By passing gatewayJobs
to the rest of processor stages we won't be able to GC sooner. This could lead to increase memory usage, under some situations.
Alternatively:
- we can pass the computed HLLs instead, but I understand the additional complexity in the code
- store to DB on the first stage of pipeline, breaking the atomicity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @lvrach on this. We can explore computing hll in processJobsForDest
stage. similar to some reporting metrics that are being computed there. and pass the computed value to the store layer and store this in a transaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
went ahead with passing the computed HLLs, tried to think of some other approach as well but couldn't find any better approach
9c9108f
to
1fb9752
Compare
base pr: #4826
Description
push events to uniques users data collector, data collector will create hlls and save it in the database. Later reporting service will read the data from the DB, aggregate it and send it to reporting system
Linear Ticket
pipe-1201
Security