Poor performance of api calls when large pipeline w/ many when OR statements #19710

Open
drewmiranda-gl opened this issue Jun 21, 2024 · 4 comments
@drewmiranda-gl
Member

Poor performance of /api/system/pipelines/rule if a large pipeline rule exists

This appears to affect both

  • /api/system/pipelines/rule
  • /api/system/pipelines/rule/<pipeline rule id>

There appears to be an upper limit on how many characters a Graylog pipeline rule can contain. Beyond this limit, performance is very bad (a comparison follows further down). It's not clear what this limit is.

For the purposes of this issue, I'm calling my test rule, which is 132,615 characters long, a "mega rule".

Load times of the above api endpoints:

| Scenario | Load Time |
| --- | --- |
| No mega rule, many other pipeline rules present | 136ms |
| Mega rule created | 11s |
| Mega rule split into 2 rules, rule 1 of 2 created | 1.81s |
| Mega rule split into 2 rules, both rules created | 3.14s |

The second and fourth scenarios above both load exactly the same total character/byte count, but splitting the rule so that no single rule exceeds some size X seems to significantly improve performance (11s vs. 3.14s).

I'm not sure whether it matters that splitting the rule keeps each part under 64k.
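The split described above can be sketched as a small script. This is only an illustration, not how the original mega rule was split: the function names, the placeholder `set_field("matched", true)` action, and the character budget are all assumptions, and the threshold at which performance degrades is still unknown.

```python
def render_rule(name, conditions):
    """Render a pipeline rule whose when clause ORs all given conditions."""
    when = "\n  || ".join(conditions)
    # The then block is a hypothetical placeholder action, not from the issue.
    return (
        f'rule "{name}"\nwhen\n  {when}\n'
        'then\n  set_field("matched", true);\nend\n'
    )


def split_conditions(conditions, max_chars, name_prefix="split"):
    """Greedily pack conditions into rules so each rendered rule fits max_chars.

    A single condition that is itself longer than the budget still gets its
    own rule, so that rule may exceed max_chars.
    """
    rules, current = [], []
    for cond in conditions:
        candidate = current + [cond]
        name = f"{name_prefix}-{len(rules) + 1}"
        if current and len(render_rule(name, candidate)) > max_chars:
            # Adding this condition would overflow: close the current rule.
            rules.append(render_rule(name, current))
            current = [cond]
        else:
            current = candidate
    if current:
        rules.append(render_rule(f"{name_prefix}-{len(rules) + 1}", current))
    return rules
```

Splitting an OR chain across several rules only preserves behavior if it is acceptable for the action to run once per matching rule, so this fits idempotent actions best.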

One last thing to note is that the Graylog frontend is VERY good at not reissuing these API requests as you navigate throughout the product. The slow load only shows up on a cold page load (refreshing the browser or opening a link in a new tab also triggers it).

For what it's worth, this appears to be CPU-bound on the graylog-server side. Throwing more CPU at it will likely improve load times. My concern isn't so much the load times themselves; it's that there is such a big difference between the mega rule and the same rule split into 2 rules.
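To compare scenarios without relying on browser timings, the endpoint can be timed directly. The following is a minimal sketch that assumes HTTP basic auth is accepted by the API; the helper names, base URL, and credentials are hypothetical, and only the endpoint path comes from this issue.

```python
import base64
import statistics
import time
import urllib.request

RULE_ENDPOINT = "/api/system/pipelines/rule"  # path taken from this issue


def build_request(base_url, username, password):
    """Build a GET request for the rule listing (basic auth is an assumption)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        base_url.rstrip("/") + RULE_ENDPOINT,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )


def fetch(request):
    """Perform the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()


def time_call(fn, repeats=5):
    """Return the median wall-clock duration of fn() in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

A run against a test instance might look like `time_call(lambda: fetch(build_request("http://localhost:9000", "admin", "password")))`, repeated once per scenario in the table.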

Expected Behavior

  • Interacting with pipelines and pipeline rules is performant (if I had to pick a number: <3s load time)
  • If there is a limit on pipeline rule character count, we should show a warning in the UX and possibly disallow rules that exceed it

Current Behavior

  • Interacting with pipelines and pipeline rules is VERY slow if a pipeline rule is "too large" (Unclear what this threshold is)

Possible Solution

Steps to Reproduce (for bugs)

See above but please let me know if there are any questions.

Context

Related to a customer. Let's discuss internally.

Your Environment

  • Graylog Version: 6.0.3
  • Java Version: Bundled
  • OpenSearch Version: 2.12.0
  • MongoDB Version: 6.0.15
  • Operating System: Ubuntu Server 22.04 LTS
  • Browser version: Version 126.0.6478.63 (Official Build) (arm64)
@drewmiranda-gl
Member Author

New data point: replacing all occurrences of has_field with true in the when section performs significantly better despite the total number of characters for the pipeline rule still being quite large (e.g. 75k+)

@tellistone

tellistone commented Jun 24, 2024

Established that the slow load time is associated with the when clauses here (hundreds of OR statements). Went from load time of 11s+ to near immediate.

Example:

|| has_field("AFieldName")

We might establish why this performs so poorly compared to an equal number of when statements:

rename_field("FieldName", "NewFieldName")

@tellistone changed the title on Jun 24, 2024
  from: Poor performance of /api/system/pipelines/rule if a large pipeline rule exists
  to: Poor performance of || has_field("AFieldName")
@drewmiranda-gl
Member Author

I wanted to clarify that my findings concern the load times of the API endpoints that return pipeline rules. They are not about pipeline throughput or processing times, which I did not test. I find the new name of this issue a bit ambiguous, so I wanted to clarify.

I also don't know whether this poor API endpoint performance is specific to a large number of OR has_field statements or whether the same thing happens for any large when section of a pipeline rule.
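One way to answer that open question is to generate otherwise identical rules that differ only in the predicate used in the when section, then compare endpoint load times for each. The generator below is a hedged sketch: the function name, field names, and the placeholder action are made up for illustration, and it produces rule source text only.

```python
def build_when_heavy_rule(name, clause_count, predicate='has_field("field_{i}")'):
    """Build rule source whose when section ORs clause_count predicates.

    predicate is a str.format template; "{i}" is replaced by the clause index.
    Predicates that contain literal braces would need them doubled.
    """
    clauses = [predicate.format(i=i) for i in range(clause_count)]
    when = "\n  || ".join(clauses)
    # The then block is a hypothetical placeholder action, not from the issue.
    return (
        f'rule "{name}"\nwhen\n  {when}\n'
        'then\n  set_field("matched", true);\nend\n'
    )


# Three variants of the same size class, differing only in the predicate.
variants = {
    "has_field": build_when_heavy_rule("variant-has-field", 500),
    "literal_true": build_when_heavy_rule("variant-true", 500, predicate="true"),
    "comparison": build_when_heavy_rule(
        "variant-compare", 500, predicate='to_string($message.field_{i}) == "x"'
    ),
}
```

If only the `has_field` variant is slow, the function is the likely cause; if all three are slow, the length of the OR chain itself is the more likely culprit.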

@tellistone changed the title on Jun 25, 2024
  from: Poor performance of || has_field("AFieldName")
  to: Poor performance of api calls when large pipeline w/ many when OR statements
@tellistone

Fair points, I've adjusted the title
