enhance: Optimize clustering compaction #34086

Open: wants to merge 2 commits into master

xiaocai2333
Contributor

issue: #32939

sre-ci-robot added the size/L label (denotes a PR that changes 100-499 lines) Jun 24, 2024
Contributor

mergify bot commented Jun 24, 2024

@xiaocai2333

Invalid PR Title Format Detected

Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:

  1. Title Format: The PR title must begin with one of these prefixes:
  • feat: for introducing a new feature.
  • fix: for bug fixes.
  • enhance: for improvements to existing functionality.
  • test: for adding tests to existing functionality.
  • doc: for modifying documentation.
  • auto: for pull requests from bots.
  2. Description Requirement: The PR must include a non-empty description detailing the changes and their impact.

Required Title Structure:

[Type]: [Description of the PR]

Where Type is one of feat, fix, enhance, test or doc.

Example:

enhance: improve search performance significantly 

Please review and update your PR to comply with these guidelines.
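
The title rule above can be checked mechanically. The following is a minimal sketch in Go, not the bot's actual implementation: the function name validTitle is hypothetical, and the prefix list is copied from the bullet list above (including auto, which the "Where Type is one of" sentence omits).

```go
package main

import (
	"fmt"
	"strings"
)

// allowedPrefixes mirrors the prefixes listed in the bot message (assumed complete).
var allowedPrefixes = []string{"feat", "fix", "enhance", "test", "doc", "auto"}

// validTitle reports whether title has the form "[Type]: [Description]"
// with a known Type and a non-empty description. Illustrative only.
func validTitle(title string) bool {
	prefix, desc, found := strings.Cut(title, ":")
	if !found || strings.TrimSpace(desc) == "" {
		return false
	}
	for _, p := range allowedPrefixes {
		if prefix == p {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(validTitle("Major profiling"))
	fmt.Println(validTitle("enhance: improve search performance significantly"))
}
```

Under this check a title without a prefix, such as "Major profiling", is rejected, while the bot's own example title passes.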

@xiaocai2333
Contributor Author

/hold

Contributor

mergify bot commented Jun 24, 2024

@xiaocai2333 The ut workflow job failed. Comment "rerun ut" to trigger the job again.


codecov bot commented Jun 24, 2024

Codecov Report

Attention: Patch coverage is 49.67742% with 78 lines in your changes missing coverage. Please review.

Project coverage is 80.84%. Comparing base (97db7be) to head (cf661e5).
Report is 2 commits behind head on master.

Current head cf661e5 differs from pull request most recent head 0d0a015

Please upload reports for the commit 0d0a015 to get more accurate results.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #34086      +/-   ##
==========================================
+ Coverage   71.58%   80.84%   +9.25%     
==========================================
  Files        1085     1085              
  Lines      137505   137584      +79     
==========================================
+ Hits        98439   111230   +12791     
+ Misses      34828    22127   -12701     
+ Partials     4238     4227      -11     
| Files | Coverage Δ |
|---|---|
| internal/datacoord/compaction.go | 76.00% <100.00%> (+0.49%) ⬆️ |
| internal/datanode/io/binlog_io.go | 91.48% <100.00%> (+5.77%) ⬆️ |
| pkg/util/paramtable/component_param.go | 98.48% <100.00%> (ø) |
| ...ternal/datanode/compaction/clustering_compactor.go | 62.06% <44.68%> (-1.50%) ⬇️ |

... and 214 files with indirect coverage changes
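
The percentages in the report can be reproduced from the Hits and Lines rows. The sketch below is an illustration, not Codecov's code: truncPercent is a hypothetical helper, and truncation to two decimals is an inference from the numbers shown. The patch figure is consistent with 77 of 155 coverable lines being hit (155 is inferred from 78 missing lines and 49.67742%, not stated in the report).

```go
package main

import "fmt"

// truncPercent returns covered/total as a percentage truncated (not rounded)
// to two decimals. Integer arithmetic avoids float formatting surprises.
func truncPercent(covered, total int) string {
	bp := covered * 10000 / total // hundredths of a percent, truncated
	return fmt.Sprintf("%d.%02d%%", bp/100, bp%100)
}

func main() {
	fmt.Println("base: ", truncPercent(98439, 137505))  // Hits / Lines on master
	fmt.Println("head: ", truncPercent(111230, 137584)) // Hits / Lines on #34086
	fmt.Println("patch:", truncPercent(77, 155))        // assumed 155 coverable lines
}
```

The base and head results match the 71.58% and 80.84% in the Coverage row; the report prints the patch figure with more decimals than this helper does.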

Contributor

mergify bot commented Jun 24, 2024

@xiaocai2333 The ut workflow job failed. Comment "rerun ut" to trigger the job again.

@@ -94,13 +94,16 @@ type clusteringCompactionTask struct {
// vector
segmentIDOffsetMapping map[int64]string
offsetToBufferFunc func(int64, []uint32) *ClusterBuffer

cm storage.ChunkManager
Contributor

seems not used now

Contributor Author

yeah, will remove

@@ -112,14 +115,17 @@ type ClusterBuffer struct {
}

type SpillSignal struct {
Contributor

Let's rename it to FlushSignal; I missed it in the last PR.

Contributor Author

ok
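
For readers following the rename, here is a rough sketch of what the renamed type might look like. The field names come from the accesses visible in the diff (signal.id, signal.writer, signal.pack); the field types, the segmentWriter stand-in, and the action method are assumptions made for illustration and are not taken from the PR.

```go
package main

import "fmt"

// segmentWriter is a hypothetical stand-in for the real writer type.
type segmentWriter struct{ rows int }

// FlushSignal is the proposed new name for SpillSignal.
// Field types are assumed; only the field names appear in the diff.
type FlushSignal struct {
	id     int            // index into the task's cluster buffers
	writer *segmentWriter // writer whose contents should be flushed
	pack   bool           // whether the buffer should also be packed into a segment
}

// action names what a consumer would do with the signal (illustrative helper).
func (s FlushSignal) action() string {
	if s.pack {
		return "flush+pack"
	}
	return "flush"
}

func main() {
	s := FlushSignal{id: 3, writer: &segmentWriter{rows: 1024}, pack: true}
	fmt.Printf("buffer %d: %s, %d rows\n", s.id, s.action(), s.writer.rows)
}
```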

t.clusterBufferLocks.Lock(signal.buffer.id)
defer t.clusterBufferLocks.Unlock(signal.buffer.id)
return t.flushBinlog(ctx, signal.buffer)
return t.flushBinlog(ctx, t.clusterBuffers[signal.id], signal.writer, signal.pack)
Contributor

submit to pool as well?

Contributor Author

ok

return err
}
err = t.packBufferToSegment(ctx, buffer)
writer := buffer.writer
Contributor

submit to pool as well?

Contributor Author

ok

if err != nil {
return err
func (t *clusteringCompactionTask) refreshBufferWriter(buffer *ClusterBuffer) (bool, error) {
var segmentID int64
Contributor

Can we split out a new method, needPack? The logic seems a little mixed together now.

Contributor Author

ok
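
A possible shape for that split, sketched under heavy assumptions: the clusterBuffer type, its fields, and the row-count threshold below are invented for illustration, since the real condition inside refreshBufferWriter is not visible in this conversation. The point is only that the decision (needPack) is separated from the side effect (replacing the writer).

```go
package main

import "fmt"

// clusterBuffer is a simplified, hypothetical stand-in for ClusterBuffer.
type clusterBuffer struct {
	rowsInWriter int64 // rows held by the current writer
	maxRows      int64 // assumed per-segment row limit
}

// needPack answers one question only: is the current writer full enough
// that its contents should be packed into a segment?
func (b *clusterBuffer) needPack() bool {
	return b.rowsInWriter >= b.maxRows
}

// refreshWriter replaces the writer and reports whether the old one
// should be packed. Resetting the counter stands in for allocating a writer.
func (b *clusterBuffer) refreshWriter() (pack bool) {
	pack = b.needPack()
	b.rowsInWriter = 0
	return pack
}

func main() {
	b := &clusterBuffer{rowsInWriter: 2048, maxRows: 1024}
	fmt.Println("pack old writer:", b.refreshWriter())
	fmt.Println("need pack after refresh:", b.needPack())
}
```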

Contributor

mergify bot commented Jun 24, 2024

@xiaocai2333 The ut workflow job failed. Comment "rerun ut" to trigger the job again.

Contributor

mergify bot commented Jun 24, 2024

@xiaocai2333 The E2E Jenkins job failed. Comment "/run-cpu-e2e" to trigger the job again.

xiaocai2333 force-pushed the major_profiling branch 4 times, most recently from 659124f to 1127a39, on June 26, 2024 07:51
mergify bot added ci-passed and removed ci-passed labels Jun 26, 2024
mergify bot added ci-passed and removed ci-passed labels Jun 26, 2024
xiaocai2333 changed the title from "Major profiling" to "enhance: Optimize clustering compaction" Jun 27, 2024
mergify bot added kind/enhancement, ci-passed and removed ci-passed, do-not-merge/invalid-pr-format labels Jun 27, 2024
@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xiaocai2333
To complete the pull request process, please assign czs007 after the PR has been reviewed.
You can assign the PR to them by writing /assign @czs007 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mergify mergify bot removed the ci-passed label Jun 27, 2024
Contributor

mergify bot commented Jun 27, 2024

@xiaocai2333 The E2E Jenkins job failed. Comment "/run-cpu-e2e" to trigger the job again.

Signed-off-by: Cai Zhang <[email protected]>
Labels: dco-passed, do-not-merge/hold, kind/enhancement, size/L
3 participants