[SUPPORT] The clean service can't clean historical version files after the savepoint instant when I set hoodie.archive.beyond.savepoint=true
#11405
The clean service can't clean historical version files after the savepoint instant when I set hoodie.archive.beyond.savepoint=true.
To Reproduce
Expected behavior
Old commit data should be cleaned up according to the clean policy.
Environment Description
Additional context
I found that in the `HoodieDefaultTimeline.getFirstNonSavepointCommit` method, the `savepointTimestamps` set is always empty, even though the savepoint instant already exists. This happens because, in the `CleanPlanner.getFilesToCleanKeepingLatestCommits` method, the call to `fileSystemView.getAllFileGroups` retrieves all file groups in the partition path, but the `HoodieTimeline` in `HoodieFileGroup` only matches the following actions: `COMMIT_ACTION`, `DELTA_COMMIT_ACTION`, `COMPACTION_ACTION`, `LOG_COMPACTION_ACTION`, `REPLACE_COMMIT_ACTION`. Savepoint instants are therefore filtered out of that timeline. Consequently, when `getFirstNonSavepointCommit` is called, it never returns the first instant beyond the savepoint instant. As a result, historical version files are never cleaned.

Call chain:

CleanPlanner.getFilesToCleanKeepingLatestCommits -> fileSystemView.getAllFileGroups -> AbstractTableFileSystemView.addFilesToView -> this.visibleCommitsAndCompactionTimeline = visibleActiveTimeline.**getWriteTimeline** -> fileGroup.getAllFileSlices -> HoodieDefaultTimeline.getFirstNonSavepointCommit
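To make the filtering effect concrete, here is a minimal, self-contained sketch of the behaviour described above. It does not use Hudi's classes: `TimelineSketch`, `Instant`, `writeTimeline`, `savepointTimestamps`, and `sampleTimeline` are hypothetical names, the action strings are placeholder labels rather than Hudi's real constant values, and the timestamps are made up. It only illustrates that once a timeline is restricted to write actions, a later scan for savepoint instants on that same timeline comes back empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Simplified stand-in for a timeline; NOT Hudi's actual implementation.
class TimelineSketch {

    // Hypothetical minimal instant: an action label plus a timestamp.
    static final class Instant {
        final String action;
        final String timestamp;

        Instant(String action, String timestamp) {
            this.action = action;
            this.timestamp = timestamp;
        }
    }

    // Placeholder labels for the write actions listed in the issue.
    static final Set<String> WRITE_ACTIONS = new HashSet<>(Arrays.asList(
            "COMMIT_ACTION", "DELTA_COMMIT_ACTION", "COMPACTION_ACTION",
            "LOG_COMPACTION_ACTION", "REPLACE_COMMIT_ACTION"));

    // Keeps only write actions, mimicking the effect of a write-only timeline.
    static List<Instant> writeTimeline(List<Instant> all) {
        return all.stream()
                .filter(i -> WRITE_ACTIONS.contains(i.action))
                .collect(Collectors.toList());
    }

    // Collects the timestamps of savepoint instants present in the given timeline.
    static Set<String> savepointTimestamps(List<Instant> timeline) {
        return timeline.stream()
                .filter(i -> i.action.equals("SAVEPOINT_ACTION"))
                .map(i -> i.timestamp)
                .collect(Collectors.toSet());
    }

    // Illustrative timeline: three commits and a savepoint taken on the second one.
    static List<Instant> sampleTimeline() {
        List<Instant> all = new ArrayList<>();
        all.add(new Instant("COMMIT_ACTION", "001"));
        all.add(new Instant("COMMIT_ACTION", "002"));
        all.add(new Instant("SAVEPOINT_ACTION", "002"));
        all.add(new Instant("COMMIT_ACTION", "003"));
        return all;
    }

    public static void main(String[] args) {
        List<Instant> all = sampleTimeline();
        System.out.println("Savepoints in full timeline:  " + savepointTimestamps(all));
        System.out.println("Savepoints in write timeline: " + savepointTimestamps(writeTimeline(all)));
    }
}
```

In this sketch the savepoint is visible on the full timeline but disappears after the write-action filter, which matches the empty `savepointTimestamps` set reported above.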