GitProvenance is very slow getting committers on a large Git history #4210

Open
Bananeweizen opened this issue May 23, 2024 · 2 comments
Labels
bug, performance

Comments

@Bananeweizen
Contributor

What version of OpenRewrite are you using?

I am using

  • Maven plugin 5.32.0

What did you see instead?

After the following line of output

[INFO] Validating active recipes...

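there is a delay of several minutes when I run locally. When I noticed this, I attached a profiler. The GitProvenance calculation of the committers takes forever. In the following screenshot it consumes around 7 minutes, and that was measured only after I had noticed the delay and attached the profiler, so the real total is some minutes longer.
[Screenshot: profiler output for the GitProvenance committers calculation]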
The git repo has 67K commits and 63K committed files (currently). I'm on Windows.

That information may be useful when running on moderne.io infrastructure, but for local execution of most recipes it is probably not needed, so this should be improved.

@Bananeweizen added the bug label on May 23, 2024
@timtebeek
Contributor

Oh wow, the picture really speaks volumes here; thanks for including that! The getCommitters(Repository) method was added last year and is called from this group of methods:

    /**
     * @param projectDir  The project directory.
     * @param environment In detached head scenarios, the branch is best
     *                    determined from a {@link BuildEnvironment} marker if possible.
     * @return A marker containing git provenance information.
     */
    @Nullable
    public static GitProvenance fromProjectDirectory(Path projectDir, @Nullable BuildEnvironment environment) {
        if (environment != null) {
            if (environment instanceof JenkinsBuildEnvironment) {
                JenkinsBuildEnvironment jenkinsBuildEnvironment = (JenkinsBuildEnvironment) environment;
                try (Repository repository = new RepositoryBuilder().findGitDir(projectDir.toFile()).build()) {
                    String branch = jenkinsBuildEnvironment.getLocalBranch() != null
                            ? jenkinsBuildEnvironment.getLocalBranch()
                            : localBranchName(repository, jenkinsBuildEnvironment.getBranch());
                    return fromGitConfig(repository, branch, getChangeset(repository));
                } catch (IllegalArgumentException | GitAPIException e) {
                    // Silently ignore if the project directory is not a git repository
                    printRequireGitDirOrWorkTreeException(e);
                    return null;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            } else {
                File gitDir = new RepositoryBuilder().findGitDir(projectDir.toFile()).getGitDir();
                if (gitDir != null && gitDir.exists()) {
                    //it has been cloned with --depth > 0
                    return fromGitConfig(projectDir);
                } else {
                    //there is not .git config
                    try {
                        return environment.buildGitProvenance();
                    } catch (IncompleteGitConfigException e) {
                        return fromGitConfig(projectDir);
                    }
                }
            }
        } else {
            return fromGitConfig(projectDir);
        }
    }

    private static void printRequireGitDirOrWorkTreeException(Exception e) {
        if (!"requireGitDirOrWorkTree".equals(e.getStackTrace()[0].getMethodName())) {
            e.printStackTrace();
        }
    }

    @Nullable
    private static GitProvenance fromGitConfig(Path projectDir) {
        String branch = null;
        try (Repository repository = new RepositoryBuilder().findGitDir(projectDir.toFile()).build()) {
            String changeset = getChangeset(repository);
            if (!repository.getBranch().equals(changeset)) {
                branch = repository.getBranch();
            }
            return fromGitConfig(repository, branch, changeset);
        } catch (IllegalArgumentException e) {
            printRequireGitDirOrWorkTreeException(e);
            return null;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    private static GitProvenance fromGitConfig(Repository repository, @Nullable String branch, @Nullable String changeset) {
        if (branch == null) {
            branch = resolveBranchFromGitConfig(repository);
        }
        return new GitProvenance(randomId(), getOrigin(repository), branch, changeset,
                getAutocrlf(repository), getEOF(repository),
                getCommitters(repository));
    }

I agree with you that we should do something to improve performance here, or add a cutoff past which we stop looking at committers, say a couple of months. Either way, thanks for calling this out!

@timtebeek changed the title from "GitProvenance is very slow" to "GitProvenance is very slow getting committers on a large Git history" on May 23, 2024
@timtebeek
Contributor

Figured I'd provide some more context: both the Maven plugin and the Gradle plugin call the same method in openrewrite/rewrite to determine the GitProvenance. We're considering a 90-day look-back cutoff on committer history, to limit how many commits are evaluated while still making some committer history available to recipes. We're not yet saying we'll definitely land on that, but I wanted to give you this update so you know we're working on it.
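
To make the look-back idea concrete, here is a minimal sketch of how such a cutoff could bound the work. It is not the OpenRewrite implementation and does not use JGit: the class `CommitterLookbackSketch`, the `Commit` stand-in, and the method `committersSince` are all hypothetical names invented for illustration. It assumes commits arrive newest-first, so the walk can stop at the first commit older than the window.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from OpenRewrite or JGit.
class CommitterLookbackSketch {

    /** Minimal stand-in for a commit: committer email plus commit time. */
    static final class Commit {
        final String email;
        final Instant when;

        Commit(String email, Instant when) {
            this.email = email;
            this.when = when;
        }
    }

    /**
     * Counts commits per committer, visiting commits in the given order and
     * stopping at the first commit older than the look-back window.
     * Assumes the list is ordered newest-first.
     */
    static Map<String, Integer> committersSince(List<Commit> newestFirst, Instant now, Duration lookBack) {
        Instant cutoff = now.minus(lookBack);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Commit c : newestFirst) {
            if (c.when.isBefore(cutoff)) {
                break; // under the ordering assumption, all remaining commits are older
            }
            counts.merge(c.email, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-05-23T00:00:00Z");
        List<Commit> history = new ArrayList<>();
        history.add(new Commit("a@example.com", now.minus(Duration.ofDays(1))));
        history.add(new Commit("b@example.com", now.minus(Duration.ofDays(30))));
        history.add(new Commit("a@example.com", now.minus(Duration.ofDays(89))));
        history.add(new Commit("c@example.com", now.minus(Duration.ofDays(200))));
        // Only the three commits inside the 90-day window are counted.
        System.out.println(committersSince(history, now, Duration.ofDays(90)));
    }
}
```

With a cutoff like this, the cost depends on the number of recent commits rather than on the full 67K-commit history. One caveat: commit timestamps in a real repository are not guaranteed to decrease strictly along the history, so an early `break` can miss some in-window commits; filtering every visited commit by time avoids that at the cost of a longer walk.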

Projects
Status: Backlog