-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
General
: Add science events to user data export
#8861
base: develop
Are you sure you want to change the base?
Conversation
Adds new DataExportScienceEventService. Ammends DataExportCreationService and ScienceEventRepository with necessary functionality
Tests if necessary CSV is present.
0d68c78
to
30bb32d
Compare
WalkthroughThe changes introduce a comprehensive system for exporting science events as part of user data. This includes repository and service modifications for handling science events, updates to export creation services for CSV generation, and scheduled tasks for data exports. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant DataExportCreationService
participant DataExportScienceEventService
participant ScienceEventRepository
participant CSVFile
User->>DataExportCreationService: Request data export
DataExportCreationService->>DataExportScienceEventService: Create science event export
DataExportScienceEventService->>ScienceEventRepository: findAllByIdentity(identity)
ScienceEventRepository-->>DataExportScienceEventService: Return ScienceEvents
DataExportScienceEventService->>CSVFile: Write ScienceEvents to CSV
CSVFile -->> User: Provide data export
Assessment against linked issues
Recent review detailsConfiguration used: .coderabbit.yaml Files selected for processing (1)
Additional context usedPath-based instructions (1)
Additional comments not posted (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
src/test/java/de/tum/in/www1/artemis/service/DataExportCreationServiceTest.java
Outdated
Show resolved
Hide resolved
src/test/java/de/tum/in/www1/artemis/service/DataExportCreationServiceTest.java
Outdated
Show resolved
Hide resolved
src/main/java/de/tum/in/www1/artemis/service/export/DataExportScienceEventService.java
Outdated
Show resolved
Hide resolved
src/test/java/de/tum/in/www1/artemis/exercise/ExerciseUtilService.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your test only test if there exists a science event file in the export, but it does not verify its content. Please also test that the events actually get correctly exported.
@@ -103,7 +103,7 @@ artemis: | |||
|
|||
scheduling: # these values are cron expressions. To generate one, you can use e.g. https://www.freeformatter.com/cron-expression-generator-quartz.html | |||
programming-exercises-cleanup-time: 0 0 3 * * * # every day at 3am | |||
data-export-creation-time: 0 0 4 * * * # every day at 4am | |||
data-export-creation-time: 0 * * ? * * # every minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder to change this after testing.
c3aa0a2
to
8723a0a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
src/test/java/de/tum/in/www1/artemis/service/DataExportCreationServiceTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code
There is no privacy policy deployed on the testservers. Therfore when trying to load the policy a not found error is thrown. The data export should still be available. |
src/test/java/de/tum/in/www1/artemis/science/ScienceUtilService.java
Outdated
Show resolved
Hide resolved
|
||
Duration d = Duration.between(e1.getTimestamp(), e2.getTimestamp()); | ||
System.out.println(d.getNano()); | ||
if (d.getNano() > 500 && d.getSeconds() >= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be simplified by using the metod d.toMillis()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working with millis is imo to broad of a timescale. During testing events are generated in a microsecond time frame. In case multiple scienceevents with the same type and resource occur in quick succession (wich i belive could also happen with real world data.) they would not be differentiated. Especially when rounding then leads to to big of a difference.
5efc998
to
24b8bd8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ArchTest for Logging is failing
24b8bd8
to
dc180cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
This reverts commit f1e27ce.
Sets the cronjob directly in the DataExportScheduleService. Needs to be reverted before merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
@@ -62,7 +62,7 @@ public DataExportScheduleService(DataExportRepository dataExportRepository, Data | |||
* Created will be all data exports that are in the state REQUESTED OR IN_CREATION | |||
* Deleted will be all data exports that have a creation date older than seven days | |||
*/ | |||
@Scheduled(cron = "${artemis.scheduling.data-export-creation-time: 0 0 4 * * *}") | |||
@Scheduled(cron = "0 * * ? * *") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest improvements in resource management and error handling.
The method uses an ExecutorService to manage concurrent tasks, which is a good practice. However, there are a few areas for improvement:
- Resource Management: The method should ensure that all resources, especially threads in the ExecutorService, are properly managed. For example, the executor should be shut down in a finally block to ensure it always happens, even if an exception is thrown.
- Error Handling: Exceptions thrown by the
createDataExport
method are not caught, which might lead to unhandled exceptions that could disrupt the scheduling. - Logging: More detailed logging could be beneficial, especially logging the exceptions, to provide better insights during troubleshooting.
- ExecutorService executor = Executors.newFixedThreadPool(10);
+ ExecutorService executor = Executors.newFixedThreadPool(10); // Consider configuring the thread pool size dynamically based on system properties.
try {
dataExportsToBeCreated.forEach(dataExport -> executor.execute(() -> createDataExport(dataExport, successfulDataExports)));
+ executor.shutdown();
+ if (!executor.awaitTermination(60, java.util.concurrent.TimeUnit.MINUTES)) {
+ log.warn("Executor did not terminate in the expected time.");
+ executor.shutdownNow();
+ }
+ } catch (Exception e) {
+ log.error("Error during data export creation.", e);
+ } finally {
+ if (!executor.isShutdown()) {
+ executor.shutdownNow();
+ }
}
- executor.shutdown();
- if (!executor.awaitTermination(60, java.util.concurrent.TimeUnit.MINUTES)) {
- log.info("Not all pending data exports could be created within 60 minutes.");
- executor.shutdownNow();
- }
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@Scheduled(cron = "0 * * ? * *") | |
ExecutorService executor = Executors.newFixedThreadPool(10); // Consider configuring the thread pool size dynamically based on system properties. | |
try { | |
dataExportsToBeCreated.forEach(dataExport -> executor.execute(() -> createDataExport(dataExport, successfulDataExports))); | |
executor.shutdown(); | |
if (!executor.awaitTermination(60, java.util.concurrent.TimeUnit.MINUTES)) { | |
log.warn("Executor did not terminate in the expected time."); | |
executor.shutdownNow(); | |
} | |
} catch (Exception e) { | |
log.error("Error during data export creation.", e); | |
} finally { | |
if (!executor.isShutdown()) { | |
executor.shutdownNow(); | |
} | |
} |
Checklist
General
Server
Motivation and Context
Due to GDPR regulations, all personal data has to be made available to the end user if requested. This was not the case for the science events collected.
Description
This PR adds a new
DataExportScienceEventService
responsible for exporting all science events associated with a user.All science events are exported as a single csv containing a timestamp, an event type and the associated resource.
Closes #8655
Steps for Testing
Prerequisites:
DataExportScheduleService
have to be commented out or removed.Data Export
Request Data Export
Request Data Export
Download data export
science_ecents.csv
coincide with the times noted down.Testserver States
Note
These badges show the state of the test servers.
Green = Currently available, Red = Currently locked
Click on the badges to get to the test servers.
Review Progress
Performance Review
Code Review
Manual Tests
Test Coverage
Summary by CodeRabbit
New Features
Bug Fixes
Tests
These updates improve data export capabilities and ensure timely data management for users.