Wrong Last Modified Time of S3 Object #681

Open
YunanJeong opened this issue Sep 11, 2023 · 1 comment
Comments

@YunanJeong

Hi there,

Version

S3 Sink Connector: 10.5.0
Kafka: 3.5.0

Problem

I upload data from Kafka to S3 every 30 minutes.
However, as the amount of data increases, the last-modified time shown in S3 becomes wrong.
Specifically, the last-modified time is earlier than the actual upload time.
For example, in the listing below, the files stamped 21:23:54 and 21:24:29 contain all data from 21:00:00 to 21:29:59.

There are no issues with actual upload time and data integrity.
The only problem is the "last-modified time indicated in S3".

Does anybody know about this issue? Thank you.

...
2023-09-10 17:30:01   24067022 my-data+0+0012998208.json.gz
2023-09-10 17:30:01   24804019 my-data+1+0013021328.json.gz
2023-09-10 18:00:01   25148397 my-data+0+0013081184.json.gz
2023-09-10 18:00:01   25295342 my-data+1+0013105757.json.gz
...
2023-09-10 21:23:54   33226385 my-data+1+0013762488.json.gz
2023-09-10 21:24:29   32369427 my-data+0+0013733998.json.gz
...
@YunanJeong
Author

I found one more thing.
This problem occurs when the file is larger than about 26 MiB. The "last modified" time is recorded about 1 minute earlier for roughly every 1 MiB above 26 MiB.
My network bandwidth and computing power are sufficient. I suspect the cause is that the topic has too few partitions. But what are the criteria for choosing the number of partitions in terms of file size or transfer speed?
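
To make the size-versus-offset pattern described above easier to check, here is a minimal Python sketch that parses the rows from the listing in this issue and reports each file's size in MiB and how far its last-modified time is from the nearest 30-minute mark. It assumes the real upload happens at that 30-minute mark, which is only an approximation. The helper names (`nearest_half_hour`, `offsets`) are made up for this illustration and are not part of the connector or of any AWS tool.

```python
from datetime import datetime, timedelta

# Rows copied from the listing in this issue:
# last-modified date and time, size in bytes, object key.
LISTING = """\
2023-09-10 17:30:01   24067022 my-data+0+0012998208.json.gz
2023-09-10 17:30:01   24804019 my-data+1+0013021328.json.gz
2023-09-10 18:00:01   25148397 my-data+0+0013081184.json.gz
2023-09-10 18:00:01   25295342 my-data+1+0013105757.json.gz
2023-09-10 21:23:54   33226385 my-data+1+0013762488.json.gz
2023-09-10 21:24:29   32369427 my-data+0+0013733998.json.gz
"""

MIB = 1024 * 1024
INTERVAL_SECONDS = 30 * 60  # assumed upload schedule: every 30 minutes


def nearest_half_hour(ts):
    """Round a timestamp to the nearest 30-minute mark (hypothetical helper)."""
    midnight = datetime(ts.year, ts.month, ts.day)
    seconds = (ts - midnight).total_seconds()
    rounded = round(seconds / INTERVAL_SECONDS) * INTERVAL_SECONDS
    return midnight + timedelta(seconds=rounded)


def offsets(listing):
    """Return (key, size in MiB, offset in seconds) for each listing row.

    A negative offset means the last-modified time is earlier than the
    assumed upload time.
    """
    rows = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        date, time, size, key = line.split()
        ts = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        offset = int((ts - nearest_half_hour(ts)).total_seconds())
        rows.append((key, int(size) / MIB, offset))
    return rows


if __name__ == "__main__":
    for key, size_mib, offset in offsets(LISTING):
        print(f"{key}  {size_mib:6.2f} MiB  {offset / 60:+6.2f} min")
```

With the rows above, the four files under about 24.2 MiB land 1 second after the half-hour mark, while the two files of roughly 31.7 MiB and 30.9 MiB land about 6.1 and 5.5 minutes before it.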
