aws_kinesis_stream sink batching is not working #20575
Comments
I think there may be some, understandable, confusion here. Can you share more about your use case for putting multiple events into a single Kinesis record?
Hi! I'm not trying to batch them into a single Kinesis event. The line I added in the Vector source indicates that PutRecords only puts 1 Vector event instead of 500.
I think that the issue might be in this function, when using:
Oh I see, thanks for the additional detail @romkimchi1002! It does look like there is a bug here then; likely with the partition key, as you noted.
Thanks!
I hereby confirm that this problem also exists for the `aws_kinesis_firehose` sink, e.g. with this configuration:

```toml
[sinks.firehose_al]
type = "aws_kinesis_firehose"
inputs = ["input"]
stream_name = "firehose-stream-name"
encoding.codec = "json"
partition_key_field = "hostname" # Workaround for https://github.com/vectordotdev/vector/issues/20575
```
…ctordotdev#1407 Send batches to AWS Kinesis Data Streams and AWS Firehose independent of their partition keys. In both APIs, batches of events do not need to share the same partition key. This makes the protocol more efficient, since by default the partition key is a random key that is different for every event.
A note for the community
Problem
Records are not batched together before being sent to Kinesis; each record is sent to Kinesis individually.
Configuration
Version
vector 0.38.0 (aarch64-apple-darwin)
Debug Output
No response
Example Data
I tried debugging it myself and downloaded the Vector source.
I added a print line to this file:
vector/src/sinks/aws_kinesis/streams/record.rs
This line always returns 1, no matter what configuration I'm using.
After modifying the code, it seems that the partition_key is used for the batching. This is a bug, since the partition key should only be a field used by Kinesis for spreading data across shards.
Additional Context
No response
References
No response