We need a Flink application (jar) that will run inside the Kinesis Data Analytics environment. It needs to read JSON records from a Kinesis Data Stream, accumulate them for 5 minutes, and then write them to an S3 bucket.
It will be very similar to AWS Kinesis Firehose, writing the events in batches to S3 files in GZIP format.
The most important requirement, and the reason we can't use Firehose, is that the directory structure in the S3 bucket must be built from information that is inside the JSON record.
So the folder structure needs to be:
And eventType is at the ROOT of the JSON object:
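To make the bucketing requirement concrete, here is a minimal sketch of how the S3 key prefix could be derived from a record's root-level eventType field. This is only the path-derivation logic, not the full job: in a real Flink application this would live inside a custom BucketAssigner plugged into the file sink, and the record would be parsed with a proper JSON library such as Jackson rather than the naive regex used below. The field name eventType, the "unknown" fallback, and the hourly date layout are assumptions for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EventTypeBucketing {

    // Naive extraction of a root-level "eventType" string field.
    // A production job would parse the record with Jackson instead.
    private static final Pattern EVENT_TYPE =
            Pattern.compile("\"eventType\"\\s*:\\s*\"([^\"]+)\"");

    // Hourly partitioning layout under the eventType folder (assumed).
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/HH").withZone(ZoneOffset.UTC);

    /** Builds the S3 key prefix (bucket id) for one JSON record. */
    public static String bucketFor(String json, Instant processingTime) {
        Matcher m = EVENT_TYPE.matcher(json);
        String eventType = m.find() ? m.group(1) : "unknown";
        return eventType + "/" + HOURLY.format(processingTime);
    }

    public static void main(String[] args) {
        String record = "{\"eventType\":\"purchase\",\"amount\":42}";
        System.out.println(bucketFor(record, Instant.parse("2023-05-01T12:34:56Z")));
        // prints: purchase/2023/05/01/12
    }
}
```

The 5-minute accumulation itself would be handled separately by the sink's rolling policy, which closes and uploads a file once the configured interval elapses.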