Question

A banking company uses an application to collect large volumes of transactional data. The company uses Amazon Kinesis Data Streams for real-time analytics. The company’s application uses the PutRecord action to send data to Kinesis Data Streams.

A data engineer has observed network outages during certain times of day. The data engineer wants to configure exactly-once delivery for the entire processing pipeline.

Which solution will meet this requirement?

A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source.

B. Update the checkpoint configuration of the Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) data collection application to avoid duplicate processing of events.

C. Design the data source so events are not ingested into Kinesis Data Streams multiple times.

D. Stop using Kinesis Data Streams. Use Amazon EMR instead. Use Apache Flink and Apache Spark Streaming in Amazon EMR.

Q99 — AWS DEA-C01 Ch.1

Correct Answer: A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source.
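
To illustrate option A, here is a minimal Python sketch of deduplication by unique ID. It assumes the producer attaches an ID once, before the first send attempt, so that a retry after a network timeout carries the same ID. The names `record_id`, `make_record`, and `DedupingConsumer` are hypothetical and are not part of any AWS SDK. The in-memory set stands in for whatever durable store a real pipeline would use, and the Kinesis call itself is left out so the example runs on its own.

```python
import json
import uuid


def make_record(payload: dict) -> bytes:
    """Producer side: embed a unique ID once, before the first send attempt."""
    # `record_id` is a hypothetical field name chosen for this sketch.
    return json.dumps({"record_id": str(uuid.uuid4()), **payload}).encode("utf-8")


class DedupingConsumer:
    """Consumer side: process each record_id at most once."""

    def __init__(self):
        # In-memory set for illustration only; a real pipeline would need
        # a durable store that survives consumer restarts.
        self._seen = set()
        self.processed = []

    def handle(self, data: bytes) -> bool:
        record = json.loads(data)
        record_id = record["record_id"]
        if record_id in self._seen:
            return False  # duplicate from a producer retry or a consumer replay
        self._seen.add(record_id)
        self.processed.append(record)
        return True


def demo() -> list:
    consumer = DedupingConsumer()
    first = make_record({"account": "A-1", "amount": 25})
    second = make_record({"account": "B-2", "amount": 40})
    # Simulate an outage: the producer resends `first` because the
    # acknowledgement was lost, so the stream holds it twice.
    for data in (first, first, second):
        consumer.handle(data)
    return consumer.processed
```

Calling `demo()` delivers three records to the consumer but processes only two, because the resent copy carries the same `record_id` as the original. Keeping the ID stable across retries is the important design choice here: if the producer generated a fresh ID on every attempt, the consumer would have no way to detect the duplicate.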

Explanation