Q24 — AWS SAA-C03 Ch.5
Q324. A company uses a legacy application to produce data in CSV format. The legacy application stores the output data in Amazon S3. The company is deploying a new commercial off-the-shelf (COTS) application that can perform complex SQL queries to analyze data that is stored in Amazon Redshift and Amazon S3 only. However, the COTS application cannot process the .csv files that the legacy application produces. The company cannot update the legacy application to produce data in another format. The company needs to implement a solution so that the COTS application can use the data that the legacy application produces. Which solution will meet these requirements with the LEAST operational overhead?
- A. Create an AWS Glue extract, transform, and load (ETL) job that runs on a schedule. Configure the ETL job to process the .csv files and store the processed data in Amazon Redshift ✓
- B. Develop a Python script that runs on Amazon EC2 instances to convert the .csv files to .sql files. Invoke the Python script on a cron schedule to store the output files in Amazon S3
- C. Create an AWS Lambda function and an Amazon DynamoDB table. Use an S3 event to invoke the Lambda function. Configure the Lambda function to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in the DynamoDB table
- D. Use Amazon EventBridge to launch an Amazon EMR cluster on a weekly schedule. Configure the EMR cluster to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in an Amazon Redshift table
Correct Answer: A. Create an AWS Glue extract, transform, and load (ETL) job that runs on a schedule. Configure the ETL job to process the .csv files and store the processed data in Amazon Redshift
Explanation
AWS Glue is a fully managed, serverless extract, transform, and load (ETL) service. A Glue job can run on a schedule, process the .csv files, and store the processed data in Amazon Redshift, which the COTS application can query. The conversion is automated, and there is no infrastructure to provision or manage. Option B runs a Python script on Amazon EC2 instances, which adds the operational overhead of managing the instances, deploying the code, and maintaining it over time. Option C stores the processed data in DynamoDB, but the COTS application can analyze data only in Amazon Redshift and Amazon S3; in addition, Lambda's 15-minute execution limit makes it a poor fit for large ETL workloads. Option D also loads the data into Redshift, but launching and configuring an EMR cluster involves more cost and maintenance than a Glue job.
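To make the "runs on a schedule" part of option A concrete, here is a minimal sketch of how a scheduled trigger for a Glue job might be defined with boto3. The job name, trigger name, and cron expression are hypothetical, and the sketch assumes the Glue job itself (the script that reads the .csv files and writes to Redshift) already exists. The parameter names follow the Glue `CreateTrigger` API.

```python
def build_schedule_trigger(job_name, trigger_name, cron_expr):
    """Build keyword arguments for the Glue CreateTrigger API call.

    cron_expr uses the six-field Glue/EventBridge cron syntax:
    minutes hours day-of-month month day-of-week year.
    """
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        "Schedule": f"cron({cron_expr})",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def create_trigger(params):
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without AWS access

    glue = boto3.client("glue")
    return glue.create_trigger(**params)


if __name__ == "__main__":
    # Hypothetical names; run the existing job daily at 02:00 UTC.
    params = build_schedule_trigger(
        job_name="csv-to-redshift",
        trigger_name="csv-to-redshift-nightly",
        cron_expr="0 2 * * ? *",
    )
    print(params)
    # create_trigger(params)  # uncomment to call AWS
```

Once the trigger exists, Glue starts the job on each scheduled run without any servers or clusters to maintain, which is the reason option A carries the least operational overhead.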