Q21 — AWS SAP-C02 Ch.1

Question 21 of 75

Q96. A company is running an application in the AWS Cloud. The application collects and stores a large amount of unstructured data in an Amazon S3 bucket. The S3 bucket contains several terabytes of data and uses the S3 Standard storage class. The data increases in size by several gigabytes every day. The company needs to query and analyze the data. The company does not access data that is more than 1 year old. However, the company must retain all the data indefinitely for compliance reasons. Which solution will meet these requirements MOST cost-effectively?

Correct Answer: C. Use an AWS Glue Data Catalog and Amazon Athena to query the data. Create an S3 Lifecycle policy to transition data that is more than 1 year old to S3 Glacier Deep Archive.

Explanation

Option C is correct because it covers both the query requirement and the storage cost.

An AWS Glue Data Catalog stores metadata (table definitions and schemas) about the data in the S3 bucket, and Amazon Athena uses that catalog to run SQL queries directly against the objects in S3. Athena is serverless, so there is no cluster to provision or keep running, and the company pays based on the data its queries scan.

An S3 Lifecycle policy that transitions data more than 1 year old to S3 Glacier Deep Archive reduces storage costs. S3 Glacier Deep Archive is the lowest-cost S3 storage class and is designed for long-term archival. Athena does not read objects in Deep Archive unless they are restored first, but that is acceptable here because the company does not access data that is more than 1 year old. A transition moves objects rather than deleting them, so all data is still retained indefinitely for compliance.
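The lifecycle rule described in the explanation can be sketched as a configuration document. This is a minimal illustration, not a definitive implementation: the function name, the rule ID, and the bucket name in the comment are hypothetical, and the 365-day threshold is an assumed reading of "more than 1 year old". The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts. S3 counts `Days` from object creation.

```python
import json


def deep_archive_lifecycle(days: int = 365, prefix: str = "") -> dict:
    """Build a lifecycle configuration that moves objects to Deep Archive.

    The function name and rule ID are illustrative, not from the question.
    """
    return {
        "Rules": [
            {
                "ID": f"archive-after-{days}-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix applies to the whole bucket
                "Transitions": [
                    {"Days": days, "StorageClass": "DEEP_ARCHIVE"}
                ],
                # Deliberately no "Expiration" element: the data must be
                # retained indefinitely, so objects are moved, never deleted.
            }
        ]
    }


config = deep_archive_lifecycle()
print(json.dumps(config, indent=2))

# To apply it (needs boto3 and AWS credentials; the bucket name is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Leaving out an expiration action is the design choice that matters here: the rule lowers the storage cost of old data without ever removing it.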