Question

A data engineer has a one-time task to read data from objects that are in Apache Parquet format in an Amazon S3 bucket. The data engineer needs to query only one column of the data.
Which solution will meet these requirements with the LEAST operational overhead?

Accepted Answer

B. Use S3 Select to write a SQL SELECT statement to retrieve the required column from the S3 objects.

Answer

A. Configure an AWS Lambda function to load data from the S3 bucket into a pandas DataFrame. Write a SQL SELECT statement on the DataFrame to query the required column.

Answer

C. Prepare an AWS Glue DataBrew project to consume the S3 objects and to query the required column.

Answer

D. Run an AWS Glue crawler on the S3 objects. Use a SQL SELECT statement in Amazon Athena to query the required column.

Q38 — AWS DEA-C01 Ch.1

Correct Answer: B. Use S3 Select to write a SQL SELECT statement to retrieve the required column from the S3 objects.

Explanation
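
S3 Select runs a SQL expression directly against a single S3 object, so a one-time read of one column needs no crawler, catalog, Lambda function, or DataBrew project. The sketch below shows roughly what that request could look like with boto3's `select_object_content` call. It assumes S3 Select is available in the account, that boto3 is installed and credentials are configured, and that the column name contains no double quotes. The helper names `build_select_request` and `select_column` are hypothetical and only for illustration.

```python
from typing import Any, Dict, Iterator


def build_select_request(bucket: str, key: str, column: str) -> Dict[str, Any]:
    """Build keyword arguments for an S3 Select query on one Parquet column.

    Hypothetical helper: it only assembles the request and makes no AWS call.
    """
    # Double quotes keep the column name case-sensitive in S3 Select SQL.
    expression = f'SELECT s."{column}" FROM S3Object s'
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Tell S3 Select the object is Parquet; results come back as JSON lines.
        "InputSerialization": {"Parquet": {}},
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
    }


def select_column(bucket: str, key: str, column: str) -> Iterator[bytes]:
    """Yield raw result chunks for a single column of one Parquet object."""
    # Imported lazily so the request builder can be used without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    response = s3.select_object_content(**build_select_request(bucket, key, column))
    # The response payload is an event stream; only Records events carry data.
    for event in response["Payload"]:
        if "Records" in event:
            yield event["Records"]["Payload"]
```

With placeholder names, a call such as `b"".join(select_column("example-bucket", "data/part-0000.parquet", "price"))` would return the selected column as newline-delimited JSON bytes. Each call targets one object, so reading several objects means repeating the call per key.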