Q16 — AWS SAP-C02 Ch.1


Q91. A company has migrated its forms-processing application to AWS. When users interact with the application, they upload scanned forms as files through a web application. A database stores user metadata and references to files that are stored in Amazon S3. The web application runs on Amazon EC2 instances and an Amazon RDS for PostgreSQL database.

When forms are uploaded, the application sends notifications to a team through Amazon Simple Notification Service (Amazon SNS). A team member then logs in and processes each form. The team member performs data validation on the form and extracts relevant data before entering the information into another system that uses an API.

A solutions architect needs to automate the manual processing of the forms. The solution must provide accurate form extraction, minimize time to market, and minimize long-term operational overhead.

Which solution will meet these requirements?

Correct Answer: D. Extend the system with an application tier that uses AWS Step Functions and AWS Lambda. Configure this tier to use Amazon Textract and Amazon Comprehend to perform optical character recognition (OCR) on the forms when forms are uploaded. Store the output in Amazon S3. Parse this output by extracting the data that is required within the application tier. Submit the data to the target system's API

Explanation

Option D extends the system with an application tier built on AWS Step Functions and AWS Lambda, which automates the manual form processing without servers to manage. Within that tier, Amazon Textract performs the OCR and form extraction, and Amazon Comprehend applies natural language processing to the extracted text to identify the relevant data. Because both are managed services, the solution provides accurate extraction while keeping development effort and long-term operational overhead low.

The output of the OCR and data extraction is stored in Amazon S3, which provides scalable storage for the processed forms. The application tier then parses that output to extract the required information and submits it to the target system's API.

Option A is incorrect because developing custom OCR libraries and deploying them to an Amazon EKS cluster would require significant development effort and ongoing operational overhead. It does not leverage existing AWS services for OCR and data extraction.

Option B is incorrect because training and hosting AI/ML models on an EC2 instance for OCR would require significant development effort and ongoing maintenance. It does not use the specialized OCR capabilities that Amazon Textract provides.

Option C is incorrect because calling AI/ML models hosted in Amazon SageMaker for OCR would introduce additional complexity and operational overhead. SageMaker is unnecessary for OCR when Amazon Textract already provides that capability.
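
As an illustration of the "parse this output" step in option D, the following is a minimal Python sketch of what a Lambda function in the application tier might do. It assumes the forms are images or single-page documents suitable for the synchronous Textract `AnalyzeDocument` call with the `FORMS` feature. The function names `extract_form_fields` and `analyze_form` are hypothetical, as are the bucket and key arguments; the question does not prescribe any of them.

```python
from typing import Any, Dict, List


def _text_of(block: Dict[str, Any], by_id: Dict[str, Dict[str, Any]]) -> str:
    """Join the text of the WORD blocks that are CHILD relations of a block."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id.get(child_id, {})
            if child.get("BlockType") == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_form_fields(response: Dict[str, Any]) -> Dict[str, str]:
    """Turn a Textract AnalyzeDocument (FORMS) response into {key: value}."""
    blocks = response.get("Blocks", [])
    by_id = {b["Id"]: b for b in blocks}
    fields: Dict[str, str] = {}
    for block in blocks:
        # Only KEY_VALUE_SET blocks tagged as KEY start a form field.
        if block.get("BlockType") != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        value_parts: List[str] = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                # Follow the link from the KEY block to its VALUE block(s).
                for value_id in rel["Ids"]:
                    if value_id in by_id:
                        value_parts.append(_text_of(by_id[value_id], by_id))
        fields[_text_of(block, by_id)] = " ".join(p for p in value_parts if p)
    return fields


def analyze_form(bucket: str, key: str) -> Dict[str, str]:
    """Hypothetical Lambda helper: run Textract on an S3 object and parse it."""
    import boto3  # imported lazily so the parser can be used without AWS access

    textract = boto3.client("textract")
    response = textract.analyze_document(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["FORMS"],
    )
    return extract_form_fields(response)
```

In a Step Functions workflow, a later state could pass the returned dictionary to Amazon Comprehend for further analysis or to a Lambda function that calls the target system's API. Multi-page documents would more likely use the asynchronous Textract operations, which this sketch does not cover.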