Introduction

This project contains the source code and supporting files for a serverless application that automatically extracts text from scanned PDF files using AWS Textract.

Prerequisites

Complete the following before you start:

Set up an AWS account.
Install the AWS CLI.
Configure the AWS CLI with your user credentials.
Install the AWS SAM CLI.
Install jq (optional; used to pretty-print JSON responses).

Deployment

$ sam deploy --capabilities CAPABILITY_NAMED_IAM --guided

Test

After you upload a PDF file to the deployed S3 bucket, a text file is created automatically in the same bucket.
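
The upload step can also be scripted. The following is a minimal sketch, not part of this project: `upload_pdf` is a hypothetical helper, and storing the object under its file name is an assumption. It relies on the boto3 S3 client method `upload_file`, and the client is passed in so the helper itself has no dependency on credentials.

```python
from pathlib import Path


def upload_pdf(s3_client, bucket: str, pdf_path: str) -> str:
    """Upload a local PDF to the bucket and return the object key used."""
    path = Path(pdf_path)
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got {path.name!r}")
    key = path.name  # assumption: the object key is the plain file name
    s3_client.upload_file(str(path), bucket, key)
    return key


# Example usage (requires `pip install boto3` and configured credentials;
# the bucket name is a placeholder for the one created by your deployment):
#
#   import boto3
#   upload_pdf(boto3.client("s3"), "my-textract-bucket", "scan.pdf")
```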

You can also call the API Gateway endpoint at the path /textract to get a Textract result by job ID, for example:

$ curl -d '{"jobId":"xxxxx2bd5ad43875edxxxx5aee29b65f273fxxxxx"}' -H "Content-Type: application/json" https://xxxx.execute-api.ap-southeast-2.amazonaws.com/textract | jq '.'
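
The same call can be made from Python using only the standard library. This is a sketch under stated assumptions: the function names are hypothetical, the base URL is whatever your deployment reports, and the response is assumed to be JSON because the curl example pipes it to jq.

```python
import json
import urllib.request


def build_textract_request(base_url: str, job_id: str) -> urllib.request.Request:
    """Build the POST request for the /textract path with a JSON body."""
    body = json.dumps({"jobId": job_id}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/textract",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # curl -d also sends a POST
    )


def fetch_textract_result(base_url: str, job_id: str, timeout: float = 30.0):
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_textract_request(base_url, job_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body without touching the network.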

yai333/ImageTextExtractExample
