Web application for information extraction and named entity recognition for PDF files (work-in-progress).

fer-aguirre/pdf-2-ner

Latest commit: 85882cf · Feb 9, 2023 (13 commits)

PDF 2 NER

A web application that converts scanned PDF files to text and applies Named Entity Recognition (NER) to extract entities from Spanish-language text.

Created by: Fer Aguirre
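
The two stages described above (scanned page to text, then text to entities) can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the project's actual code: `run_pipeline`, `PageResult`, `fake_ocr`, and `toy_ner` are hypothetical names, and the stand-in backends exist only so the sketch runs without any OCR or NLP dependency.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# An entity is a (text, label) pair, e.g. ("Madrid", "LOC").
Entity = Tuple[str, str]


@dataclass
class PageResult:
    page: int
    text: str
    entities: List[Entity]


def run_pipeline(
    pages: Iterable[object],
    ocr: Callable[[object], str],
    ner: Callable[[str], List[Entity]],
) -> List[PageResult]:
    """Run OCR on each page, then tag entities in the recognized text."""
    results = []
    for number, image in enumerate(pages, start=1):
        text = ocr(image).strip()
        # Skip NER for pages where OCR produced no text.
        results.append(PageResult(number, text, ner(text) if text else []))
    return results


# Hypothetical stand-in backends (assumptions, not the project's own).
def fake_ocr(image: object) -> str:
    """Pretend the 'image' already is its recognized text."""
    return str(image)


def toy_ner(text: str) -> List[Entity]:
    """Look up words in a tiny hard-coded gazetteer."""
    known = {"Madrid": "LOC", "Cervantes": "PER"}
    words = [word.strip(".,") for word in text.split()]
    return [(word, known[word]) for word in words if word in known]


if __name__ == "__main__":
    pages = ["Cervantes nació cerca de Madrid."]
    for result in run_pipeline(pages, fake_ocr, toy_ner):
        print(result.page, result.entities)
```

Passing the OCR and NER steps in as callables keeps the sketch independent of any particular library. In a real setup, the stand-ins could be replaced by an OCR engine configured for Spanish and a Spanish NER model; which ones this project uses is not stated here.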

Directory Structure

├── app.py
├── assets
│   └── pdfs
├── config.ini
├── config.ini.secret
├── data
│   ├── processed
│   └── raw
├── docs
│   ├── data-dictionary.md
│   ├── explore-data.md
│   ├── references
│   └── reports
├── LICENSE
├── notebooks
│   ├── 0.0-testing-nlp-models.ipynb
│   ├── 1.0-scraping-data.ipynb
│   └── 2.0-analyzing-data.ipynb
├── outputs
│   ├── figures
│   └── tables
├── pdf_2_ner
│   ├── data
│   ├── __init__.py
│   └── utils
├── Pipfile
├── Pipfile.lock
├── README.md
└── setup.py
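
The tree lists both `config.ini` and `config.ini.secret`. One plausible way to use such a pair, shown here as a sketch rather than the project's confirmed approach, is to read them in order with the standard-library `configparser`, so that values in the secret file override the defaults. The helper `load_config` and the section and key names (`paths`, `raw_dir`) are hypothetical.

```python
import configparser
from pathlib import Path
from typing import Iterable


def load_config(paths: Iterable[Path]) -> configparser.ConfigParser:
    """Read INI files in order; later files override earlier ones."""
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips files that do not exist.
    parser.read([str(path) for path in paths], encoding="utf-8")
    return parser


if __name__ == "__main__":
    config = load_config([Path("config.ini"), Path("config.ini.secret")])
    # "paths" and "raw_dir" are assumed names; the fallback is used
    # when the section or key is absent.
    raw_dir = config.get("paths", "raw_dir", fallback="data/raw")
    print(raw_dir)
```

Because missing files are skipped, the same call works whether or not a local `config.ini.secret` is present.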

License

This project is released under the MIT License.
