Issues with other templates on [Windows] #566

abhigyanasatpathy · 2024-08-26T18:59:36Z

Steps to add new template

To add a new template, we recommend this workflow:

1. Copy existing template to new file

Find a template that is roughly similar to what you need and copy it to
a new file. It's good practice to use reverse domain notation. E.g.
country.company.division.language.yml or
fr.mobile.enterprise.french.yml. Language is not always needed.
Template folders are searched recursively for files ending in .yml.
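To see which files such a recursive search would pick up, a small standard-library sketch can help. This is not invoice2data's own loader; `find_template_files` is a hypothetical helper, and the temporary folder and file names are only for illustration.

```python
import tempfile
from pathlib import Path


def find_template_files(folder):
    """Return all .yml files below folder (searched recursively), sorted by path."""
    return sorted(Path(folder).rglob("*.yml"))


# Build a throwaway folder with one nested template and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "in"
    nested.mkdir()
    (nested / "in.invoicedemo.yml").write_text("issuer: Demo\n")
    (nested / "notes.txt").write_text("not a template\n")
    found = [p.name for p in find_template_files(tmp)]

print(found)  # ['in.invoicedemo.yml']
```

Pointing the helper at your own template folder shows quickly whether the new file is in the place you expect and has the .yml extension.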

2. Change invoice issuer

The issuer is only used in the output. It's best to use the company name.

3. Set keyword

Look at the invoice and find the best identifying string. Tax number +
company name are good options. Remember, all keywords need to be found
for the template to be used.

Keywords are compared before processing the extracted text.
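The "all keywords need to be found" rule can be sketched as follows. This is an illustration only, not the library's actual code: `matches_all_keywords` is a hypothetical name, and treating each keyword as a regular expression is an assumption made here.

```python
import re


def matches_all_keywords(text, keywords):
    """Return True only if every keyword pattern occurs somewhere in the text."""
    return all(re.search(keyword, text) is not None for keyword in keywords)


sample_text = "ACME Ltd\nVAT No: GB123456789\nInvoice 42"

# Both keywords are present, so this template would be a candidate.
print(matches_all_keywords(sample_text, ["ACME Ltd", "GB123456789"]))  # True

# One keyword is missing, so the template would be skipped.
print(matches_all_keywords(sample_text, ["ACME Ltd", "Other Corp"]))  # False
```

A single keyword that does not appear in the extracted text is enough to rule the template out, so it is worth copying keywords directly from the debug output.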

4. First test run

Now we're ready to see how far we are off. Run invoice2data with the
following debug command to see if your keywords match and how much work
is needed for dates, etc.

invoice2data --template-folder tpl --debug invoice-XXX.pdf

This test run shows you how the program will "see" the text in the
invoice. Parsing PDFs is sometimes a bit unpredictable. Also make sure
your template is used. You should already receive some data from static
fields or currencies.

5. Add regular expressions

Now you can use the debugging text to add regex fields for the
information you need. It's a good idea to copy parts of the text
directly from the debug output and then replace the dynamic parts with
regex. Keep in mind that some characters need escaping. To test, re-run
the above command.

- date field: First capture the date. Then see if dateparser
  handles it correctly. If not, add your format or language under
  options.
- amount: Capture the number without the currency code. If you expect
  high amounts, replace the thousands separator. Currently we don't
  parse numbers via locales (TODO).
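As a rough illustration of the two hints above, the sketch below captures a date and an amount from one made-up line of debug text. The sample line and both patterns are invented for this example, and `datetime.strptime` is used here as a standard-library stand-in for dateparser.

```python
import re
from datetime import datetime

# A made-up line, as it might appear in the --debug text output.
line = "Invoice date: 26.08.2024   Total: EUR 1,234.56"

# Literal text is copied from the line; only the dynamic parts become regex groups.
date_match = re.search(r"Invoice date:\s+(\d{2}\.\d{2}\.\d{4})", line)
amount_match = re.search(r"Total:\s+[A-Z]{3}\s+([\d,]+\.\d{2})", line)

# Parse the captured date with an explicit format.
invoice_date = datetime.strptime(date_match.group(1), "%d.%m.%Y")

# Drop the thousands separator before converting the amount to a number.
amount = float(amount_match.group(1).replace(",", ""))

print(invoice_date.date(), amount)
```

Note that the dots in the date and the decimal point are escaped, since an unescaped `.` would match any character.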

6. Done

Now you're ready to commit and push your template, so others get a
chance to use and improve it.

My Question:
I have added a new template in YAML with regexes for my invoice, but when I parse the invoice PDF, it is not parsed and an error is shown.

Error message:
(invoice2data-env) D:\invoice2data-master\src\invoice2data>invoice2data --output-format csv --output-name output/invoices.csv input/demoinvoice.pdf
INFO:invoice2data.extract.loader: Loaded 189 templates from D:\invoice2data-master\invoice2data-env\Lib\site-packages\invoice2data\extract\templates
INFO:pikepdf._core: pikepdf C++ to Python logger bridge initialized
Scanning contents ---------------------------------------- 100% 1/1 0:00:00
WARNING:ocrmypdf._pipeline: This PDF is marked as a Tagged PDF. This often indicates that the PDF was generated from an office document and does not need OCR. PDF pages processed by OCRmyPDF may not be tagged correctly.
OCR ---------------------------------------- 0% 0/1 -:--:--
WARNING:ocrmypdf._pipeline: Weighted average image DPI is 152.1, max DPI is 247.7. The discrepancy may indicate a high detail region on this page, but could also indicate a problem with the input PDF file. Page image will be rendered at 400.0 DPI.
OCR ---------------------------------------- 100% 1/1 0:00:00
Linearizing ---------------------------------------- 100% 100/100 0:00:00
INFO:invoice2data.input.ocrmypdf: Text extraction made with ocrmypdf
ERROR:root: No template for input/demoinvoice.pdf

bosd · 2024-08-27T03:39:08Z

Hi,
Your steps for adding a template are correct.

Did you verify that your installation of invoice2data is running properly by testing it on one of the example files?

abhigyanasatpathy · 2024-08-27T11:00:28Z

Yes, it is running properly.
Thank you for your help.
By the way, can you please explain the process again?
I have created templates/myinvoice with in.myinvoice.yml inside it, and written regexes to match my PDF.
Is that enough to convert my PDF to CSV output?
Or is there any other step or code I need to add? Please explain it simply.
I have already run one of your existing templates, and it works fine.

bosd · 2024-08-27T12:11:25Z

Your invoked command seems OK.

Some debugging steps:
[x] Verify your installation and the parsing of a sample file.
[ ] Run with the --debug flag to check the output for the invoice-xx.pdf file.
This is likely the problem, as invoice2data tries to fall back on ocrmypdf, which is likely because it cannot detect characters with pdftotext.

Is your PDF file a text-based file, or does it need OCR?
[ ] Try your PDF with a different input parser: --input-reader= followed by pdftotext or ocrmypdf
[ ] Check your template for syntax errors
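To check outside invoice2data whether pdftotext itself returns any text, something along these lines could be used. It is a sketch under assumptions: the helper names are made up, it only looks for a `pdftotext` executable on the PATH, and it calls the tool with its `-layout` option and `-` to write to standard output.

```python
import shutil
import subprocess


def pdftotext_output(pdf_path):
    """Return the text pdftotext extracts, or None if the tool is missing or fails."""
    exe = shutil.which("pdftotext")
    if exe is None:
        return None  # pdftotext is not on the PATH
    result = subprocess.run([exe, "-layout", pdf_path, "-"], capture_output=True)
    if result.returncode != 0:
        return None
    return result.stdout.decode("utf-8", errors="replace")


def extraction_is_empty(text):
    """Treat missing or whitespace-only output (e.g. a lone form feed) as empty."""
    return text is None or not text.strip()


# A result that contains only a form feed counts as empty.
print(extraction_is_empty("\x0c"))        # True
print(extraction_is_empty("Invoice 42"))  # False
```

Calling `extraction_is_empty(pdftotext_output("input/demoinvoice.pdf"))` on a text-based PDF should give False. If it gives True, the problem is probably in the pdftotext / poppler setup rather than in the template.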

abhigyanasatpathy · 2024-08-27T20:00:03Z

My PDF file is a text-based file.
I have only created one file, in.invoicedemo.yml (path: D:\invoice2data-master\src\invoice2data\extract\templates\in\in.invoicedemo.yml), as step 1.
Should I proceed with only step 1, or are there other steps I should follow?
Are there any other steps where I need to write code or do anything else?

In the in.invoicedemo.yml file I have worked on the regular expressions and keywords according to my PDF.

bosd · 2024-08-28T05:56:39Z

When you run invoice2data on the pdf file with the --debug flag, do you see the contents of the file in your logger/terminal?

abhigyanasatpathy · 2024-08-28T07:28:09Z

No, I cannot see the contents of the file.
I can only see the PDF-to-text data in the logger (using the --debug flag).
But I cannot see any data in the CSV file.
I am getting this error in the logger:
♀
DEBUG:root: END pdftotext result =============================
DEBUG:invoice2data.extract.invoice_template: Template: au.com.opal.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: au.com.telstra.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.accor.invest.ibis.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.accor.invest.novotel.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.boucherie.pochet.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.cebeo.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.eg_retail.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.lampiris.facture-dacompte.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.lampiris.factuur.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.lampiris.regularisation.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.melchior-vins.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.proximus.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.scarlet.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: be.securex.social.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: ch.pcengines.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invo
.
.
.
DEBUG:invoice2data.extract.invoice_template: Template: pl.bmw-fs.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: pl.insert.subiekt-gt.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: pl.insert.subiekt-nexo.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: pl.orlen.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: pl.p4.yml | Failed to match all keywords.
DEBUG:invoice2data.extract.invoice_template: Template: pl.paypro.yml | Failed to match all keywords.
INFO:pikepdf._core: pikepdf C++ to Python logger bridge initialized
DEBUG:root: Text extraction failed, falling back to ocrmypdf
DEBUG:root: Text extraction failed, falling back to ocrmypdf
DEBUG:invoice2data.input.ocrmypdf: input_reader_config received from main are, *{}*
DEBUG:invoice2data.input.ocrmypdf: ocrmypdf config settings are: *{'redo_ocr': True, 'optimize': 0, 'output_type': 'pdf', 'fast_web_view': 0}*

WARNING:ocrmypdf._pipeline: This PDF is marked as a Tagged PDF. This often indicates that the PDF was generated from an office document and does not need OCR. PDF pages processed by OCRmyPDF may not be tagged correctly.
OCR ---------------------------------------- 0% 0/1 -:--:--
WARNING:ocrmypdf._pipeline: Weighted average image DPI is 152.1, max DPI is 247.7. The discrepancy may indicate a high detail region on this page, but could also indicate a problem with the input PDF file. Page image will be rendered at 400.0 DPI.
OCR ---------------------------------------- 100% 1/1 0:00:00
Linearizing ---------------------------------------- 100% 100/100 0:00:00
INFO:invoice2data.input.ocrmypdf: Text extraction made with ocrmypdf
DEBUG:

bosd · 2024-08-28T08:31:15Z

The result from pdftotext is empty.

So you're likely running into dependency issues with pdftotext / poppler-utils on Windows.
Currently Windows is not well supported or tested.

There is an open PR to enhance support, but its tests are failing.
#565

I'm a Linux user, so I cannot give you a lot of support on Windows.

abhigyanasatpathy · 2024-08-28T10:31:56Z

But the existing templates are working fine.
I am just not able to extract the data from my PDF.

There is one file:
path: D:\invoice2data-master\invoice2data-env\Lib\site-packages\invoice2data-0.4.5.dist-info\RECORD
Do I need to do anything with this file for new templates, or do I just need to create the templates?

bosd · 2024-08-28T11:24:51Z

Just creating the templates should be fine.

Let's check if the template you have created has been loaded.

Do you see your template in the list of loaded templates?

abhigyanasatpathy · 2024-08-28T19:37:47Z

What do you mean by loaded templates? -- D:\invoice2data-master\src\invoice2data\extract\templates\in\in.demovoice.yml -- this one I can see.

But I am not able to see it here:
DEBUG:←[0mroot: END pdftotext result =============================←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: au.com.opal.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: au.com.telstra.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.accor.invest.ibis.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.accor.invest.novotel.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.boucherie.pochet.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.cebeo.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.eg_retail.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.lampiris.facture-dacompte.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.lampiris.factuur.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.lampiris.regularisation.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.melchior-vins.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.proximus.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.scarlet.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: be.securex.social.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: ch.pcengines.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.AzureInterior.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.amazon.aws.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.apple.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.apps4rent.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.binarylife.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.bloomberg.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.cloudns.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.datadoghq.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.digitalocean.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.envato.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.expressvpn.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.expressvpn_prio6.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.ftserussell.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.github.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.globalsign.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.google.adwords.hk.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.hobohost.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.jamiepro.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.linode.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.microsoftonline.hk-v2017.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.microsoftonline.hk.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.mongodb.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.namecheap.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.namesilo.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.newrelic.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.nl.lenovo.digitalriver.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.nmmn.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.nodisto.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.nyse.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.oyo.invoice.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.packtpub.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.pixartprinting.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.sammymaystone.yml | Keywords matched. No exclude keywords found.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.scaleway.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.textmaster.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.tmx.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.travis-ci.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.twitter.de.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.twitter.uk.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.twitter.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.upwork.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.usersnap.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.amazon.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.bettina-kast.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.digikey.com.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.hosteurope.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.notebooksbilligerBillPay.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.ovh.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.qualityhosting.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: de.united-domains.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.pepephone.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: es.supplies24.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: co.mooncard.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.adobe.ie.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.akretion.fr.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.amazon.aws.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.ateliercopieservice.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.chauffeur-prive.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.coriolis.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.easyjet.fr.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.eaudugrandlyon.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.godaddy.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.google.ie.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.hootsuite.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.jeanbesson.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.ldlc.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.linkedin.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.mention.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.microsoft.ie.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.myflyingbox.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.officetimeline.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.orange-business.mobile.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.ovh.fr.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.rs-online.fr.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.saur.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.soyoustart.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: com.vinci-autoroutes.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: dolibarr.generique.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: eu.trainline.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.actn.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.airfrance.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.also.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.amazon.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.assurance-epargne-pension.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.bouyguestelecom.adsl-fiber.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.bouyguestelecom.mobile.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.butagaz.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.chronopost.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.dirafi.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.domaine-achat.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.easytrip.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.edf.entreprises.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.edf.pme.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.finagaz.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.fountain.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.free.adsl-fiber.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.free.mobile.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.free.mobile2.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.futur.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.ge-iroise.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.greffe-tc-lyon.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.hiscox.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.internetsatellite.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.jpg.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.kubii.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.laposte.boutique.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.laposte.coliposte.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.lecab.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.leroymerlin.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.maaf.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.mediapart.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.moneo-resto.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.mouser.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.mycelium-roulement.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.napsis.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.nexity.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.orange.fibre.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.orange.fixedline.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.prestaclic.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.publicationannoncelegale.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.sfr.adsl-fiber.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.sfr.mobile.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.sosh.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.teledec.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: fr.topoffice.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: net.online.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: net.scaleway.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.action.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.albron.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.anwb.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.be.coolblue.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.begra.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.blokker.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.bouwmans.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.bp.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.bunq.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.cpe.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.esso_eg_services.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.esso_eg_services_v2.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.farnell.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.ferbox.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.gamma.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.goos.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.gulf.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.ipparking.paleiskwartier.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.karwei.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.kav.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.koffiehenk.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.momentsenmore.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.ns.invoice.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.ok.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.parkmobile.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.praxis.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.reclameland.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.saeco.philips.eluscious.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.shell_nederland.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.shell_schellenkens.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.simpel.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.total_express.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.total_ototol.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.transip.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.tuynder.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.vistaprint.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.vodafone.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.wasco.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.weid.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.yezzer.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: nl.zinkunie.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: pl.bmw-fs.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: pl.insert.subiekt-gt.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: pl.insert.subiekt-nexo.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: pl.orlen.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: pl.p4.yml | Failed to match all keywords.←[0m
DEBUG:←[0minvoice2data.extract.invoice_template: Template: pl.paypro.yml | Failed to match all keywords.←[0m

Why?
As asked, I only created a .yml file with my regexes inside the template folder.

Is there anything else I need to do?

bosd · 2024-08-29T05:10:37Z

Why?

Because you need to check whether the template you created is properly loaded.

Check that you're pointing to the correct folder.
(You can disable the built-in templates with the following flag to reduce the noise: --exclude-built-in-templates)

You should see your template in that list.
If your template is correct, it should say that the keywords have matched,
followed by a "using template <your template file>" line.
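
Before reading the debug output, it can help to confirm that the folder you pass to --template-folder really contains your file. Below is a minimal sketch in plain Python (it does not use invoice2data itself). It assumes, as the template guide above states, that template folders are searched recursively for files ending in .yml. The function name `list_templates` is made up for this illustration.

```python
from pathlib import Path


def list_templates(folder):
    """Return every .yml file found recursively under `folder`, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        # A wrong or mistyped folder is the first thing to rule out.
        raise NotADirectoryError(f"Template folder not found: {root}")
    return sorted(p for p in root.rglob("*.yml") if p.is_file())
```

For example, `for p in list_templates("tpl"): print(p)` prints what is in the `tpl` folder used in the debug command. If your new template is not in that output, the folder argument is the problem, not the keywords.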

abhigyanasatpathy · 2024-08-29T07:18:52Z

Even after I deleted my templates, it is still parsing the existing PDF.
How is that possible?
I deactivated it and activated it again, though.

bosd · 2024-08-29T07:24:57Z

You have to verify if your template is being loaded.

Are you pointing to the correct folder?
Is your custom template loaded? Or does the debugger show that there is an error in your template?
Is your template selected? Do the keywords match?
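
With hundreds of built-in templates, finding your own file in the debug output by eye is tedious. Here is a rough sketch that scans captured debug text for lines of the form `Template: <file> | <status>`, as seen in the log earlier in this thread. It is plain Python and does not call invoice2data. The function name `template_statuses` is hypothetical, and the colour-code stripping assumes the escape sequences appear either raw or rendered as `←[0m`, as some Windows consoles show them.

```python
import re

# Terminal colour codes: a raw ESC character, or the "←" that some consoles print instead.
ANSI_ESCAPE = re.compile(r"(?:\x1b|←)\[[0-9;]*m")
# Matches e.g. "Template: nl.wasco.yml | Failed to match all keywords."
TEMPLATE_LINE = re.compile(r"Template: (?P<name>\S+\.yml) \| (?P<status>.+)$")


def template_statuses(debug_output):
    """Map each template file name in the debug text to its status message."""
    statuses = {}
    for raw in debug_output.splitlines():
        line = ANSI_ESCAPE.sub("", raw).strip()
        match = TEMPLATE_LINE.search(line)
        if match:
            statuses[match.group("name")] = match.group("status")
    return statuses
```

If your template's file name is missing from the returned dictionary, it was never evaluated, which points to a folder or loading problem. If it is present with a "Failed to match all keywords." status, the template was evaluated and the keywords need work.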

abhigyanasatpathy · 2024-08-29T13:39:55Z

Are you pointing to the correct folder? -- Yes.
Is your custom template loaded? Or does the debugger show that there is an error in your template? -- Yes, an error is showing.
Is your template selected? Do the keywords match? -- Yes, checking.

But I don't understand: when I deleted the existing templates as a test, it still works. How is that possible?
Where is it matching the keywords from? It should report that the .yml files are not available, but they still show up after I deleted them (as a test).

bosd · 2024-08-29T14:21:48Z

> But not able to understand when i deleted existing templates for my test purpose, still its working , so i have doubt how is it possible?

That sounds like a folder issue.

Maybe it is installed in different versions or locations.

What path is shown when you run
'pip show invoice2data'?

Is that the same location as where you were deleting the files?
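
You can run the same check from the Python interpreter you use to run invoice2data. The sketch below asks Python where it would import the package from and compares that with the folder you edited. It uses only the standard library. The helper names are invented for this example, and the `extract/templates` subfolder is an assumption that the installed package has the same layout as the source tree.

```python
import importlib.util
from pathlib import Path


def installed_templates_dir(package="invoice2data"):
    """Return <package dir>/extract/templates for the copy Python would import.

    Returns None when the package cannot be found by this interpreter.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__.py; its parent is the package dir.
    return Path(spec.origin).parent / "extract" / "templates"


def same_location(path_a, path_b):
    """True when both paths resolve to the same place on disk."""
    return Path(path_a).resolve() == Path(path_b).resolve()
```

Calling `same_location(installed_templates_dir(), r"<folder you edited>")` returning False would mean the files you deleted are not the ones being loaded.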

abhigyanasatpathy · 2024-08-29T19:06:17Z

abhigyanasatpathy · 2024-08-29T19:07:21Z

My template location path is :
D:\invoice2data-master\src\invoice2data\extract\templates
Is it okay?

bosd · 2024-08-29T19:55:35Z

No, because your standard templates are loaded from the directory in the screenshot.

For easy testing, go to that location and delete the standard templates there. Or add your own custom ones there.

bosd changed the title ~~Issues with other templates~~ Issues with other templates on [Windows] Aug 28, 2024
