Processing not finishing / updating the document in paperless-ngx #49

Closed
joerg-hermanns opened this issue Jan 7, 2025 · 7 comments
Labels: done (Issue closed with success), investigating (Investigating an issue)

Comments

@joerg-hermanns

Describe the bug
I managed to set up your project without problems, as far as I can tell.
I can access the web GUI and have linked it to my paperless-ngx and OpenAI accounts.
Access to paperless seems to be fine, as I can see the document count, tags and so on.
Access to OpenAI seems to be fine, as the "manual" processing (almost) works.

When processing the documents, paperless-ai finds documents to be processed and sends them to OpenAI.
Processing seems to be fine, but nothing happens in paperless-ngx - the documents are still unmodified after processing.

To Reproduce

1. Consume a new document in paperless
2. Add the tag "openai" to the document
3. Wait for paperless-ai to scan and process documents
4. Refresh the paperless-ngx GUI
5. Nothing has changed there

Manual classification seems to work fine: everything gets detected.
But when saving back to paperless-ngx, the correspondent does not seem to be saved correctly.
It always shows "Private" there ...

Expected behavior

Background processing should work and write changes back to paperless-ngx.

Desktop (please complete the following information):

  • OS: Windows 11 on client / UNRAID (docker) on server
  • Browser: Edge
  • Version: latest

Additional context

Screenshot 2025-01-07 132710
Screenshot 2025-01-07 132336
![Screenshot 2025-01-07 132209](https://github.com/user-attachments/assets/8210ae13-6faf-4fe8-a74c-5596d99b70cd)
Screenshot 2025-01-07 132121

Refreshing tag cache...
Tag cache refreshed. Found 15 tags.
Filtering documents for tags: [ 'openai' ]
Fetched page 1, got 5 matching documents. Total so far: 5
Fetched page 2, got 0 matching documents. Total so far: 5
Fetched page 3, got 0 matching documents. Total so far: 5
Fetched page 4, got 0 matching documents. Total so far: 5
Fetched page 5, got 0 matching documents. Total so far: 5
Fetched page 6, got 0 matching documents. Total so far: 5
Fetched page 7, got 0 matching documents. Total so far: 5
Fetched page 8, got 1 matching documents. Total so far: 6
Finished filtering. Found 6 documents matching the predefined tags.
Processing new document: ecodms_docid_0002760_revision_0001
[DEBUG] [07.01.25, 13:06] OpenAI request sent
No tags provided to processTags
[DEBUG] [07.01.25, 13:06] Used tokens: 238, Total tokens: 1175
Current tags for document 772: [ 16, 6 ]
Adding new tags: []
Combined tags: [ 16, 6 ]
Updated document 772 with: {
  tags: [ 16, 6 ],
  title: 'ecodms_docid_0002760_revision_0001',
  created: '2023-08-02T22:00:00.000Z'
}
Document 772 added to processed_documents
Current config TAGS: [ 'openai' ]
Current config PROMPT_TAGS: []
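
Reading the log above, the update sent for document 772 carries the same tag IDs the document already had ("Adding new tags: []"), and the title matches the name shown in the "Processing new document" line. If the stored title and creation date were already those values, the write-back would change nothing, which would match what shows up in paperless-ngx. The sketch below is illustrative Python, not paperless-ai's actual code: `merge_tags` and `changed_fields` are hypothetical helpers, and the `before` state is an assumption because the log does not show the document as stored.

```python
from typing import Any


def merge_tags(current: list[int], new: list[int]) -> list[int]:
    """Union of tag IDs, keeping the existing order and dropping duplicates."""
    merged = list(current)
    for tag_id in new:
        if tag_id not in merged:
            merged.append(tag_id)
    return merged


def changed_fields(before: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Return only those fields of `update` whose value differs from `before`."""
    return {key: value for key, value in update.items() if before.get(key) != value}


# ASSUMPTION: the stored document already looks like this; the log does not say.
before = {
    "tags": [16, 6],
    "title": "ecodms_docid_0002760_revision_0001",
    "created": "2023-08-02T22:00:00.000Z",
}

# Values copied from the "Updated document 772 with" log entry.
update = {
    "tags": merge_tags([16, 6], []),  # "Adding new tags: []"
    "title": "ecodms_docid_0002760_revision_0001",
    "created": "2023-08-02T22:00:00.000Z",
}

print(changed_fields(before, update))  # prints {}
```

Under that assumption the diff is empty, so the interesting question becomes why the AI response yielded no tags, title or correspondent ("No tags provided to processTags"), rather than why the write failed.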

@yaakovfeldman

yaakovfeldman commented Jan 7, 2025

I have the same problem. In settings, 'Process only specific pre tagged documents?' is Yes and the tag is 'inbox'. Also, 'Add AI-processed tag to documents?' is Yes and the tag is 'ai-processed'.

I tried adding the inbox tag to a document as a test. When the cron job runs it picks it up (as I see in logs). But it doesn't seem to do anything to it: when I switch back to paperless-ngx I see no changes and the ai-processed tag has not been added.

I know the connection to OpenAI is fine as I see the response in the logs.

Thank you for creating this! And I'm happy to help debug.

@clusterzx
Owner

One possibility is that the tag you are using was created by or for another user, so there could be a permission problem.
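
A rough way to check this from outside the tool is to list the tags through the paperless-ngx REST API and look at who owns them. This is only a sketch: it assumes each tag object carries an `owner` field holding a user ID (or null), and `fetch_tags` and `tags_not_owned_by` are hypothetical helper names, not part of paperless-ai or paperless-ngx.

```python
import json
import urllib.request


def fetch_tags(base_url: str, token: str) -> list[dict]:
    """Fetch the first page of tags from a paperless-ngx instance."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/tags/",
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]


def tags_not_owned_by(tags: list[dict], user_id: int) -> list[dict]:
    """Tags whose owner is set and differs from `user_id`.

    ASSUMPTION: a missing or null `owner` means the tag is not restricted.
    """
    return [tag for tag in tags if tag.get("owner") not in (None, user_id)]
```

Calling `tags_not_owned_by(fetch_tags(base_url, token), user_id)` with your own instance URL, API token and user ID should return an empty list if every tag is either unowned or owned by the token's user.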

@yaakovfeldman

I only really have one user, the default Paperless one, and I am using that user's API token.

Should I manually create a tag called ai-processed? It doesn't exist yet.

@joerg-hermanns
Author

I also have only one user, and the API token was generated for that user.
The "openai" tag was created manually in paperless; the "ai-processed" tag was actually created by paperless-ai.
Is there anything we can check here to provide more information for you?

@clusterzx
Owner

No, then you both should be good to go (normally).
I have to apologize right now: I am trying to fix so many little bugs that have popped up in the last 48 hours :D
Rest assured, I am keeping an eye on your issue and will get back to you.

Meanwhile, you could test around further; maybe you will discover why this happens.

@joerg-hermanns
Author

I removed the tagging of processed documents (with "ai-processed"), but that did not change anything.
Actually, I am a bit hesitant to remove the input filter tag ("openai") because I fear that my already well-tagged documents might then be incorrectly modified. Can I somehow limit the tool to ONLY process newly added documents without using the tag filtering?
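
To make the idea concrete, here is a sketch of what "only newly added documents" could mean when done by hand against the paperless-ngx API. It assumes each document object has an `added` timestamp in ISO 8601 form. `added_after` is a hypothetical helper and not an existing paperless-ai option.

```python
from datetime import datetime, timezone


def added_after(documents: list[dict], cutoff: datetime) -> list[dict]:
    """Keep documents whose `added` timestamp is later than `cutoff`.

    `cutoff` must be timezone-aware so it can be compared with the parsed values.
    """
    return [
        doc
        for doc in documents
        # Accept a trailing "Z" as UTC, which older fromisoformat() versions reject.
        if datetime.fromisoformat(doc["added"].replace("Z", "+00:00")) > cutoff
    ]


# Made-up sample data, only to show the call shape.
sample_documents = [
    {"id": 1, "added": "2025-01-06T10:00:00+00:00"},
    {"id": 2, "added": "2025-01-07T12:00:00Z"},
]
cutoff = datetime(2025, 1, 7, tzinfo=timezone.utc)

print([doc["id"] for doc in added_after(sample_documents, cutoff)])  # prints [2]
```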

@clusterzx
Owner

@joerg-hermanns Ah, in your case I would not play around, as we cannot predict what will happen in your edge case.
A good way to experiment would be to quickly set up a second paperless-ngx instance with your data and play around as much as you want.

I have done that so many times in the last few days :D

clusterzx self-assigned this Jan 7, 2025
clusterzx added the "investigating" (Investigating an issue) label Jan 7, 2025
clusterzx added a commit that referenced this issue Jan 8, 2025
Addressing Fixes and new Features:

Fixes:
#66
#61
#58
#55
#53
#45
#59
#52
#49
#31
#37
#52

Added:
- Big New Feature: Playground
	- Try your prompts on your documents and see how they perform. In Playground no data will be updated in Paperless.
- Added Code and Markdown interpretation in Chat Mode.
- Chat Mode now works with Ollama
clusterzx added the "done" (Issue closed with success) label Jan 10, 2025