Paperless-ai continuously making OpenAI requests? #31

Closed
kolle86 opened this issue Jan 5, 2025 · 28 comments

Comments

@kolle86

kolle86 commented Jan 5, 2025

I just noticed heavy use of my OpenAI API key, and only today. I updated to the latest image this morning. There are no new documents.
When I stop the container, the requests stop. Paperless-ai is the only application using the API key. How can this be related?
Screenshot_20250105-141419

@clusterzx
Owner

https://platform.openai.com/settings/organization/usage/activity

Can you share a screenshot of your activity dashboard? The gpt-4o-mini costs are low, so I don't see constant usage.

@kolle86
Author

kolle86 commented Jan 5, 2025

I just generated a new key and started the container with that key. 26 requests. I stopped the container and the requests stopped.

Screenshot_20250105-153155

In the latest commits I see a change from gpt-4o-mini to gpt-4. Can this be related?

@clusterzx
Owner

Oh true, I see I defined gpt-4 instead of gpt-4o-mini. That will be fixed later.

So let me break down where and when requests are sent to the OpenAI API.

  1. On every start of the container, it performs a check to verify that the saved settings and everything else work.
  2. During the setup process itself, it also sends a request to check whether the API connection works.
  3. For every document that is processed, one request is sent.

That's about it. Maybe your container restarts all the time (because of an error)?

@clusterzx
Owner

image
This is the actual usage of my OpenAI API right now. As you can see, the maximum I got in a day was 276 requests.
So 26 is not that much tbh.

@kolle86
Author

kolle86 commented Jan 5, 2025

I let the container run again for about 5 minutes and it made over 10 requests without processing any documents.
The container is not restarting because of errors.

@clusterzx
Owner

What is your SCAN_INTERVAL setting in the .env file in the /app/data folder?

@kolle86
Author

kolle86 commented Jan 5, 2025

Screenshot_20250105-163825

Default

@clusterzx
Owner

How many tokens have been used since then for gpt-4o-mini?

@kolle86
Author

kolle86 commented Jan 5, 2025

Since the update to the latest image, there is only gpt-4 usage and no mini usage.

Screenshot_20250105-170136

@kolle86
Author

kolle86 commented Jan 5, 2025

This is from yesterday. I don't get it. Is gpt-4 so much more expensive than mini? The costs were nothing compared to today's.

Screenshot_20250105-171043

@clusterzx
Owner

I really cannot reproduce this behaviour. Also, you are the first one with that issue. Really strange...

@MagnusHL

MagnusHL commented Jan 5, 2025

Maybe that many documents?

@ChiliChonka

I had the same issue, except that I realised it isn't using mini anymore. And yes, it looks like gpt-4o is much more expensive.

By the way, a good hint for the setup documentation would be to check the prices, since different models cost different amounts.
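
To make the price difference concrete, here is a minimal sketch of a per-request cost estimate. The function name `estimateCostUsd`, the token counts in the usage note, and the prices in the table are illustrative assumptions, not values taken from paperless-ai. Check the provider's pricing page for current numbers.

```javascript
// Assumed prices in USD per one million tokens. These are illustrative
// placeholders only; replace them with the provider's current prices.
const PRICES_PER_MILLION_TOKENS = {
  "gpt-4": { input: 30.0, output: 60.0 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
};

// Estimate the total cost of `requests` identical requests for a model.
function estimateCostUsd(model, inputTokens, outputTokens, requests = 1) {
  const price = PRICES_PER_MILLION_TOKENS[model];
  if (!price) {
    throw new Error(`No price configured for model: ${model}`);
  }
  const perRequest =
    (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
  return perRequest * requests;
}
```

Under these assumed prices, calling `estimateCostUsd("gpt-4", 1000, 200, 100)` and `estimateCostUsd("gpt-4o-mini", 1000, 200, 100)` shows a difference of more than two orders of magnitude for the same traffic.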

@clusterzx
Owner

@ChiliChonka but do you also see it constantly sending requests to OpenAI?
I mean, yeah, gpt-4 is expensive compared to 4o-mini. But that's another topic.

@clusterzx
Owner

clusterzx commented Jan 5, 2025

Going by the program code, it cannot do this unless new documents arrive constantly or the container restarts every second.

To repeat, these are the places where it calls the API:

  1. On every start of the container, it performs a check to verify that the saved settings and everything else work.
  2. During the setup process itself, it also sends a request to check whether the API connection works.
  3. For every document that is processed, one request is sent.

@ChiliChonka

> @ChiliChonka but do you also see it constantly sending requests to OpenAI? I mean, yeah, gpt-4 is expensive compared to 4o-mini. But that's another topic.

I have multiple requests, yes. But I cannot say exactly what the cause is, since my health check fails many times and the container restarts.

I will investigate as soon as I have a bit more time. Currently cooking for my family. Will be back in 4 hours.

@kolle86
Author

kolle86 commented Jan 5, 2025

I think I finally got it. It's the health check.
I disabled the health check in my compose file and started the container: I got two initial API requests (as you explained in your comments) and no requests followed.
I enabled the health check again and got frequent API requests again (I guess every 30 seconds, i.e. on every health check).

Does that make sense? Is there any possibility in the code that a request is sent during the health check?

Edit: I looked at setupService.js myself and, if I read the code correctly, the test request is sent to OpenAI on every health check. Also, it's sent with model: "gpt-4", which explains why I got so many gpt-4 requests (and not gpt-4o-mini).

My screenshot from yesterday shows 2000+ requests with gpt-4o-mini because in the previous image the test was sent with gpt-4o-mini (before this commit).
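
As a rough sanity check of these numbers, the sketch below estimates how many requests a periodic health check would cause per day, assuming each check triggers exactly one API request. The helper `requestsPerDay` is hypothetical and not part of paperless-ai.

```javascript
// Estimate the number of API requests caused per day by a periodic check.
// Assumption: every check triggers `requestsPerCheck` API requests.
function requestsPerDay(intervalSeconds, requestsPerCheck = 1) {
  if (!(intervalSeconds > 0)) {
    throw new RangeError("intervalSeconds must be a positive number");
  }
  const secondsPerDay = 24 * 60 * 60;
  return Math.floor(secondsPerDay / intervalSeconds) * requestsPerCheck;
}
```

With a 30-second interval, `requestsPerDay(30)` gives 2880 requests per day, which is in the same range as the 2000+ requests seen in the screenshot.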

@ChiliChonka

Docker itself is checking the health state of your container.

I thought about the health check again, since it's checking /health.

I would say that if it is in setup mode, the container is healthy and shouldn't fail the health check. That is one thing.

I'm not sure whether it also reports an unhealthy state if the Paperless-ngx API is not responding correctly, which could lead to those issues as well. Does it also check Paperless-ngx?

@kolle86
Author

kolle86 commented Jan 5, 2025

> Docker itself is checking the health state of your container.

I'm talking about this health check:

#    healthcheck:
#      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]

It's an endpoint of the application that checks OpenAI requests, whether the Paperless API responds, and so on...

@clusterzx
Owner

Yeah, that makes sense, I already discovered that. I'll push an update shortly.
I also fixed many other little bugs I discovered along the way while reproducing your issue 😆 @kolle86

Contrary to my expectation, there really are more OpenAI requests than I thought. There is a request every time the cron routine fires. That is not much of a problem unless you run the cron timer every minute, but it is unnecessary all the same, even at an interval of one hour, for example.
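
One common way to avoid a paid API call on every health check or cron tick is to cache the result of the connection test for a while. The following is only a sketch of that idea and not the change made in paperless-ai; `cacheAsyncCheck` and its parameters are hypothetical names.

```javascript
// Wrap an expensive async check so that it runs at most once per `ttlMs`
// milliseconds. Calls within that window reuse the last successful result.
// `now` is injectable so the behaviour can be tested without real waiting.
function cacheAsyncCheck(checkFn, ttlMs, now = () => Date.now()) {
  let lastResult;
  let lastRunAt = -Infinity;
  let inFlight = null;

  return async function cachedCheck() {
    // Serve the cached result while it is still fresh.
    if (now() - lastRunAt < ttlMs) return lastResult;
    // Share a check that is already running instead of starting another one.
    if (inFlight) return inFlight;

    inFlight = (async () => {
      try {
        lastResult = await checkFn();
        lastRunAt = now();
        return lastResult;
      } finally {
        // A failed check is not cached, so the next call retries it.
        inFlight = null;
      }
    })();
    return inFlight;
  };
}
```

A health endpoint could then call the wrapped function instead of the raw connection test, for example with a TTL of one hour, so that frequent health checks no longer translate into frequent API requests.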

@ChiliChonka

ChiliChonka commented Jan 5, 2025

That's what Docker is using to check the health state.

If I didn't get it wrong, it checks several things in order to tell the healthcheck whether everything works as expected.

We're talking about the same stuff...

@ChiliChonka

@clusterzx what is your idea/solution?

@clusterzx
Owner

Should be fixed with 3ef1bca

@ChiliChonka

Am I seeing it right that you removed the healthcheck for now? Any plans to put it back at some point with another implementation of the health check?

@clusterzx
Owner

Yeah, but only temporarily. The main focus was to get rid of these other issues. The healthcheck will be reimplemented soon 😅

@ChiliChonka

That's legit. You're doing a great job.

@clusterzx
Owner

Thank you, doing what I can.
And also sorry for putting the brakes on your work, or rather throwing you under the bus with that (probably). 😂

@clusterzx clusterzx self-assigned this Jan 5, 2025
@clusterzx clusterzx added the done Issue closed with success label Jan 5, 2025
@ChiliChonka

> Thank you, doing what I can. And also sorry for putting the brakes on your work, or rather throwing you under the bus with that (probably). 😂

Don't worry about that. As I've mentioned, I'm quite far along with the first implementation.

clusterzx added a commit that referenced this issue Jan 6, 2025
- First Startup - AI-Requests and Fetching? #32
- Assign title and correspondent in manual mode #36
- Paperless-ai making continuouesly Openai requests ? #31
- Completely reworked the dashboard
- Added settings page
- Added new manual mode
- Defined a template for later use
clusterzx added a commit that referenced this issue Jan 8, 2025
Addressing Fixes and new Features:

Fixes:
#66
#61
#58
#55
#53
#45
#59
#52
#49
#31
#37
#52

Added:
- Big New Feature: Playground
	- Try your prompts on your documents and see how they perform. In Playground no data will be updated in Paperless.
- Added Code and Markdown interpretation in Chat Mode.
- Chat Mode now works with Ollama