Large cocina object won't process #628

Open
jcoyne opened this issue Aug 18, 2023 · 3 comments

Comments

@jcoyne
Contributor

jcoyne commented Aug 18, 2023

The item fv412tg4309 has 20,001 file sets. It looks like the controller created a PURL, but the Kafka message was never processed by PurlUpdatesConsumer. It's possible the message was too large for Kafka and was rejected. As a result, the object never gets indexed for SearchWorks, although the PURL page does display.

The JSON is 15 MB:

ls -lah  /purl/document_cache/fv/412/tg/4309/cocina.json
-rw-rw-r-- 1 lyberadmin lyberteam 15M Aug 18 08:31 /purl/document_cache/fv/412/tg/4309/cocina.json
@thatbudakguy
Member

thatbudakguy commented Aug 18, 2023

We do occasionally see PurlsController#update throw:

Rdkafka::RdkafkaError: Broker: Message size too large (msg_size_too_large)

Honeybadger error with a few occurrences in the last few months: https://app.honeybadger.io/projects/48916/faults/96491970

We also increased the max message size about a year ago:
https://github.com/sul-dlss/puppet/commit/12a2b064dff7f0052139a27f540936f094be364d
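
One way to surface this before the producer raises is a size check on the serialized payload. The following is only a minimal sketch, not what PurlsController does: `fits_in_message?`, `synthetic_cocina`, and `ASSUMED_MAX_MESSAGE_BYTES` are hypothetical names, the 1,000,000-byte limit is a placeholder rather than the value configured in the puppet commit above, and the document is synthetic stand-in data rather than real cocina.

```ruby
require 'json'

# Placeholder limit (an assumption): the real ceiling is whatever
# message.max.bytes is set to on the broker and the producer.
ASSUMED_MAX_MESSAGE_BYTES = 1_000_000

# Returns true when the serialized payload is at or under the limit.
# String#bytesize counts bytes, not characters, which is what Kafka measures.
def fits_in_message?(payload, limit: ASSUMED_MAX_MESSAGE_BYTES)
  payload.bytesize <= limit
end

# Synthetic stand-in for a cocina document with many file sets.
# The structure is illustrative only and much smaller per entry than real cocina.
def synthetic_cocina(file_set_count)
  {
    externalIdentifier: 'druid:xx000xx0000', # made-up identifier
    structural: {
      contains: Array.new(file_set_count) { |i| { label: "File set #{i}", files: [] } }
    }
  }
end

small = JSON.generate(synthetic_cocina(10))
large = JSON.generate(synthetic_cocina(50_000))

puts "small: #{small.bytesize} bytes, fits: #{fits_in_message?(small)}"
puts "large: #{large.bytesize} bytes, fits: #{fits_in_message?(large)}"
```

A check like this could let the caller log the druid and payload size when the message would not fit, instead of relying on the msg_size_too_large error from the broker.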

@jcoyne
Contributor Author

jcoyne commented Aug 18, 2023

It looks like druid:fv412tg4309 and druid:tz904vs1912 may be the only objects that have hit this so far. Interestingly, druid:tz904vs1912 does not have a directory in /purl/document_cache, and Argo doesn't seem to know about it either. It may have been a test object, as its title was "25K files in one with a new title".

@andrewjbtw

andrewjbtw commented Aug 18, 2023

tz904vs1912 is in the Argo stage environment
