📦 v2023-12-07 #400

Merged
kba merged 2 commits into master from update-2023-12-06 on Dec 7, 2023

Conversation

kba (Member) commented Dec 6, 2023

Updates core to v2.59.1, which includes the workflow endpoint, additional chunking features and additional output formats for ocrd workspace list-page, and fixes both the file naming in the bagger and the filtering by file group for clone, zip bag etc.

@stweil improved the page2img script in format-converters significantly.

@mikegerber did some housecleaning on dinglehopper and ocrd_calamari.

ocrd_pagetopdf should now work properly on macOS and supports the METS Server.

workflow-configuration contains additional XSLT to detect ID clashes and add missing confidence values, supports pretty printing XML in the CLIs and supports the METS Server.

tesseract is also updated to the latest state in master.

I will merge this tomorrow, let me know if I missed something. I forgot to click on "Create pull request". Will merge ASAP once the CI is fixed.

stweil (Collaborator) commented Dec 6, 2023

It looks like CI has problems with ocr-fileformat, maybe because of stricter tests.

stweil (Collaborator) commented Dec 6, 2023

Yes, the problem is in textract2page. cc @rue-a.

textract2page$ pip install .
Looking in indexes: https://pypi.org/simple, https://code.bib.uni-mannheim.de/api/packages/stweil/pypi/simple/
Processing /UB-Mannheim/ocr-fileformat/vendor/textract2page
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [212 lines of output]
      /tmp/pip-build-env-rrv2e39h/overlay/lib/python3.11/site-packages/setuptools/config/_apply_pyprojecttoml.py:75: _MissingDynamic: `description` defined outside of `pyproject.toml` is ignored.
      !!
      
              ********************************************************************************
              The following seems to be defined outside of `pyproject.toml`:
      
              `description = 'Convert AWS Textract JSON to PRImA PAGE XML'`
      
              According to the spec (see the link below), however, setuptools CANNOT
              consider this value unless `description` is listed as `dynamic`.
      
              https://packaging.python.org/en/latest/specifications/declaring-project-metadata/
      
              To prevent this problem, you can list `description` under `dynamic` or alternatively
              remove the `[project]` table from your file and rely entirely on other means of
              configuration.
              ********************************************************************************
      
      !!
        _handle_missing_dynamic(dist, project_table)
      /tmp/pip-build-env-rrv2e39h/overlay/lib/python3.11/site-packages/setuptools/config/_apply_pyprojecttoml.py:75: _MissingDynamic: `readme` defined outside of `pyproject.toml` is ignored.
      !!
[...]      

kba (Member, Author) commented Dec 6, 2023

> Yes, the problem is in textract2page. cc @rue-a.

Yeah, and I can reproduce it locally. I will prepare a PR after the tech call.

stweil (Collaborator) commented Dec 6, 2023

See slub/textract2page#13 for a hackish fix.

kba (Member, Author) commented Dec 6, 2023

> See slub/textract2page#13 for a hackish fix.

Now updating ocrd_fileformat to include UB-Mannheim/ocr-fileformat#171 which in turn includes slub/textract2page#13 to test the CI.

kba changed the title from "📦 v2023-12-06" to "📦 v2023-12-07" on Dec 7, 2023
kba merged commit 1126724 into master on Dec 7, 2023
1 check passed
kba deleted the update-2023-12-06 branch on December 7, 2023 11:12