Convert numpy.floating values in meta.json #13644

honnibal · 2024-09-30T20:27:28Z

Ports over a numpy v2 compatibility change from v3.8
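To illustrate the kind of conversion the title describes, here is a minimal sketch and not the PR's actual diff. The helper name `convert_numpy_floats` and the sample `meta` dictionary are hypothetical. It assumes the goal is to replace `numpy.floating` scalars with built-in floats before the metadata is written out as JSON, since the standard `json` module rejects scalars such as `numpy.float32`.

```python
import json

import numpy as np


def convert_numpy_floats(value):
    """Recursively replace numpy.floating scalars with built-in floats.

    Hypothetical helper for illustration; not taken from the spaCy codebase.
    """
    if isinstance(value, np.floating):
        return float(value)
    if isinstance(value, dict):
        return {key: convert_numpy_floats(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples become lists, which is what JSON would produce anyway.
        return [convert_numpy_floats(item) for item in value]
    return value


# Made-up metadata containing numpy scalars, as scores from evaluation might.
meta = {"performance": {"ents_f": np.float32(0.5), "speed": [np.float64(1234.5)]}}
clean = convert_numpy_floats(meta)
serialized = json.dumps(clean)
```

Converting at the point where the values enter the metadata keeps the serialization code itself unchanged.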

* Add workflow files for cibuildwheel
* Add config for cibuildwheel
* Set version for experimental prerelease
* Try updating cython
* Skip 32-bit windows builds
* Revert "Try updating cython"
  This reverts commit c1b794a.
* Try to import cibuildwheel settings from previous setup


Add a context manager, nlp.memory_zone(), which will begin memory_zone() blocks on the vocab, string store, and potentially other components. Example usage:

```
with nlp.memory_zone():
    for doc in nlp.pipe(texts):
        do_something(doc)
# do_something(doc) <-- Invalid
```

Once the memory_zone() block expires, spaCy will free any shared resources that were allocated for the text processing that occurred within the memory zone. If you create Doc objects within a memory zone, it is invalid to access them once the memory zone has expired.

The purpose of this is that spaCy creates and stores Lexeme objects in the Vocab that can be shared between multiple Doc objects. It also interns strings. Normally, spaCy can't know when all Doc objects using a Lexeme are out of scope, so new Lexemes accumulate in the vocab, causing memory pressure.

Memory zones solve this problem by telling spaCy "okay, none of the documents allocated within this block will be accessed again". This lets spaCy free all new Lexeme objects and other data that were created during the block.

The mechanism is general, so memory_zone() context managers can be added to other components that could benefit from them, e.g. pipeline components.

I experimented with adding memory zone support to the tokenizer as well, for its cache. However, this seems unnecessarily complicated. It makes more sense to just put a limit on the cache size. This lets spaCy make better use of the cache's efficiency advantage, because we can maintain a (bounded) cache even if only small batches of documents are being processed.
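The mechanism described in this commit message can be sketched with a toy interning store. This is an illustration only, under the assumption that entries created inside a zone are tracked separately and dropped when the zone exits. `ToyStringStore` and its attributes are hypothetical names and do not reflect spaCy's actual Vocab or StringStore implementation.

```python
from contextlib import contextmanager


class ToyStringStore:
    """Toy interning store with a memory zone (illustrative, not spaCy's code)."""

    def __init__(self):
        self._permanent = set()  # entries that live as long as the store
        self._transient = None   # a set while inside a zone, otherwise None

    def add(self, text):
        # Strings already stored permanently are never duplicated in the zone.
        if text in self._permanent:
            return text
        target = self._permanent if self._transient is None else self._transient
        target.add(text)
        return text

    def __contains__(self, text):
        in_zone = self._transient is not None and text in self._transient
        return text in self._permanent or in_zone

    def __len__(self):
        return len(self._permanent) + len(self._transient or ())

    @contextmanager
    def memory_zone(self):
        if self._transient is not None:
            raise RuntimeError("memory zones cannot be nested in this sketch")
        self._transient = set()
        try:
            yield self
        finally:
            # Everything interned during the zone is released here.
            self._transient = None


store = ToyStringStore()
store.add("the")                  # added outside any zone, so it is kept
with store.memory_zone():
    store.add("xylophone")        # added inside the zone, so it is transient
    size_inside = len(store)
size_after = len(store)
```

After the block exits, only the entry added outside the zone remains, which is why anything that still refers to zone-allocated data must not be used afterwards.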


svlandeg and others added 30 commits May 15, 2024 12:11

Bump version to 3.7.5 (#13493)

82fc2ec

Remove typing-extensions from requirements (#13516)

a6d0fc3

Disable extra CI

f78e5ce

Add case study [ci skip]

8cda27a

Set version to 3.7.6

319e025

Added gd language folder (#13570)

55db9c2

Implemented a foundational Scottish Gaelic (gd) language option with tokenizer_exceptions and stop_words files.

Add Kurdish Kurmanji language (#13561)

acbf2a4

* Add Kurdish Kurmanji language
* Add lex_attrs

add Tibetan (#13510)

608f65c

Set version to v3.8.0.dev0

b65491b

Format

59ac7e6

Fix memory zones

a019315

Format

4cc3ebe

Delete unnecessary method (#13441)

b18cc94

Co-authored-by: marinelay <[email protected]>

added gliner-spacy to universe (#13417) [ci skip]

5a7ad55

Co-authored-by: Sofie Van Landeghem <[email protected]> Co-authored-by: Ines Montani <[email protected]>

Added: Constituent-Treelib to: universe.json (#13432) [ci skip]

54dc4ee

Co-authored-by: Halvani <>

universe-package-quelquhui (#13514) [ci skip]

0190e66

Co-authored-by: Ines Montani <[email protected]>

universe-project-presque (#13515) [ci skip]

081e4e3

Co-authored-by: Ines Montani <[email protected]>

added bagpipes-spacy to universe (#13425) [ci skip]

89c1774

Co-authored-by: Ines Montani <[email protected]>

updated universe for number spacy (#13424) [ci skip]

7fbbb20

Co-authored-by: Ines Montani <[email protected]>

added spacy annoy to universe (#13416) [ci skip]

c80dacd

Co-authored-by: Ines Montani <[email protected]>

added spacy whisper to universe (#13418) [ci skip]

f1a5ff9

Co-authored-by: Ines Montani <[email protected]>

Added Date spaCy to universe (#13415) [ci skip]

30f1f33

Co-authored-by: Ines Montani <[email protected]>

Update numpy pin

184e508

Fix dependencies

c068e1d

Try enabling macos-14 for arm builds

1869a19

Set version to v3.8.0

b427597

Set version to v3.8.1

69ecb85

Update cibuildwheel

419bfaf

honnibal added 14 commits September 13, 2024 12:35

Remove aarch

83b4015

Fix thinc pin

a0ce61f

Try skipping 686

3a635d2

Skip running tests on PRs

8dcc4b8

Merge numpy version update

8266031

Merge branch 'master' of https://github.com/explosion/spaCy

50aa3b5

Set version to v3.7.7

3c3d750

Format

e2dc9b7

Lint

2f1e7ed

Replace numpy floats in evaluate and update

a9ed8bb

Fix numpy floating values in meta.json for serialization

57cbac7

Format

5dde59a

Add missing import

e36a178

Remove obsolete python versions from tests

07dba26

honnibal changed the base branch from master to v4 October 1, 2024 21:50

honnibal closed this Oct 1, 2024
