Why yet another bencode package in Python?
Because I need a bencode library that fully validates its inputs, both encoded bencode bytes and Python objects to be encoded,
and that does not decode bencode bytes to str by default.
Bencode has no UTF-8 str type, only bytes. Many decoders try to decode bytes to str and fall back to bytes; this package does not. It always parses a bencode bytes value as Python bytes.
It may be tempting to parse all dictionary keys as strings, but in a BitTorrent v2 torrent the keys of the pieces root dictionary are still SHA-256 hashes rather than ASCII/UTF-8 strings.
If you prefer strings as dictionary keys, write a dedicated function to convert the parsing result.
Also be careful! Even a file name or torrent name may not be a valid UTF-8 string.
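
A conversion function of that kind could look like the minimal sketch below. The name `keys_to_str` and its `encoding` parameter are hypothetical and not part of bencode2. It assumes you want keys decoded where possible and left as raw bytes where decoding fails, which is what happens for hash keys that are not valid UTF-8. Values are never touched, so names stay as bytes.

```python
def keys_to_str(value, encoding="utf-8"):
    """Recursively decode dictionary keys to str, leaving all values as they are.

    Hypothetical helper, not part of bencode2.
    """
    if isinstance(value, dict):
        converted = {}
        for key, item in value.items():
            if isinstance(key, bytes):
                try:
                    key = key.decode(encoding)
                except UnicodeDecodeError:
                    pass  # keep raw bytes, e.g. binary hash keys
            converted[key] = keys_to_str(item, encoding)
        return converted
    if isinstance(value, list):
        return [keys_to_str(item, encoding) for item in value]
    return value  # int or bytes: returned unchanged
```

Note that a binary key can happen to be valid UTF-8, so a stricter version might skip known binary dictionaries by name instead of relying on the decode error.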
This package is written in C++ for CPython.
A pure Python wheel, bencode2-${version}-py3-none-any.whl, is still published on PyPI,
which means you can use it on non-CPython implementations with the same behavior.
pip install bencode2
import bencode2
assert bencode2.bdecode(b"d4:spaml1:a1:bee") == {b"spam": [b"a", b"b"]}
assert bencode2.bencode({'hello': 'world'}) == b'd5:hello5:worlde'
bencode type | python type |
---|---|
integer | int |
string | bytes |
array | list |
dictionary | dict |
Bencode has 4 native types: integer, string, array and dictionary.
This package decodes integer to int, array to list and dictionary to dict.
Because a bencode string is not defined as a UTF-8 string and may contain raw bytes, bencode2 decodes a bencode string to Python bytes.
python type | bencode type |
---|---|
bool | integer 0/1 |
int, enum.IntEnum | integer |
str, enum.StrEnum | string |
bytes, bytearray, memoryview | string |
list, tuple, NamedTuple | array |
dict, OrderedDict | dictionary |
types.MappingProxyType | dictionary |
dataclasses | dictionary |
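
To make the mapping concrete, here is a small pure Python sketch that follows the table row by row. It is an illustration only, not how bencode2 is implemented: `encode_sketch` is a hypothetical name, and the sketch assumes str is encoded as UTF-8 and skips the stricter validation the real package performs (for example, it does not detect duplicate keys).

```python
import dataclasses
from collections.abc import Mapping


def encode_sketch(obj):
    """Encode obj to bencode bytes following the type table (illustration only)."""
    if isinstance(obj, int):  # bool and enum.IntEnum are int subclasses
        return b"i%de" % int(obj)
    if isinstance(obj, str):  # enum.StrEnum members are str instances
        obj = obj.encode("utf-8")  # assumption: str is encoded as UTF-8
    if isinstance(obj, (bytes, bytearray, memoryview)):
        raw = bytes(obj)
        return b"%d:%s" % (len(raw), raw)
    if isinstance(obj, (list, tuple)):  # NamedTuple is a tuple subclass
        return b"l" + b"".join(encode_sketch(item) for item in obj) + b"e"
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        # dataclass instance: field names become dictionary keys
        obj = {f.name: getattr(obj, f.name) for f in dataclasses.fields(obj)}
    if isinstance(obj, Mapping):  # dict, OrderedDict, types.MappingProxyType
        items = []
        for key, value in obj.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be str or bytes")
            items.append((key, value))
        items.sort(key=lambda pair: pair[0])  # bencode dictionaries have sorted keys
        body = b"".join(encode_sketch(k) + encode_sketch(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError(f"unsupported type: {type(obj).__name__}")
```

With this sketch, `encode_sketch({'hello': 'world'})` gives `b'd5:hello5:worlde'`, the same bytes as the `bencode2.bencode` example earlier in this README.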
bencode2 has a free-threading wheel on PyPI, built with the GIL disabled.
When encoding or decoding, it does not acquire the GIL and may call non-thread-safe C APIs, which means it is the caller's responsibility to ensure thread safety.
When calling bencode, it is safe to encode the same object from multiple threads,
but it is not safe to encode an object while another thread changes it.
When decoding, bytes objects are immutable, so they are safe to use from multiple threads,
but memoryview and bytearray may not be. Make sure the underlying data does not change while decoding.
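
One way to meet that requirement for a mutable buffer is to take an immutable copy before decoding. The sketch below assumes every thread that writes to the buffer holds the same lock. `buffer_lock` and `snapshot` are hypothetical names and not part of bencode2.

```python
import threading

# Hypothetical lock: every thread that writes to the buffer must hold it too.
buffer_lock = threading.Lock()


def snapshot(buffer):
    """Return an immutable bytes copy of a bytearray or memoryview."""
    with buffer_lock:
        return bytes(buffer)
```

The result can then be decoded without further coordination, for example `bencode2.bdecode(snapshot(buf))`. The cost is one extra copy of the data.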
This project uses meson for building.
To test the pure Python library, make sure all so/pyd files in src/bencode2 are removed, then run
PYTHONPATH=src pytest --assert-pkg-compiled=false
To test the native extension, note that meson-python does not provide an equivalent of
python setup.py build_ext --inplace
so you will need to run commands like these:
meson setup build
meson compile -C build
ninja -C build copy
ninja builds the so/pyd with meson and copies it to src/bencode2.
Then run the tests with
PYTHONPATH=src pytest --assert-pkg-compiled=true