Encoding slow? #7

Closed
willmcgugan opened this issue Feb 9, 2018 · 1 comment
@willmcgugan
Hi,

I've discovered that my pure-Python bencode encoder is faster than your Cython one. Depending on what is being encoded, it ranges from a few percent faster to several orders of magnitude faster. I'm not sure why that is; I would expect the Cython version to be faster in all cases.

Here's my bencode encode function.

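from operator import itemgetter


class EncodeError(ValueError):
    """Raised when a value cannot be bencoded.

    Assumed definition: the original post does not show this class.
    """


def encode(obj):
    """
    Encode data into bencode, return bytes.

    The following objects may be encoded: int, bytes, memoryview, str,
    list, tuple and dict.

    Dict keys must be bytes, and unicode strings will be encoded into
    utf-8.

    """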
    binary = []
    append = binary.append

    def add_encode(obj):
        """Encode an object, appending bytes to `binary` list."""
        if isinstance(obj, bytes):
            append(b'%i:%b' % (len(obj), obj))
        elif isinstance(obj, memoryview):
            append(b'%i:%b' % (len(obj), obj.tobytes()))
        elif isinstance(obj, str):
            obj_bytes = obj.encode('utf-8')
            append(b"%i:%b" % (len(obj_bytes), obj_bytes))
        elif isinstance(obj, int):
            append(b"i%ie" % obj)
        elif isinstance(obj, (list, tuple)):
            append(b"l")
            for item in obj:
                add_encode(item)
            append(b'e')
        elif isinstance(obj, dict):
            append(b'd')
            try:
                for key, value in sorted(obj.items(), key=itemgetter(0)):
                    append(b"%i:%b" % (len(key), key))
                    add_encode(value)
            except TypeError:
                raise EncodeError('dict keys must be bytes')
            append(b'e')
        else:
            raise EncodeError(
                'value {!r} can not be encoded in Bencode'.format(obj)
            )
    add_encode(obj)
    return b''.join(binary)
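
The core pattern in the function above is to format each piece with bytes %-formatting, collect the pieces in a list, and join them once at the end. A minimal sketch of just that pattern, assuming Python 3.5+ (where `%b` is available for bytes); the names `chunks` and `words` are illustrative only:

```python
# Collect encoded pieces in a list, then join once at the end.
chunks = []
append = chunks.append  # bind the method once to avoid repeated attribute lookups

words = (b"spam", b"eggs")  # illustrative input, not from the original post

append(b"l")  # start of a bencode list
for word in words:
    # A bencode byte string is "<length>:<payload>"
    append(b"%i:%b" % (len(word), word))
append(b"e")  # end of the list

encoded = b"".join(chunks)
print(encoded)  # b'l4:spam4:eggse'
```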
@whtsky
Owner

whtsky commented Feb 11, 2018

Hi, I just found that using array is incredibly slow.
I switched to a list and the performance looks much better now:

bencoder.pyx: 0.095112
issue-7: 0.22329099999999996
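
A rough sketch of how a list-versus-array comparison could be timed with the standard library. This is not the benchmark that produced the numbers above, and it does not reproduce how bencoder.pyx used array; the helpers `join_with_list` and `join_with_array` and the sample data are assumptions for illustration. In pure Python the gap may look quite different from what Cython code shows, so treat any output as machine-specific.

```python
import timeit
from array import array


def join_with_list(parts):
    """Accumulate chunks in a list and join once (hypothetical helper)."""
    chunks = []
    for part in parts:
        chunks.append(part)
    return b"".join(chunks)


def join_with_array(parts):
    """Accumulate bytes in an array of unsigned chars (hypothetical helper)."""
    buf = array("B")
    for part in parts:
        buf.frombytes(part)
    return buf.tobytes()


# Illustrative payload: 2000 small bencoded byte strings.
parts = [b"%i:%b" % (len(w), w) for w in [b"spam", b"eggs"] * 1000]

# Both strategies must produce identical output before timing them.
assert join_with_list(parts) == join_with_array(parts)

list_time = timeit.timeit(lambda: join_with_list(parts), number=200)
array_time = timeit.timeit(lambda: join_with_array(parts), number=200)
print(f"list:  {list_time:.6f}")
print(f"array: {array_time:.6f}")
```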

whtsky added a commit that referenced this issue Feb 11, 2018
@whtsky whtsky closed this as completed Feb 11, 2018