Document limitations of Bounter #37
merge Rare into aneesh
Thanks @aneesh-joshi. @piskvorky, any comments?
README.md
Outdated
```python
from bounter import bounter
bounts = bounter(size_mb=1)
bounts.update(str(i) for i in range(1000000))
Add another 0 (`10000000`), to make it clear this is not related to 1 MB.
README.md
Outdated
## When not to use Bounter?
Beware, Bounter is only a probabilistic frequency counter and cannot be relied on for fine counting. (You can't expect a data structure with finite size to hold infinite data.)
`fine` => `exact`
README.md
Outdated
0
```

Please use `Counter` or `dict` when such fine counts matter. When they don't matter, like in most NLP applications with a huge corpora, Bounter is a very good alternative.
`fine` => `exact`.
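To make the "exact counts" point concrete, here is a minimal sketch of why any fixed-capacity counter has to give up exactness. It is a toy, not Bounter's actual algorithm: `BoundedCounter`, `max_keys`, and the pruning rule (drop the lowest-count half, oldest first on ties) are hypothetical names and choices made only for illustration.

```python
from collections import Counter


class BoundedCounter:
    """Toy fixed-capacity counter (hypothetical; NOT Bounter's implementation)."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.counts = {}

    def update(self, items):
        for item in items:
            # Make room before admitting a new key once capacity is reached.
            if item not in self.counts and len(self.counts) >= self.max_keys:
                self._prune()
            self.counts[item] = self.counts.get(item, 0) + 1

    def _prune(self):
        # Evict the lowest-count half of the keys; ties evict the oldest first
        # (sorted() is stable and dicts preserve insertion order).
        by_count = sorted(self.counts.items(), key=lambda kv: kv[1])
        self.counts = dict(by_count[len(by_count) // 2:])

    def __getitem__(self, item):
        # Evicted keys and never-seen keys both report 0.
        return self.counts.get(item, 0)


def stream():
    # 10,000 distinct rare tokens, with the frequent token "the" mixed in.
    for i in range(10_000):
        yield str(i)
        if i % 10 == 0:
            yield "the"


exact = Counter(stream())
bounded = BoundedCounter(max_keys=1_000)
bounded.update(stream())

print(exact["100"], bounded["100"])  # 1 0    (the rare token was evicted)
print(exact["the"], bounded["the"])  # 1000 1000  (the frequent token survived)
```

In this sketch the frequent token keeps its count while rare tokens silently drop to 0, which is the kind of behaviour the README warning is about; `Counter` or `dict` remain the right tools when every count must be exact.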
README.md
Outdated
0
```

Please use `Counter` or `dict` when such fine counts matter. When they don't matter, like in most NLP applications with a huge corpora, Bounter is a very good alternative.
…like in most NLP and ML applications with huge datasets, …
README.rst
Outdated
When not to use Bounter?
------------------------

Beware, Bounter is only a probabilistic frequency counter and cannot be relied on for fine counting. (You can't expect a data structure with finite size to hold infinite data.)
Is this file auto-generated? Or why are we keeping two parallel READMEs?
Anyway, the same changes here please.
In this case, I propose keeping only the .rst version (because it is used for PyPI too) and dropping the Markdown version (so that we don't have to maintain two almost identical files).
I wonder if they're out of sync already… @aneesh-joshi can you check?
@piskvorky I will check. The .rst is not auto-generated; I edited it manually.
Good idea! Thanks.
@piskvorky The README .rst and .md are indeed out of sync. I still think we should keep only one version (create the best .rst version and drop the Markdown version to avoid a "de-sync" state).
Did we already discuss this? To me a single version (
I don't remember (I'm not even sure we discussed this). Maybe @isamaru remembers? Right now I see no benefit in maintaining both.
@menshikh-iv @piskvorky Some questions:
README.md
Outdated
bounts['100']
0
```

- Please use `Counter` or `dict` when such fine counts matter. When they don't matter, like in most NLP applications with a huge corpora, Bounter is a very good alternative.
+ Please use `Counter` or `dict` when such exact counts matter. When they don't matter, like in most NLP and ML applications with a huge datasets, Bounter is a very good alternative.
`a huge datasets` => `huge datasets`.
Shoot! Missed that somehow.
README.rst
Outdated
bounts['100']
0

- Please use ``Counter`` or ``dict`` when such fine counts matter. When they don't matter, like in most NLP applications with a huge corpora, Bounter is a very good alternative.
+ Please use ``Counter`` or ``dict`` when such exact counts matter. When they don't matter, like in most NLP and ML applications with a huge datasets, Bounter is a very good alternative.
Ditto.
Yes
Does http://pandoc.org/try/ work OK?
@menshikh-iv but the generated file has a huge diff compared with the original .rst
@aneesh-joshi that's expected: if you look at the commit history, you'll see a huge diff between these two files. Here you need to do some manual work ("join" both into one file).
@menshikh-iv
@aneesh-joshi okay, but please raise an issue about syncing
@menshikh-iv
ping @menshikh-iv @piskvorky
Thanks @aneesh-joshi 🚀
Credit goes to the people on this page.
I merely worded their concerns.
Fixes #36.
CC: @menshikh-iv