Cleanup Pass #124
Conversation
Everything looks good code-wise, I just have to ask: why are some of your commits using the angular prefixes and some aren't?
Some tests are failing in tox:
Not familiar with the term? Also, I have another commit to add: I'd added a UNIQUE(fdev_id) to StationItem and dumpPrices did not like that. Saw your fix on the supply levels, good catch; also saw you had to fix my whitespace - I'm trying to turn off the settings I have locally to strip trailing whitespace. LMK if I'm still doing it.
Do you have contributor docs for setting up tox / etc?
@eyeonus BTW - what IDE/tooling are you using? (I mostly use VSCode, but when I'm trying to be good about python I use PyCharm, and I'm just setting that up for td now)
Looking at the errors: it looks like we were never actually running flake8 except as part of running all the tests. Once flake8 is enabled ... it warns. A lot -- see the incoming extra change to this branch.
One thing a lot of python linters try to enforce is the distinction between keyword arguments and ordinary assignment statements: the standard formatters/styles expect keyword assignments to have no space around the equals, and the same goes for default arguments, unless there's a type annotation, in which case the spaces come back (see the sketch below).
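To illustrate, a generic PEP 8 / flake8 example rather than code from this branch:
```python
# flake8's E251/E252 checks: no spaces around '=' for keyword arguments and
# plain defaults, but spaces when the parameter carries a type annotation.
def fetch(url, timeout=30):                     # default argument: no spaces
    return (url, timeout)


def fetch_typed(url: str, timeout: int = 30):   # annotated default: spaces
    return (url, timeout)


fetch("https://example.com", timeout=10)        # keyword argument: no spaces
result = fetch("https://example.com")           # ordinary assignment: spaces
```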
I'm happy to fix up code or configure linters - just lmk your preferences.
Running tox with this branch will warnapalooza at ya.
I'm at work right now, so just as a quick reply: I'm currently using Eclipse with PyDev, planning on switching over to PyCharm when I have the free time. Angular is a commit commenting style that python-semantic-release uses to determine whether to increment the version number, whether to publish a release, to automatically annotate the release with what changes occurred since the last release, etc.
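Hypothetical examples of the Angular-style prefixes python-semantic-release keys off; the exact bump each prefix triggers depends on the project's configuration:
```
feat(eddblink): handle UNIQUE(fdev_id) in StationItem   # typically a minor version bump
fix(tradecalc): correct supply-level handling           # typically a patch version bump
chore: strip trailing whitespace                        # typically no release
```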
Did not know about that, will adjust accordingly. Like it. I was doing a burn-thru of warnings and errors, getting myself back to grips with the code and tooling - I hopefully got us closer to being able to use pylint, which has saved my bacon hard several times, even though it's a pain sometimes. Tomorrow I'll actually fix the errors - I think I know what they are already, and I wanted to have my IDE running so I could step thru meaningfully and see for sure :)
I'll fix the commit messages in the morning. For now, I just checked in a fix for what was causing the test failures.
force-pushed from 0f93713 to c9b28b9
That should clean up the descriptions.
Regarding the fdev_id in the Item table: we're using the fdev_id as the item_id nowadays, so it wouldn't be a bad idea to refactor the fdev_id away completely. I believe the only things that use the fdev_id are my eddblink plugin, the spansh plugin, and the listener, which is a different repo.
On another note, that's a lot of work done while I was at work. Color me impressed. And thanks for cleaning up the commit messages.
Just a heads-up, someone wrote commandenv.colorize() and used it to color the output in tradecalc; not sure if that'll interfere with the rich coloring you've been doing.
@eyeonus I saw that, and left it alone -- the colorizing is just automatic by writing through rich, and the colorize implementation in tradecalc could be replaced with use of the rich theming. Also - the build failure above is failing on deploying to pypi, which I would hope I'm guaranteed to fail on from my branch :)

Right now I'm looking at some ideas from the "1 billion rows challenge" and how people solved that with python. The first kick in the teeth was a nice simple one: it's way faster to process files in binary or even ascii than utf-8. In the code where we're already using "blocks()" to speed it up, you can make it 8x faster by opening the file in 'rb' and then counting b'\n' (see the sketch below). I'm doing more tests.

Also - I saw the optimization to go back from building the list to periodically closing the database. Agh. We're caught here between just needing somewhere to store the data and using SQL because we have a database as our store. The idea really was that people would go thru TradeDB directly, with sqlite being a fallback way to access the data if something was amiss or went wrong. I wonder if we shouldn't just get rid of the DB back end entirely and just pickle the python data. That used to be slow but it's fast as fek these days, and it would mean one source of truth rather than two. We already make it so that during imports you build a -new- database with the changes, so this wouldn't be any different. It would mean that you couldn't inspect the data with sqlite or anything.
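A minimal sketch of that line-counting trick, with an assumed helper name and block size rather than the branch's actual code:
```python
# Count lines by reading raw bytes and counting b'\n', skipping UTF-8 decoding
# entirely; on large dumps this runs several times faster than text-mode iteration.
def count_lines(path, block_size=1 << 20):
    total = 0
    with open(path, "rb") as fh:
        while True:
            block = fh.read(block_size)
            if not block:
                break
            total += block.count(b"\n")
    return total
```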
Yup. That one only "succeeds" if the branch is
As far as the DB is concerned, everything in TD uses it, including TradeDB. The csv and prices files exist solely for rebuilding the DB if something goes wrong and it gets corrupted, lost, etc.
Also: https://docs.python.org/3/library/pickle.html:
Yeah, we wouldn't be replacing the exchange formats with pickle. It's just a super fast way to "freeze" a snapshot of python's memory representation of objects and thaw it back out later, something along the lines of the sketch below.
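A stand-in example (not the original snippet from this comment) of that freeze/thaw round trip, using a made-up `systems` structure:
```python
# Dump an in-memory structure to disk and load it back; HIGHEST_PROTOCOL picks
# the fastest, most compact binary format available.
import pickle

systems = {"Sol": {"stations": ["Abraham Lincoln", "Daedalus"]}}

with open("systems.pickle", "wb") as fh:
    pickle.dump(systems, fh, protocol=pickle.HIGHEST_PROTOCOL)

with open("systems.pickle", "rb") as fh:
    restored = pickle.load(fh)

assert restored == systems  # the snapshot round-trips intact
```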
But I don't think it does as good a job at save/restore for entire object trees (as opposed to flat lists/tables of complex objects), which is what we'd end up with if System holds a direct link to Station rather than just an id reference. I took a look at what it would require to pull off sanely, and the work to describe the structures nicely and modern-pythonically also takes you most of the way towards what would need doing to describe the schema to SQLAlchemy, at which point, when SQLite is a bottleneck, people could simply swap the backend engine.
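A rough sketch of what "describing the schema to SQLAlchemy" could look like; the class and column names here are illustrative guesses, not TradeDangerous's actual schema:
```python
# Declarative ORM models: relationship() gives System a direct link to its
# Stations, while ForeignKey keeps the id reference for the SQL side.
from sqlalchemy import ForeignKey
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship


class Base(DeclarativeBase):
    pass


class System(Base):
    __tablename__ = "System"
    system_id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    stations: Mapped[list["Station"]] = relationship(back_populates="system")


class Station(Base):
    __tablename__ = "Station"
    station_id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    system_id: Mapped[int] = mapped_column(ForeignKey("System.system_id"))
    system: Mapped[System] = relationship(back_populates="stations")
```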
Let me just say, from my attempts to rein in the import time on eddblink with what is now a 2.5 GB file, that SQL is definitely a bottleneck. It runs pretty quickly for the first 22,000,000 entries or so, but once the DB is over 3 GB in size, it slows way down. I mean, it gets slightly slower the whole way through (inserting item 20,000 takes slightly longer than inserting item 10,000), but the difference becomes noticeable around 50% and gets worse from there.
- Adds Progresser to spansh_plugin as an experimental UI presentation layer using rich,
  - active progress bars with timers that run asynchronously (so they shouldn't have a significant performance overhead),
  - opt-out to plain-text mode in case it needs turning off quickly,
- presents statistics with a view to giving you a sense of progress,
- small perf tweaks
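For a sense of what the rich side of that looks like, a generic sketch rather than the plugin's actual Progresser class:
```python
# A rich progress bar with an elapsed-time column; rich refreshes the display
# on its own thread, so the work loop only pays for cheap update() calls.
import time
from rich.progress import Progress, BarColumn, TextColumn, TimeElapsedColumn

with Progress(
    TextColumn("[bold blue]{task.description}"),
    BarColumn(),
    TimeElapsedColumn(),
) as progress:
    task = progress.add_task("Importing stations", total=1000)
    for _ in range(1000):
        time.sleep(0.001)              # stand-in for real import work
        progress.update(task, advance=1)
```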
The idea is that flake8 is really fast, so you give it its own environment and it comes back and says "YA MADE A TYPO OLIVER YA DID IT AGAIN WHY YOU ALWAYS..." *cough*, sorry, anyway. I usually run tox one of two ways:
```
tox -e flake8   # for a quick what-did-I-type-wrong check
tox --parallel  # run 'em all, but, like, in parallel so I don't retire before you finish
```
turning on flake8 made linters very shouty
flake8 now runs in its own environment that is JUST flake8, so it's fast and doesn't install the package for itself; added lots of flake8 ignores for formatting issues I'm unclear on
added a lot more ignores and some file-specific ignores, but generally got it to a point where flake8 becomes proper useful.
This fixes all of the tox warnings that I haven't disabled, and several that I did.
fixed a few actual problems while I was at it.
When I first created the 'str' methods, it was because I wanted a display name but repr and str both confused me at the time; I've now renamed it to text, but apparently I forgot that in Py2.x def str() worked like __str__...
force-pushed from 5137077 to 0396566
Nothing heavy, just some routine commands I could run on each box to make sure the data didn't just get ingested but it also didn't get digested ;)
You could run the commands tox does, to check nothing is broken?
Maybe ask around on the EDCD discord? I'm sure the people there would have lots of suggestions.
I haven't been able to test on the mac; it takes forever downloading the json file, and without progress updates: the http headers tell us the gzip'd size, but requests decompresses the file as it goes, and that .json file is, uhm, let's say very compressible? It's not compressing by a percentage, it's nearly compressing by an order of magnitude :)
Yeah, it'd be nice if the download had progress updates. Also, yeah, 1.4 GB -> 8.9 GB is quite the compression. |
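One possible way to get those progress updates despite the decompression, sketched with a placeholder URL and filename rather than the plugin's actual download code: compare response.raw.tell(), which counts compressed bytes pulled off the wire, against the Content-Length header.
```python
# Stream the download, writing decompressed chunks, while reporting progress
# against the gzip'd size the server advertised in Content-Length.
import requests

url = "https://example.com/galaxy_stations.json"   # placeholder URL
with requests.get(url, stream=True) as response:
    response.raise_for_status()
    compressed_total = int(response.headers.get("Content-Length", 0))
    with open("galaxy_stations.json", "wb") as out:
        for chunk in response.iter_content(chunk_size=1 << 20):
            out.write(chunk)                        # decompressed data
            if compressed_total:
                done = response.raw.tell()          # compressed bytes read so far
                print(f"\rdownloaded {done / compressed_total:6.1%}", end="")
print()
```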
This will conflict after you merge the transfer-timeouts fix, but after that, if you're ready to merge this one, I think she's good.
Something got borked:
That's the first commodity in the file, and it has the correct format. |
I'm fairly certain it's from the changes made to |
Yup, that was it. Reverting it fixed the problem. |
I'm trying to revisit various decisions I'd made during the early days that now cause problems either in performance with IDEs or get linters really bent out of shape.
There's also some work here to try and beautify things - although I'm introducing those sparingly and doing them in-place in the plugin where they seem relevant.
Some small amount of performance tuning, but it's not really going to make a noticeable difference yet.
Strongly recommend you review the individual changes; I think trying to read the entire diff would be brain-hurting...
I'm creating a pull request before I've had a chance to finish watching a complete import, mostly because of the size, and to give Eyeonus a chance to ask others to review and/or reject/request changes.