pyasn (asn_lookup expert) is somewhat broken #1517

Closed
gethvi opened this issue Apr 29, 2020 · 4 comments
Labels
bug Indicates an unexpected problem or unintended behavior component: contrib usability

@gethvi
Contributor

gethvi commented Apr 29, 2020

  • I encountered a problem with pyasn, partially in the scope of IntelMQ, when using the script "update-asn-data". Also, the requirements for this bot are frozen at pyasn==1.5.0b7. The script downloads the latest data (latest-bview.gz), but when it runs pyasn_util_convert.py, it fails with the following error:
root@#####:/tmp# pyasn_util_convert.py --single latest-bview.gz ipasn.dat
MRT RIB log importer 1.5.0b7
Traceback (most recent call last):
  File "/usr/local/bin/pyasn_util_convert.py", line 53, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 64, in parse_mrt_file
    mrt = MrtRecord.next_dump_table_record(mrt_file)
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 208, in next_dump_table_record
    buf = f.read(header_len)  # read table-header
  File "/usr/lib/python3.5/bz2.py", line 181, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.5/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.5/_compression.py", line 103, in read
    data = self._decompressor.decompress(rawblock, size)
OSError: Invalid data stream
  • As pyasn comes with its own utility (pyasn_util_download.py) for downloading the data, I tried that as well, using the same version (pyasn==1.5.0b7). This also failed.
root@#####:/tmp# pyasn_util_download.py --latest
Connecting to ftp://archive.routeviews.org
Finding latest RIB file in /bgpdata/2020.05/RIBS/ ...
Finding latest RIB file in /bgpdata/2020.04/RIBS/ ...
Downloading rib.20200429.1400.bz2
 100%, 11376KB/s
 Download complete.

root@#####:/tmp# pyasn_util_convert.py --single rib.20200429.1400.bz2 ipasn.dat
MRT RIB log importer 1.5.0b7
parse_mrt_file(): starting  parse for MrtTable(ts:1588168800, type:13, sub-type:1, data-len:840, seq:None, prefix:None)
Traceback (most recent call last):
  File "/usr/local/bin/pyasn_util_convert.py", line 53, in <module>
    dat = mrtx.parse_mrt_file(f, print_progress=print_progress)
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 89, in parse_mrt_file
    assert mrt.type == mrt.TYPE_TABLE_DUMP
AssertionError
  • I decided to give it a try with the latest version, pyasn==1.6.0b1, installed with pip. This attempt produced a different error (more related to pyasn itself, I believe) while parsing the data (downloaded with the update-asn-data script):
root@#####:/tmp# pyasn_util_convert.py --single latest-bview.gz ipasn.dat
Parsing MRT/RIB archive ..  MrtTD2Record (PEER-INDEX-TABLE, collector 1347535000, 120 peers)
  MRT record 100000 @39s
  MRT record 200000 @65s
  MRT record 300000 @82s
  MRT record 400000 @97s
  MRT record 500000 @112s
  MRT record 600000 @131s
  Exception parsing prefix record 185.206.208.42/32
Traceback (most recent call last):
  File "/usr/local/bin/pyasn_util_convert.py", line 71, in <module>
    skip_record_on_error=args.skip_on_error)
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 106, in parse_mrt_file
    origin = mrt.get_first_origin_as()
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 311, in get_first_origin_as
    return path.get_origin_as()
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 578, in get_origin_as
    assert origin  # eventually, should not be 0 (no asn 0), or None, or an empty set
AssertionError
  • Next I tried using the pyasn==1.6.0b1 utility for downloading the data and then converting it. This worked!
root@#####:/tmp# pyasn_util_download.py --latestv46
Connecting to ftp://archive.routeviews.org
Finding most recent archive in /route-views4/bgpdata/2020.05/RIBS ...
Finding most recent archive in /route-views4/bgpdata/2020.04/RIBS ...
Downloading ftp://archive.routeviews.org//route-views4/bgpdata/2020.04/RIBS/rib.20200429.1600.bz2
 99%, 12189KB/s
Download complete.

root@#####:/tmp# pyasn_util_convert.py --single rib.20200429.1600.bz2 ipasn46.dat
Parsing MRT/RIB archive ..  MrtTD2Record (PEER-INDEX-TABLE, collector 2162111247, 101 peers)
  MRT record 100000 @11s
  MRT record 200000 @23s
  MRT record 300000 @37s
  MRT record 400000 @49s
  MRT record 500000 @63s
  MRT record 600000 @75s
  MRT record 700000 @87s
  MRT record 800000 @99s
  MRT record 900000 @112s
IPASN database saved (878114 IPV4 + 89765 IPV6 prefixes)
root@#####:/tmp# pyasn_util_convert.py --single latest-bview.gz ipasn.dat
Parsing MRT/RIB archive ..  MrtTD2Record (PEER-INDEX-TABLE, collector 1347535000, 120 peers)
  MRT record 100000 @40s
  MRT record 200000 @72s
  MRT record 300000 @95s
  MRT record 400000 @123s
  MRT record 500000 @142s
  MRT record 600000 @162s
  MRT record 700000 @185s
  MRT record 800000 @205s
  MRT record 900000 @225s
IPASN database saved (836044 IPV4 + 90355 IPV6 prefixes)
  • Downloading the data with pyasn_util_download.py from pyasn==1.5.0b7 fetches a database with only IPv4 prefixes.
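
The first traceback above passes through bz2.py even though the input file is latest-bview.gz, which suggests the 1.5.0b7 converter handed a gzip file to the bz2 reader. The following is only a minimal sketch of telling the two formats apart by their magic bytes. It assumes nothing about pyasn's internals, and the helper name `open_mrt_archive` is hypothetical:

```python
import bz2
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
BZ2_MAGIC = b"BZh"        # first three bytes of every bzip2 stream


def open_mrt_archive(path):
    """Open a compressed archive with the reader matching its magic bytes.

    Hypothetical helper for illustration; it is not part of pyasn.
    """
    with open(path, "rb") as raw:
        head = raw.read(3)
    if head.startswith(GZIP_MAGIC):
        return gzip.open(path, "rb")
    if head.startswith(BZ2_MAGIC):
        return bz2.open(path, "rb")
    raise ValueError("unrecognised compression format: %r" % head)
```

Checking the content rather than the file extension avoids the "Invalid data stream" failure when a .gz file reaches a bz2 decompressor.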

Therefore I believe the conclusion is:

  • replace pyasn==1.5.0b7 with the latest pyasn==1.6.0b1 in requirements.txt (it does not seem to break the asn_lookup bot)
  • use pyasn_util_download.py instead of curl for downloading the data inside the update-asn-data script
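
The second point could look roughly like the sketch below, which chains the two pyasn utilities the way the successful run above did. It is not the actual update-asn-data script. The function names are hypothetical, and it assumes that both utilities are on the PATH, that the download flag is spelled `--latestv46`, and that pyasn_util_download.py saves a rib.*.bz2 file into the current working directory (as the logs above suggest):

```python
import glob
import os
import subprocess
import tempfile


def newest_rib_file(directory):
    """Return the last rib.*.bz2 file in directory by name, or None.

    The names embed a timestamp (rib.YYYYMMDD.HHMM.bz2), so sorting
    them as strings also sorts them by time.
    """
    candidates = sorted(glob.glob(os.path.join(directory, "rib.*.bz2")))
    return candidates[-1] if candidates else None


def update_asn_database(database_path, download_flag="--latestv46"):
    """Download the latest RIB archive and convert it to an IPASN database.

    database_path is resolved relative to the caller's working directory.
    """
    with tempfile.TemporaryDirectory() as workdir:
        # Assumption: the download utility writes into its working directory.
        subprocess.run(["pyasn_util_download.py", download_flag],
                       cwd=workdir, check=True)
        rib_file = newest_rib_file(workdir)
        if rib_file is None:
            raise RuntimeError("no rib.*.bz2 file found after download")
        subprocess.run(["pyasn_util_convert.py", "--single",
                        rib_file, database_path], check=True)
```

Using `check=True` makes a failing download or conversion raise an exception, so an incomplete database is not installed silently.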
@ghost

ghost commented Apr 29, 2020

Also, the requirements for this bot are frozen at pyasn==1.5.0b7.

That's definitely a bug.

* I decided to give it a try with the latest version, pyasn==1.6.0b1, installed with pip. This attempt produced a different error (more related to pyasn itself, I believe) while parsing the data (downloaded with the update-asn-data script):
root@#####:/tmp# pyasn_util_convert.py --single latest-bview.gz ipasn.dat
Parsing MRT/RIB archive ..  MrtTD2Record (PEER-INDEX-TABLE, collector 1347535000, 120 peers)
  MRT record 100000 @39s
  MRT record 200000 @65s
  MRT record 300000 @82s
  MRT record 400000 @97s
  MRT record 500000 @112s
  MRT record 600000 @131s
  Exception parsing prefix record 185.206.208.42/32
Traceback (most recent call last):
  File "/usr/local/bin/pyasn_util_convert.py", line 71, in <module>
    skip_record_on_error=args.skip_on_error)
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 106, in parse_mrt_file
    origin = mrt.get_first_origin_as()
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 311, in get_first_origin_as
    return path.get_origin_as()
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 578, in get_origin_as
    assert origin  # eventually, should not be 0 (no asn 0), or None, or an empty set
AssertionError

Known issue: hadiasghari/pyasn#62 caused by a change in the RIPE data.

* replace pyasn==1.5.0b7 with the latest pyasn==1.6.0b1 in requirements.txt (it does not seem to break the asn_lookup bot)

👍

* use pyasn_util_download.py instead of curl for downloading the data inside the update-asn-data script

That would be a sensible workaround

@gethvi
Contributor Author

gethvi commented Apr 30, 2020

After giving it a bit more thought, I came up with perhaps a cleaner solution. These update scripts (asn data, tor nodes, geoip, ..) could be part of the bot code itself (or a separate python script) and called with something like:

intelmq.bots.experts.asn_lookup.expert --update-database

The code would go something like this:

import sys

if __name__ == '__main__':
    if '--update-database' in sys.argv[1:]:
        # 1. find the correct bot-id in RUNTIME_CONF_FILE (search by module key)
        # 2. get the database key from RUNTIME_CONF_FILE (path to the file)
        # 3. download and process the new data file and place it in the database path
        # 4. reload the affected bot (bots), we know their id
        pass  # placeholder: the four steps above are not implemented here

Currently you need to run the update script and manually supply the database path (if it changes in the configuration, you need to manually update the cronjob or whatever you use to run the script). Then you need to reload the affected bot, manually again, because the bot-id can change (or edit the cronjob again). With my suggestion both of these tasks can be done automatically based on the RUNTIME_CONF_FILE. No need to manually update anything. This is what I consider the biggest advantage of my suggestion.
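
Steps 1 and 2 of that outline could be sketched as follows. This is not existing IntelMQ code. It assumes the runtime configuration is a JSON object keyed by bot-id, where each entry has a "module" string and a "parameters" object holding the "database" path. The function names are hypothetical:

```python
import json


def load_runtime_conf(path):
    """Read the runtime configuration (assumed to be a JSON file)."""
    with open(path) as handle:
        return json.load(handle)


def find_bots_by_module(runtime_conf, module):
    """Map bot-id to database path for every bot using the given module.

    Bots without a "database" parameter map to None.
    """
    return {
        bot_id: bot.get("parameters", {}).get("database")
        for bot_id, bot in runtime_conf.items()
        if bot.get("module") == module
    }
```

The returned mapping gives both pieces of information the update needs: the database paths to write to and the bot-ids to reload afterwards.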

For the geoip bot, the required MaxMind licence key could be placed in the bot configuration as well. Less mess, and no need for an environment variable.

I could look into such a solution and try to implement a PoC for one of the bots for a start. Do you think this would be interesting to the project?

@ghost ghost added bug Indicates an unexpected problem or unintended behavior component: contrib usability labels May 14, 2020
@ghost ghost added this to the 2.1.3 milestone May 14, 2020
@ghost

ghost commented May 14, 2020

I could look into such a solution and try to implement a PoC for one of the bots for a start. Do you think this would be interesting to the project?

Definitely yes! Thank you very much for the time you are putting into these efforts! As I said in #1524, this is a really nice feature which makes IntelMQ more usable.

@ghost ghost closed this as completed in e8913f9 May 14, 2020
@ghost

ghost commented Jul 9, 2020

* I decided to give it a try with the latest version, pyasn==1.6.0b1, installed with pip. This attempt produced a different error (more related to pyasn itself, I believe) while parsing the data (downloaded with the update-asn-data script):
root@#####:/tmp# pyasn_util_convert.py --single latest-bview.gz ipasn.dat
Parsing MRT/RIB archive ..  MrtTD2Record (PEER-INDEX-TABLE, collector 1347535000, 120 peers)
  MRT record 100000 @39s
  MRT record 200000 @65s
  MRT record 300000 @82s
  MRT record 400000 @97s
  MRT record 500000 @112s
  MRT record 600000 @131s
  Exception parsing prefix record 185.206.208.42/32
Traceback (most recent call last):
  File "/usr/local/bin/pyasn_util_convert.py", line 71, in <module>
    skip_record_on_error=args.skip_on_error)
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 106, in parse_mrt_file
    origin = mrt.get_first_origin_as()
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 311, in get_first_origin_as
    return path.get_origin_as()
  File "/usr/local/lib/python3.5/dist-packages/pyasn/mrtx.py", line 578, in get_origin_as
    assert origin  # eventually, should not be 0 (no asn 0), or None, or an empty set
AssertionError

Known issue: hadiasghari/pyasn#62 caused by a change in the RIPE data.

FWIW: That bug is fixed in the current master branch of the pyasn project and will be part of the next pyasn release.

This issue was closed.