Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AttributeError: 'Reader' object has no attribute 'shp' #120

Closed
llimllib opened this issue Mar 23, 2021 · 7 comments · Fixed by #127
Closed

AttributeError: 'Reader' object has no attribute 'shp' #120

llimllib opened this issue Mar 23, 2021 · 7 comments · Fixed by #127

Comments

@llimllib
Copy link

When I run the shapefile import example from the docs, I get a couple errors. The first two are SyntaxWarnings from topojson (would you like me to submit a PR?), and the more concerning one is an AttributeError:

$ cat test.py
import topojson as tp
import shapefile

data = shapefile.Reader("tests/files_shapefile/southamerica.shp")
topo = tp.Topology(data)
topo.toposimplify(4).to_svg()

$ python test.py
/Users/llimllib/code/topojson/topojson/core/hashmap.py:331: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if len(arc_ids) > 1 and key is not "coordinates":
/Users/llimllib/code/topojson/topojson/core/extract.py:62: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if instance(data) is "Collection":  # fiona.Collection)
Exception ignored in: <function Reader.__del__ at 0x10de9dee0>
Traceback (most recent call last):
  File "/Users/llimllib/.pyenv/versions/3.8.5/lib/python3.8/site-packages/shapefile.py", line 981, in __del__
    self.close()
  File "/Users/llimllib/.pyenv/versions/3.8.5/lib/python3.8/site-packages/shapefile.py", line 984, in close
    for attribute in (self.shp, self.shx, self.dbf):
AttributeError: 'Reader' object has no attribute 'shp'
MULTILINESTRING ((-68.63402724102814 -52.63637867677012, -66.95990668944665 -54.896837042809), (-58.42708747121122 -33.90944264935121, -68.57156806867735 -52.29946708154288), (-68.63402724102814 -52.63637867677012, -66.95990668944665 -54.896837042809), (-66.95990668944665 -54.896837042809, -71.00570191251819 -55.0538265500175, -74.66255107876859 -52.83746406636747, -68.63402724102814 -52.63637867677012), (-67.10667174017226 -22.73589990786009, -73.41542159556722 -49.31843572381318, -68.57156806867735 -52.29946708154288), (-68.57156806867735 -52.29946708154288, -75.60802796725341 -48.67380564068095, -70.37256756678305 -18.34795131496109), (-61.19998530141202 -51.85000210750209, -58.54999853309434 -51.10003186088537, -57.74997962745334 -51.54997317933467, -59.40001278521343 -52.19997914936307, -61.19998530141202 -51.85000210750209), (-53.37368295427453 -33.76837665522758, -58.42708747121122 -33.90944264935121), (-58.42708747121122 -33.90944264935121, -57.62515464474333 -30.21627640088164), (-53.37368295427453 -33.76837665522758, -57.62515464474333 -30.21627640088164), (-57.62515464474333 -30.21627640088164, -54.62529381307057 -25.73925140358909), (-54.52474294816506 2.311854223836249, -34.72999345553303 -7.343238641690284, -53.37368295427453 -33.76837665522758), (-69.52969550701798 -10.95175168384543, -58.16637410979186 -20.17670554847692), (-62.68504782009315 -22.24900787314974, -67.10667174017226 -22.73589990786009), (-67.10667174017226 -22.73589990786009, -69.59042748252497 -17.58001607922297), (-69.89362055010218 -4.298172985615032, -73.98721711285026 -7.523841221721206, -69.52969550701798 -10.95175168384543), (-69.52969550701798 -10.95175168384543, -69.59042748252497 -17.58001607922297), (-69.59042748252497 -17.58001607922297, -70.37256756678305 -18.34795131496109), (-70.37256756678305 -18.34795131496109, -80.3025489886499 -3.404823072033288), (-66.87634770700429 1.25334889889993, -69.89362055010218 -4.298172985615032), (-69.89362055010218 -4.298172985615032, -75.37322255849077 -0.1520032046413746), (-78.85525139555307 1.380941151182512, -71.33158194404345 11.77627322755168), (-60.7335954725954 5.200270618709098, -66.87634770700429 1.25334889889993), (-66.87634770700429 1.25334889889993, -71.33158194404345 11.77627322755168), (-71.33158194404345 11.77627322755168, -59.75828942780852 8.367008246561028), (-56.53940136394603 1.899544113660156, -60.7335954725954 5.200270618709098), (-60.7335954725954 5.200270618709098, -59.75828942780852 8.367008246561028), (-59.75828942780852 8.367008246561028, -57.14742133395269 5.97317344613608), (-54.52474294816506 2.311854223836249, -56.53940136394603 1.899544113660156), (-56.53940136394603 1.899544113660156, -57.14742133395269 5.97317344613608), (-57.14742133395269 5.97317344613608, -54.52474294816506 2.311854223836249), (-75.37322255849077 -0.1520032046413746, -80.3025489886499 -3.404823072033288), (-80.3025489886499 -3.404823072033288, -78.85525139555307 1.380941151182512), (-78.85525139555307 1.380941151182512, -75.37322255849077 -0.1520032046413746), (-58.16637410979186 -20.17670554847692, -54.62529381307057 -25.73925140358909), (-54.62529381307057 -25.73925140358909, -62.68504782009315 -22.24900787314974), (-62.68504782009315 -22.24900787314974, -58.16637410979186 -20.17670554847692))

After I installed fiona (which I just guessed from the code snippet above), the SyntaxErrors went away, but the AttributeError remained.

Here are the libraries I installed before trying to use topojson:

pip install topojson simplification altair geopandas ipywidgets pyshp geojson fiona

Am I missing some other library that I need?

@mattijn
Copy link
Owner

mattijn commented Mar 23, 2021

Thanks for raising the issue! I created #121 for the syntaxwarning and some other style improvements. You have installed the right packages.

I can reproduce the AttributeError, but it creates a correctly topojson. I'm a bit puzzled why the AttributeError is raised.

I tried to debug it a bit, but in the end its not much more (at least it should not be much more) than:

import topojson as tp
import shapefile
import copy

data = shapefile.Reader("tests/files_shapefile/southamerica.shp")
data = copy.deepcopy(data.__geo_interface__)
topo = tp.Topology([data])
topo.toposimplify(4).to_svg()

image

And like this it doesn't give an issue.

You also could do:

import fiona
with fiona.open("tests/files_shapefile/southamerica.shp") as data:
    topo = tp.Topology(data)
    topo.toposimplify(4).to_svg()   

or

import geopandas as gpd
data = gpd.read_file("tests/files_shapefile/southamerica.shp")
topo = tp.Topology(data)
topo.toposimplify(4).to_svg()    

Which also will not give an AttributeError.

But you are completely right about the AttributeError using shapefile (pyshp). I'm very much open for suggestions to fix it.

@llimllib
Copy link
Author

I have no idea what's going on or how to fix it, I just wanted to report it. Like you say the topo object actually looks fine so I'm just leaving the ugly error in my notebook and moving forward.

@mattijn
Copy link
Owner

mattijn commented Mar 24, 2021

Fair enough. Sorry for the not-so great aesthetic notebook pleasure.

@llimllib
Copy link
Author

No worries! Hope the report was useful and not annoying, and thanks for the library, very useful

@karimbahgat
Copy link

Came across this issue and thought it was very weird since i know for sure pyshp’s Reader should always have a .shp attribute; it’s set at init and is never explicitly deleted. But i think i figured it out. In the example, the Topology class is given the entire pyshp Reader instance as its input data. I followed the init code through several layers of subclasses until finally in the extract.Extract class it does a try-except deepcopy of the input data, before it sends it to the _extractor method where it eventually looks for the geo interface. What i think is happening is the deepcopy succeeds in copying the Reader class, except it «does not copy types like module, method, stack trace, stack frame, file, socket, window, array, or any similar types». Indeed .shp is a file object so then it probably doesn’t copy that attribute. Which is why later when the deepcopied Reader gets garbage collected and tries to close the .shp file it can’t find it and raises an exception.

I’m not fully sure why you have the deepcopy there in the first place, except for a link in the code to a geopandas github issue. Would it be possible to do a more targeted check specific to the geopandas issue, so it only does the deepcopy for that use case?

Also, i noticed there are other attempts at deepcopy of the geo interface dict later on, which i’m wondering might be an unnecessary copy of data (and might be a bottleneck/slowdown for very large datasets) since presumably all the data gets copied into numpy arrays pretty soon after anyway?

@mattijn
Copy link
Owner

mattijn commented Sep 1, 2021

Thanks so much for digging this deep.
I have to do a deepcopy otherwise I'm modifying the source data. A shallow copy can not prevent this from happening.
I think I still have situations that the source data is modified because of topojson, it should not happen.

The GeoPandas try/except part was because of GeoPandas not directly supporting deepcopy on df.copy(deep=True), this is fixed now.

For what it is worth, per documentation, https://mattijn.github.io/topojson/how-it-works.html#extract, you can skip all subclasses and work directly with the Extract class as such:

from topojson.core.extract import Extract
import shapefile

data = shapefile.Reader("tests/files_shapefile/southamerica.shp")
Extract(data)

This probably makes debugging easier.
Maybe, in case if data originates from pyshp shape file it does not need a deepcopy on the __geo_interface__? Currently there are special approaches for certain datatypes (geodataframe/geoseries etc) and if these not match it tries to detect a __geo_interface__ attribute,

I also can reproduce the pyshp AttributeError. Its really weird, look:

data = shapefile.Reader("tests/files_shapefile/southamerica.shp")
# https://github.com/mattijn/topojson/blob/master/topojson/core/extract.py#L73
copydata = data  # just reassigning, no copies 
geom = copy.deepcopy(copydata.__geo_interface__)
# BOOM:
AttributeError: 'Reader' object has no attribute 'shp'

But like this there is no raising of AttributeError

data = shapefile.Reader("tests/files_shapefile/southamerica.shp")
geom = copy.deepcopy(data.__geo_interface__)
# ALL FINE

@karimbahgat
Copy link

So I'm not sure I understand your reassigning example, but after some more tests deepcopy is definitely the culprit, but it only shows up when it's inside the try-except clause:

# No .shp error, only a deepcopy "TypeError: cannot pickle '_io.BufferedReader' object"
import shapefile, copy
r = shapefile.Reader(r"topojson\tests\files_shapefile\southamerica.shp")
copy.deepcopy(r)

# After landing in the except clause, prints 'failed', then prints the .shp error
import shapefile, copy
r = shapefile.Reader(r"topojson\tests\files_shapefile\southamerica.shp")
try: copy.deepcopy(r)
except: print('failed')

I think probably what happens is that deepcopy is able to partially copy the Reader class except for the .shp and other attributes since deepcopy cannot copy BytesIO objects. When that partially created Reader class later gets garbage collected (since the deepcopy eventually fails), which is when it runs __del__ and tries to access the .shp attribute. If you try it normally without try-except then it fails and raises an exception before the program has a chance to garbage collect. If you do it in a try-except clause, then the program is allowed to continue and the partially copied Reader instance gets garbage collected which is where it fails.

So in terms of a solution, I guess just doing a more selective check for the geopandas case around L76 (or dropping it entirely now that geopandas has fixed it) would do it? I'll probably also add to pyshp a try-except clause in __del__ for these rare cases where the Reader is attempted deepcopied.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants