-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature Request - State value returned on query #102
Comments
I attempted to make the adjustments by looking at this commit as an example. After making modifications, the Looking deeper, and finding NominatimConnector.java, I now feel as though the I've created a fork, if you're interested in taking a look. Java is definitely not my strong suit, so I'll hold off on the pull request. |
I see that state attribute is necessary for the US, I'll have a look on it. |
Awww, I was hoping this would get the 1.1 Milestone tag :P Either way, I'm glad to see you working on photon. It's been very helpful :D |
@lonvia: we are using nominatim's get_addressdata to annotate a placex entry with additional information (like city, street, postcode). Could you give us a hint on how to extract the state information? see also AddressRow model |
@lonvia: btw, @daryltucker was already working on it: https://github.com/daryltucker/photon/blob/d8dc67b8546d4a5862bdfb1d532e2d8a4135934d/src/main/java/de/komoot/photon/importer/nominatim/model/AddressRow.java#L30 maybe this is already the way to go? |
@christophlingg the patch solves half of the problem. Like is_city(), it should also check for boundaries with the correct admin_level (it's 4 for state IIRC). |
thanks for your help! yes it should be 4. the wiki page says 8 but I remember there was a special nominatim mapping that doubled the values. |
Jup, rank_search and rank_address double admin_levels. admin_level contains the original values as it was mapped. |
he @daryltucker , i created a new branch and committed all necessary changes. Do you have time for running a test import and verify that results are as expected? |
@christophlingg Exciting news! I'll be able to run some tests in the morning. Thanks again! |
I checked the state feature with my local dataset – only iceland though. It was working as expected, however we should make a world-wide test to see if we the state property is set right in all countries. Photon has a json export feature: we can export nominatim data to a json file which can then be used to import it on other machines. I do not have a world-wide nominatim setup, but I could import this json data on photon.komot.de (on a different port) and we can evaluate the data together. @daryltucker do you have nominatim running with world-wide data, otherwise maybe @lonvia could help us out - once more. I am also curious how her changes improved the performance. I uploaded a special jar for this branch: http://photon.komoot.de/data/photon-0.2-SNAPSHOT.jar json export works like this:
|
@christophlingg I can do another export. In fact, I've already one lying around done with the performance improvements (just still have a logistics problem when it comes to 5GB files). Export now takes somewhat around a day, much better than before. I'll redo the export with the states and see to it that I put it somewhere accessible next week. |
great! I can get you an account on photon.komoot.de, so you could transfer it via scp. If that helps? |
Doesn't help much, I'm afraid. The problem is on the other side. It's called Telekom. |
@christophlingg My instance is Global OSM, yes. Plans on testing over the weekend. |
@daryltucker I'll leave the creation of the dump to you then. |
Thanks for providing the special snapshot. I'll provide the json file when it's complete. I'll send you a key for scp (via email). |
After an hour, I have no additional logging shown on my terminal and the Should I keep going? It is slowing down the server tremendously. |
It starts with sequentially scanning placex, which takes some time and server resources. It should be finished soon with that and start to show some progress. |
Good news :D |
Compressing and sending your way very shortly. |
👍 |
Any word/updates? |
I checked if for places in Austria, Germany and USA and results were as expected. |
@christophlingg I assume this is different to the change that will add the county to the results for UK searches for #120? |
Woohoo! I am super-excited to get to looking at this new feature! |
I checked various countries and the state attribute was always set right. As far as this is concerned I would not hesitate to make a release with this new feature. However, I investigated what @amnesia7 was reporting, I rewrote some parts of the importer and I unfortunately a bug was introduced. Sometimes Nominatim marks several relations being a city for a place. In the case of trafalgar square it is Westminster and Greater London. Now Westminster is mistakenly assigned as city of all places in London. This is a data bug we don't want to have in the next release. I don't really how this wrong assignment could have happened, I rewrote the code slightly but it has not changed: before -> after Maybe @lonvia can help us here? I knew we talked about this problem when we implemented the code in Berlin. Maybe due to the speed improvements the ordering of the address lines has changed and that's why the last address row is not the right city any more? It's a pity that we encounter this bug short before getting finished ;-( Probably we need to fix the bug and export the json just another time. |
@christophlingg We should be able to fix all that by looking into extratags->place and if it says city or state prefer that over the one we determine from the address_rank. Apparently, I had forgotten that when changing the importer. I can have a go at implementing it in theory but don't have the means to test it at the moment. If I make a pull request, can you take over and check if it fixes all the issues? |
Would be great if you could commit a pfix for that. You know the nominatim internals best. I don't have any testing data at the moment, but we will find ways to test it. I am still wondering why it worked previously. We did not consider extratags before so maybe it worked only by luck... |
Previously we used the internal address computation which already does take the place tag into account. |
I created a new snapshot version of photon that has lonvia's changes: http://photon.komoot.de/data/photon-0.2-SNAPSHOT.jar For a new evaluation round (and a release) I need the help of someone with a global nominatim database. @lonvia , @daryltucker or @FelixWittmann do you have time and resources for a json export?
|
Heck yeah! I'll get to testing this on Friday. Are you interested in the JSON export this time, too? |
This would be great! That's the missing piece to get this feature released. And I don't have a global nominatim database myself. Happy new year btw, here in Europe we are very close ;-) |
It has begun! I'll let you know once it's available to you guys. |
great! |
You guys should see |
can you let me know the complete path? cannot find it in your home directory |
the new data is indexed and available on port 2333 ich checked some place in Austria and Germany, unfortunately I found a wrong assignment in munich. e.g. http://photon.komoot.de:2333/api?q=pizza+call+dachauer And also example @amnesia7 reported is not fixed: http://photon.komoot.de:2333/api?q=trafalgar%20square%20london It seems like the changes we made are not working. Quite disappointing ;-( @daryltucker are you sure you were using the 0.2-SNAPSHOT version? |
one positive thing: i fixed a bug in the schema which apparently reduced the index size from 66 to 46 GB |
There was a copy'n'paste error in my pull request. In this line the binarySearch should be against 'place' not 'osmValue'. I hope it fixes the problem. |
@lonvia, it fixed the problem in deed! I created a bunch of test cases to double check it. All known cases are fixed and tested. I think we can dare another json export, @daryltucker are you still available? I uploaded the latest jar to http://photon.komoot.de/data/photon-0.2-SNAPSHOT.jar |
Running JSON Export
|
Compressing
|
great, let me know when it's on the server! |
The above compression did not end up with a proper bz2 file. Trying the following:
|
import is running, the first documents are popping up: http://photon.komoot.de:2333/api?q=paris this will take now 2-3 hours |
the data is great, I could not find a bug anymore. I'll shut down this instance and will prepare a release! 👯 |
I just released the new version, photon.komoot.de is now running the new version and exposes the state information! Without the help of @daryltucker and @lonvia this would not have been possible, great! |
@christophlingg I'm not sure if this is fallout of the state change or not but I don't think the sydney opera house is in antarctica: http://photon.komoot.de/api/?q=sydney%20opera%20house |
The Antarctica thing is a data problem, people keep claiming Antarctica for their country. Nominatim tends to turn facts around in this case and claim the country for Antarctica. There is a workaround for that in the code somewhere but I'm not sure if @daryltucker has a recent enough version. |
I would like to see more of OSM's 'address' information returned upon querying photon. I specifically need the
state
field.Here are two examples that would satisfy my needs:
The text was updated successfully, but these errors were encountered: