Allow searching on osm_key and osm_value #68

Closed
karussell opened this issue Jun 11, 2014 · 38 comments · Fixed by #141

@karussell
Collaborator

There should be another filter parameter in the API to search in a specific area (bounding box) for a specific tag combination or even a set of combinations ('I want food').

For that, osm_key and osm_value have to be indexed in the mapping. Maybe we should make this configurable if it increases the index size too much.

@sdole
Contributor

sdole commented Sep 22, 2014

I have the same request: I need the ability to filter on multiple osm_key and osm_value combinations (and also on the absence of certain keys or values).
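
To make the request concrete, here is a minimal sketch of how such a filter could be assembled as an Elasticsearch 1.x `bool` filter. The helper name `build_tag_filter` is hypothetical, and the field names `osm_key` and `osm_value` are taken from the issue title rather than from Photon's actual mapping. It also assumes both fields are indexed without analysis, since `term` filters only match exact tokens.

```python
def build_tag_filter(include=(), exclude=()):
    """Build an Elasticsearch 1.x bool filter from tag pairs.

    Each entry is a (key, value) tuple, or a (key,) tuple to match any value.
    A document passes if it matches at least one `include` entry (when given)
    and none of the `exclude` entries.
    Field names osm_key / osm_value are assumptions, not Photon's confirmed mapping.
    """
    def pair(key, value=None):
        must = [{"term": {"osm_key": key}}]
        if value is not None:
            must.append({"term": {"osm_value": value}})
        return {"bool": {"must": must}}

    if not include and not exclude:
        # Nothing to restrict: let every document through.
        return {"match_all": {}}

    bool_filter = {}
    if include:
        bool_filter["should"] = [pair(*entry) for entry in include]
    if exclude:
        bool_filter["must_not"] = [pair(*entry) for entry in exclude]
    return {"bool": bool_filter}


# 'I want food', but no fast food.
food_filter = build_tag_filter(
    include=[("amenity", "restaurant"), ("amenity", "cafe"), ("shop", "bakery")],
    exclude=[("amenity", "fast_food")],
)
```

The result is a plain dict, so it could be serialized to JSON and placed wherever the existing query template keeps its filter.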

@sdole
Contributor

sdole commented Sep 22, 2014

If someone can give me a hint about where this code lives, I can try forking and changing it.

@christophlingg
Member

You will want to extend the request handling to take the new query parameters into account. API request handling is done here: https://github.com/komoot/photon/blob/master/src/main/java/de/komoot/photon/importer/App.java#L118

Changing how ES is queried could be done here: https://github.com/komoot/photon/blob/master/src/main/java/de/komoot/photon/importer/elasticsearch/Searcher.java#L62

Let me know if you have further questions or problems!

Good luck!

@steelbrain

Bump!

@sdole
Contributor

sdole commented Jan 26, 2015

I am working on this issue. I added a few new PhotonDocs to the setup of the QueryStateTest. These new PhotonDocs have multiple arbitrary tags in the "extratags" constructor parameter. I expected these extratags to be in the result on line 52, but none showed up. What am I doing wrong? Or is the test index not set up to include the extra tags?

Please let me know if my question does not make sense.

Thanks

@sdole
Contributor

sdole commented Jan 26, 2015

I committed an example on my fork. See comments on that commit: sdole/photon@16b0a33

@christophlingg
Member

In extratags we save some additional data that is coming from nominatim's db column, here is an example place: http://nominatim.openstreetmap.org/details.php?place_id=167287581

We added it in the beginning but we have never used it so far.

We should be very careful about what information we store in the Elasticsearch index. We have more than 100,000,000 documents, so every little detail we add will have a big impact on the index size. The information in extratags is not important enough to be stored.

If you want to save this information for your needs, you need to adapt this util function: https://github.com/komoot/photon/blob/master/src/main/java/de/komoot/photon/Utils.java#L24 It defines how a photon doc is converted into an Elasticsearch doc.

Btw: you do not need extratags to allow filtering on osm_key and osm_value

@karussell
Collaborator Author

Maybe there is an option where one can enable this functionality?

@sdole
Contributor

sdole commented Jan 27, 2015

My mistake. The correct questions I should have asked are:

  1. Where are OSM tags stored in the ES data model?
  2. Do you have suggestions on how I should formulate my ES query for filtering on tags?

Thank you.

@christophlingg
Member

Every photon doc has a tag key and a tag value.

There is already an ES filter; you can extend it here: https://github.com/sdole/photon/blob/master/es/query.json#L70

Don't forget to index key and value in the mapping, otherwise you cannot query on them. See https://github.com/sdole/photon/blob/master/es/mappings.json#L19
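
As a rough illustration of the mapping change, the sketch below adds two exact-match string fields to a mapping dict. It uses the Elasticsearch 1.x convention of `"index": "not_analyzed"` so that `term` filters can match whole tag values. The helper name `add_tag_fields`, the document type `place`, and the field names `osm_key` and `osm_value` are assumptions for the example and may not match the actual layout of `mappings.json`.

```python
import copy


def add_tag_fields(mappings, doc_type="place"):
    """Return a copy of `mappings` with exact-match osm_key / osm_value fields.

    The doc type name and both field names are assumptions for illustration.
    """
    updated = copy.deepcopy(mappings)  # leave the caller's dict untouched
    properties = updated.setdefault(doc_type, {}).setdefault("properties", {})
    for field in ("osm_key", "osm_value"):
        # not_analyzed keeps the value as a single token (Elasticsearch 1.x syntax).
        properties[field] = {"type": "string", "index": "not_analyzed"}
    return updated


original = {"place": {"properties": {"housenumber": {"type": "string"}}}}
extended = add_tag_fields(original)
```

Because existing documents are not re-indexed automatically, a change like this would only take effect for data imported after the mapping is updated.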

@sdole
Contributor

sdole commented Jan 27, 2015

I see. So, the ES data model has the "Type" tag from Nominatim stored as "tagKey" and "tagValue"? Got it! Then I don't need to filter on extratags. However, I have a question: what if an element has more than one tag? What will be held in tagKey and tagValue?

I have another question: how do you feel about using the ES Java API instead of the JSON syntax?

Thanks

@christophlingg
Member

When one osm object has several types, nominatim creates multiple places. That means in photon we can have multiple docs with different tagKey/tagValue but with the same osm_id.
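
A toy model of that behaviour, purely for illustration: one OSM object with several type tags becomes several documents that share an `osm_id`. The function name `expand_to_docs` and the field names are hypothetical, and the sketch does not model how Nominatim decides which tags count as types.

```python
def expand_to_docs(osm_id, name, type_tags):
    """Create one document per (key, value) type tag of a single OSM object.

    Field names are assumptions; real Photon documents carry many more fields.
    """
    return [
        {"osm_id": osm_id, "name": name, "osm_key": key, "osm_value": value}
        for key, value in sorted(type_tags.items())
    ]


# A hotel that is also tagged as a restaurant yields two documents.
docs = expand_to_docs(42, "Example Inn", {"tourism": "hotel", "amenity": "restaurant"})
```

A consequence for tag filtering is that a filter on `amenity=restaurant` would match one of these documents and a filter on `tourism=hotel` the other, so clients that combine results may want to de-duplicate by `osm_id`.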

We opted to use JSON because it is easier to read and can be used by other implementations. We are using a small Python implementation for development, but you can also use it with this Chrome extension, for example (https://chrome.google.com/webstore/detail/sense-beta/lhjgkmllcaadmopgmanpapmpjgmfcfig?hl=en). By the way, this plugin is super helpful for creating a new filter!

@sdole
Contributor

sdole commented Jan 27, 2015

Understood. I will check out that plugin. Thanks.

@sdole
Contributor

sdole commented Jan 28, 2015

Christoph, I need more help. I am using the Sense Chrome extension from the link you sent. It shows me some data about the photon index, which makes me believe that I am pointing ES at the right index. I then debugged the photon code in my IDE, copied the JSON with the parameters replaced, and pasted it into Sense. When I run it, I get the following error. What am I doing wrong?

 "error": "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; 
shardFailures {[Q0gtkHdTSjehfECQPfNG5w][.marvel-2014.08.12][0]: SearchParseException[[
.marvel-2014.08.12][0]: from[-1],size[-1]: Parse Failure [Failed to parse source 

@yohanboniface
Collaborator

@sdole can you paste the full query and the full traceback?

(Seems that you are missing the index to query on, but just guessing.)
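
A small sketch of what that guess implies: a `_search` request with no index in the path runs against every index on the node, which would explain the `.marvel-*` shard names in the error. The helper below builds a request scoped to a single index and wraps the query under the top-level `query` key that the Elasticsearch search body expects. The function name `build_search_request` is hypothetical, the index name `photon` is an assumption, and whether the missing wrapper was also part of the problem here is a guess.

```python
import json


def build_search_request(index, inner_query, size=10):
    """Return (path, body) for a search restricted to one index.

    `inner_query` is a query clause such as {"match": {...}} or {"filtered": {...}};
    it is placed under the top-level "query" key of the search body.
    """
    path = "/{}/_search".format(index)
    body = {"query": inner_query, "size": size}
    return path, json.dumps(body)


# Index name "photon" is an assumption for this example.
path, body = build_search_request("photon", {"match": {"name": "berlin"}})
```

The returned path and body can be pasted into a REST client as `GET /photon/_search` followed by the JSON body.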

@sdole
Contributor

sdole commented Jan 28, 2015

Here is the query followed by the result

GET _search
{
  "filtered": {
    "query": {
      "function_score": {
        "boost_mode": "multiply",
        "query": {
          "bool": {
            "must": {
              "bool": {
                "should": [
                  {
                    "match": {
                      "collector.default": {
                        "fuzziness": 1,
                        "query": "berlin",
                        "minimum_should_match": "100%",
                        "analyzer": "search_ngram",
                        "prefix_length": 2
                      }
                    }
                  },
                  {
                    "match": {
                      "collector.en": {
                        "fuzziness": 1,
                        "query": "berlin",
                        "minimum_should_match": "100%",
                        "analyzer": "search_ngram",
                        "prefix_length": 2
                      }
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            },
            "should": [
              {
                "match": {
                  "name.en.raw": {
                    "query": "berlin",
                    "boost": 200,
                    "analyzer": "search_raw"
                  }
                }
              },
              {
                "match": {
                  "collector.en.raw": {
                    "query": "berlin",
                    "boost": 100,
                    "analyzer": "search_raw"
                  }
                }
              }
            ]
          }
        },
        "score_mode": "multiply",
        "functions": [
          {
            "script_score": {
              "script": "general-score",
              "lang": "mvel"
            }
          }
        ]
      }
    },
    "filter": {
      "or": {
        "filters": [
          {
            "missing": {
              "field": "housenumber"
            }
          },
          {
            "query": {
              "match": {
                "housenumber": {
                  "query": "berlin",
                  "analyzer": "standard"
                }
              }
            }
          },
          {
            "exists": {
              "field": "name.en.raw"
            }
          }
        ]
      }
    }
  }
}

and result

{
   "error": "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[Q0gtkHdTSjehfECQPfNG5w][.marvel-2014.08.12][0]: SearchParseException[[.marvel-2014.08.12][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n          
      }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[.marvel-2014.08.12][0]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][.marvel-2014.08.13][0]: SearchParseException[[.marvel-2014.08.13][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        
\"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[.marvel-2014.08.13][0]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][.marvel-2015.01.22][0]: 
SearchParseException[[.marvel-2015.01.22][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            
\"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[.marvel-2015.01.22][0]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][.marvel-2015.01.26][0]: SearchParseException[[.marvel-2015.01.26][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n               
         \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[.marvel-2015.01.26][0]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][.marvel-2015.01.28][0]: SearchParseException[[.marvel-2015.01.28][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        
\"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      
\"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[.marvel-2015.01.28][0]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][photon][0]: SearchParseException[[photon][0]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n        
    \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[photon][0]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][photon][1]: SearchParseException[[photon][1]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      
\"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                
\"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[photon][1]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][photon][2]: SearchParseException[[photon][2]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n       
           }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[photon][2]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][photon][3]: SearchParseException[[photon][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n               
         \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n   
         }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[photon][3]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }{[Q0gtkHdTSjehfECQPfNG5w][photon][4]: SearchParseException[[photon][4]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\n  \"filtered\": {\n    \"query\": {\n      \"function_score\": {\n        \"boost_mode\": \"multiply\",\n        \"query\": {\n          \"bool\": {\n            \"must\": {\n              \"bool\": {\n                \"should\": [\n                  {\n                    \"match\": {\n                      \"collector.default\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  },\n                  {\n                    \"match\": {\n                      \"collector.en\": {\n                        \"fuzziness\": 1,\n                        \"query\": \"berlin\",\n                        \"minimum_should_match\": \"100%\",\n                        \"analyzer\": \"search_ngram\",\n                        \"prefix_length\": 2\n                      }\n                    }\n                  }\n                ],\n                \"minimum_should_match\": 1\n              }\n            },\n            \"should\": [\n              {\n                \"match\": {\n                  \"name.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 200,\n                    \"analyzer\": \"search_raw\"\n                  }\n                }\n              },\n              {\n                \"match\": {\n                  \"collector.en.raw\": {\n                    \"query\": \"berlin\",\n                    \"boost\": 100,\n                    \"analyzer\": 
\"search_raw\"\n                  }\n                }\n              }\n            ]\n          }\n        },\n        \"score_mode\": \"multiply\",\n        \"functions\": [\n          {\n            \"script_score\": {\n              \"script\": \"general-score\",\n              \"lang\": \"mvel\"\n            }\n          }\n        ]\n      }\n    },\n    \"filter\": {\n      \"or\": {\n        \"filters\": [\n          {\n            \"missing\": {\n              \"field\": \"housenumber\"\n            }\n          },\n          {\n            \"query\": {\n              \"match\": {\n                \"housenumber\": {\n                  \"query\": \"berlin\",\n                  \"analyzer\": \"standard\"\n                }\n              }\n            }\n          },\n          {\n            \"exists\": {\n              \"field\": \"name.en.raw\"\n            }\n          }\n        ]\n      }\n    }\n  }\n}\n]]]; nested: SearchParseException[[photon][4]: from[-1],size[-1]: Parse Failure [No parser for element [filtered]]]; }]",
   "status": 400
}

@yohanboniface
Collaborator

I meant through a pastebin ;)

Try GET photon/place/_search instead of GET _search only.

@sdole
Contributor

sdole commented Jan 28, 2015

Sorry! I pasted my new query, now using photon/place/_search, but it still does not work. The error is also on pastebin.

@yohanboniface
Collaborator

You need a top-level query key first: {"query": {"filtered"… (do not forget to close it at the end of the JSON, of course).
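A minimal sketch of that wrapping in Python, for illustration only. `wrap_search_body` is a hypothetical helper, and the inner clause is an abbreviated stand-in for the full `filtered` query from the error output, not Photon's actual query.

```python
import json


def wrap_search_body(query_clause):
    """Wrap a bare query clause in the top-level "query" key that _search expects."""
    return {"query": query_clause}


# Abbreviated stand-in for the "filtered" clause shown in the error output.
filtered = {
    "filtered": {
        "query": {"match": {"collector.default": {"query": "berlin"}}},
        "filter": {"exists": {"field": "name.en.raw"}},
    }
}

body = wrap_search_body(filtered)
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent as the request body to `photon/place/_search`.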

@sdole
Contributor

sdole commented Jan 28, 2015

Ah! So what should the query JSON object look like if I am searching for the word "berlin"? I can look at the documentation, but I want to mimic what happens in photon as closely as possible. Thank you for this!

@sdole
Contributor

sdole commented Jan 28, 2015

I think I found it. Can you please take a look at my latest query and tell me if it looks right? It does seem to match what I can get from photon for the same query.

@sdole
Contributor

sdole commented Jan 28, 2015

I have not been able to make much progress on this one yet. I need to learn the query DSL. I will report back when I get this done. If you have suggestions or ideas for me to pursue, please let me know. Thanks for the support so far.

@sdole
Contributor

sdole commented Feb 5, 2015

It looks like the osm_key field is not indexed, according to the mapping. If that is the case, how do we filter on the key? I am able to construct a filter for osm_value, but not for osm_key, and I suspect that is the reason. Do you concur?

               "osm_key": {
                  "type": "string",
                  "index": "no"
               },
               "osm_value": {
                  "type": "string"
               },

@christophlingg
Member

yes, we need to index them first. that's what i meant here
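A rough sketch of the mapping change under discussion, written in Python so it can be checked without a cluster. It assumes the Elasticsearch 1.x string mapping, where `index` accepts `analyzed`, `not_analyzed`, or `no`; `make_exact_match_fields` is a hypothetical helper, and choosing `not_analyzed` for both fields is an assumption, not necessarily what Photon ended up with.

```python
import copy


def make_exact_match_fields(properties, fields=("osm_key", "osm_value")):
    """Return a copy of the mapping properties with the given fields indexed as exact terms."""
    updated = copy.deepcopy(properties)
    for name in fields:
        field = updated.setdefault(name, {"type": "string"})
        # "not_analyzed" stores the value as a single term, so term filters match it exactly.
        field["index"] = "not_analyzed"
    return updated


# The two fields as they appear in the mapping quoted above.
properties = {
    "osm_key": {"type": "string", "index": "no"},
    "osm_value": {"type": "string"},
}

new_properties = make_exact_match_fields(properties)
print(new_properties)
```

A mapping change like this only affects documents indexed afterwards, so existing data would have to be re-imported.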

@sdole
Contributor

sdole commented Feb 5, 2015

Got it. Now, as I understand it, this indexing can only be done at import time. Correct? Is it possible for you to do the import/indexing and release a new data file? I don't have Nominatim in my environment. Otherwise, I could test my changes with test data and commit without testing against Nominatim data.

Please suggest/advise.

@christophlingg
Member

I can send you some sample data if you want, but it is probably even easier to develop that feature with test cases. Here is an example:

https://github.com/komoot/photon/blob/master/src/test/java/de/komoot/photon/elasticsearch/RegionalNameTest.java

This way, you can create some sample documents and write tests for the filters.

@sdole
Contributor

sdole commented Feb 5, 2015

Awesome. I will check that out and go with test data. Thanks.

@sdole sdole mentioned this issue Feb 6, 2015
@sdole
Contributor

sdole commented Feb 6, 2015

I created a pull request. This is just my proposal, of course; please let me know if it will not work for any reason, or if you want me to improve it further before it can be used. Thanks!

@christophlingg christophlingg added this to the 0.2.1 milestone Feb 7, 2015
@sdole
Contributor

sdole commented Feb 11, 2015

@christophlingg As soon as I started coding for using this new feature I realized that I am going to need multiple osm_key and osm_value instead of just one as I have coded. So, I might even need to get something like

  • all places that have key tourism AND all places that have key boundary AND value administrative
  • all places that have key tourism OR all places that have key boundary AND value administrative.
  • all places that have keys tourism, boundary, amenity and values attraction, administrative, coffee

I will have to think about this a little more and possibly do some more coding. I will first communicate my proposal to you and then code for it. I will also try to do the Java API coding when I do this. I assume that you will see the value in this, not just for my use case, but possibly for others who might use this.

I will try not to make this too complex for users of photon.
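To make the combinations listed above concrete, here is a sketch that builds Elasticsearch 1.x style `bool` and `term` filters as plain Python dictionaries. The helper names `tag_filter`, `any_of`, and `all_of` are hypothetical, and the sketch assumes that `osm_key` and `osm_value` are indexed as exact terms.

```python
def tag_filter(key=None, value=None):
    """Build a filter that matches a tag by key, by value, or by both."""
    if key is None and value is None:
        raise ValueError("need a key, a value, or both")
    clauses = []
    if key is not None:
        clauses.append({"term": {"osm_key": key}})
    if value is not None:
        clauses.append({"term": {"osm_value": value}})
    if len(clauses) == 1:
        return clauses[0]
    # Key and value must both match on the same document.
    return {"bool": {"must": clauses}}


def any_of(*filters):
    """Match documents that satisfy at least one of the filters (OR)."""
    return {"bool": {"should": list(filters)}}


def all_of(*filters):
    """Match documents that satisfy every filter (AND)."""
    return {"bool": {"must": list(filters)}}


# Key tourism OR (key boundary AND value administrative).
tourism_or_admin = any_of(
    tag_filter(key="tourism"),
    tag_filter(key="boundary", value="administrative"),
)
print(tourism_or_admin)
```

The resulting dictionary would go into the `filter` part of a `filtered` query.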

@christophlingg
Member

I understand, in some more advanced scenarios you might want to filter on multiple tags. Thinking even more generically, you might want to exclude some results, like every kind of object except business stations and letter boxes...

that's just a quick idea I had:

?osm_tag=key:value&osm_tag=key&osm_tag=!key:value

Anyway, it should be well thought out.
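One way the three forms in that idea (`key:value`, `key`, and `!key:value`) could be parsed, as a sketch only. `TagFilter` and `parse_osm_tag` are hypothetical names, the example tags are illustrative, and accepting a negated bare key (`!key`) is an assumption that goes beyond the forms listed.

```python
from typing import NamedTuple, Optional


class TagFilter(NamedTuple):
    key: str
    value: Optional[str]  # None means "any value for this key"
    exclude: bool         # True when the parameter started with "!"


def parse_osm_tag(raw: str) -> TagFilter:
    """Parse one osm_tag parameter value into its key, value, and exclude flag."""
    exclude = raw.startswith("!")
    body = raw[1:] if exclude else raw
    key, _, value = body.partition(":")
    if not key:
        raise ValueError(f"osm_tag needs a key: {raw!r}")
    return TagFilter(key=key, value=value or None, exclude=exclude)


for raw in ["boundary:administrative", "tourism", "!amenity:post_box"]:
    print(parse_osm_tag(raw))
```

Each repeated `osm_tag` parameter would be parsed separately, with the included filters and the excluded filters collected into two lists.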

@sdole
Contributor

sdole commented Feb 12, 2015

@christophlingg your idea looks great. I am starting to implement it.

@sdole
Contributor

sdole commented Feb 23, 2015

Just a heads up, I am almost done with a fix for a new filter. I just want to test it with the latest data file. I will be opening a new pull request sometime soon. If you want to get a preview, please see the fork: https://github.com/sdole/photon

@christophlingg
Member

Good to know!

@sdole
Contributor

sdole commented Feb 28, 2015

Done. Please see pull request #146

@christophlingg
Member

cool @sdole, your feature is up and running. works like a charm:

http://photon.komoot.de/api?q=innsbruck&osm_tag=aeroway:hangar
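A small sketch of how such a URL could be assembled in Python. `photon_url` is a hypothetical helper, and passing several `osm_tag` parameters at once is an assumption based on the syntax proposed earlier in this thread. The sketch only builds the string and sends no request.

```python
from urllib.parse import urlencode


def photon_url(query, osm_tags, base="http://photon.komoot.de/api"):
    """Build a search URL with one osm_tag parameter per tag filter."""
    params = [("q", query)] + [("osm_tag", tag) for tag in osm_tags]
    # Keep ":" and "!" unescaped so the tag filters stay readable in the URL.
    return base + "?" + urlencode(params, safe=":!")


url = photon_url("innsbruck", ["aeroway:hangar"])
print(url)
```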

@sdole
Contributor

sdole commented Mar 6, 2015

great to hear. I will watch for bug reports if anyone files any.


@amnesia7
Contributor

@sdole I'm not sure if you'd know, but I was wondering if you had come across a list of tags, values, or both that could be used to decide whether a result is a venue (e.g. anything from a theatre to a football stadium).
I'm not sure how much this can be relied on with OSM data, because some things are only distinguishable by individual osm values rather than the overarching key, so the list could be quite long.

@sdole
Contributor

sdole commented Mar 20, 2016

I am sorry, I don't. I would have approached it the same way you seem to, i.e. by making a list of tags/values that sound like a venue.
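As a sketch of that list-based approach: the tag pairs below are a tiny hand-picked illustration, not a vetted or complete list, and `is_venue` is a hypothetical helper.

```python
# Hand-picked example tags only; a real list would be much longer.
VENUE_TAGS = {
    ("amenity", "theatre"),
    ("amenity", "cinema"),
    ("leisure", "stadium"),
}


def is_venue(osm_key, osm_value, venue_tags=VENUE_TAGS):
    """Return True when the (key, value) pair is on the venue list."""
    return (osm_key, osm_value) in venue_tags


print(is_venue("leisure", "stadium"))
print(is_venue("amenity", "post_box"))
```

Matching on the full key and value pair, rather than on the key alone, reflects the point above that some venues are only distinguishable by their values.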
