
[Rest High Level Client]: Parse error when document ID starts with '/' #34433

@roycarser

Describe the feature:

Elasticsearch version: 6.4.0

Plugins installed: []

JVM version: 1.8

OS version: Windows 10

Rest High Level Client version: 6.4.2

Description of the problem including expected versus actual behavior:
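The REST high-level client mishandles document IDs that start with '/'. Depending on the ID, it either requests a different ID than the one given or throws an exception before sending the request.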

Steps to reproduce:

    RestHighLevelClient client = new RestHighLevelClient(
            RestClient.builder(new HttpHost("localhost", 9200, "http")));

    /* the expected id is /abc/error_id */
    GetRequest request = new GetRequest("index", "type", "/abc/error_id");
    GetResponse rsp = client.get(request, RequestOptions.DEFAULT);
    /* the actual id is error_id */
    System.out.println(rsp.getId());

    RestHighLevelClient client = new RestHighLevelClient(
            RestClient.builder(new HttpHost("localhost", 9200, "http")));

    /* the expected id is /id */
    GetRequest request = new GetRequest("index", "type", "/id");
    GetResponse rsp = client.get(request, RequestOptions.DEFAULT);

This case throws an exception:

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
	at java.lang.String.substring(String.java:1931)
	at org.elasticsearch.client.RequestConverters$EndpointBuilder.encodePart(RequestConverters.java:1532)
	at org.elasticsearch.client.RequestConverters$EndpointBuilder.addPathPart(RequestConverters.java:1503)
	at org.elasticsearch.client.RequestConverters.endpoint(RequestConverters.java:1150)
	at org.elasticsearch.client.RequestConverters.getStyleRequest(RequestConverters.java:463)
	at org.elasticsearch.client.RequestConverters.get(RequestConverters.java:459)
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1252)
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1231)
	at org.elasticsearch.client.RestHighLevelClient.get(RestHighLevelClient.java:416)
	at EsTest.main(EsTest.java:20)

    RestHighLevelClient client = new RestHighLevelClient(
            RestClient.builder(new HttpHost("localhost", 9200, "http")));

    /* the expected id is ///error_id */
    GetRequest request = new GetRequest("index", "type", "///error_id");
    GetResponse rsp = client.get(request, RequestOptions.DEFAULT);
    /* the actual id is /error_id */
    System.out.println(rsp.getId());

    RestHighLevelClient client = new RestHighLevelClient(
            RestClient.builder(new HttpHost("localhost", 9200, "http")));

    /* the expected id is /-//error_id */
    GetRequest request = new GetRequest("index", "type", "/-//error_id");
    GetResponse rsp = client.get(request, RequestOptions.DEFAULT);

This case throws an exception:

java.net.URISyntaxException: Illegal character in hostname at index 2: //-//error_id
	at java.net.URI$Parser.fail(URI.java:2848)
	at java.net.URI$Parser.parseHostname(URI.java:3387)
	at java.net.URI$Parser.parseServer(URI.java:3236)
	at java.net.URI$Parser.parseAuthority(URI.java:3155)
	at java.net.URI$Parser.parseHierarchical(URI.java:3097)
	at java.net.URI$Parser.parse(URI.java:3063)
	at java.net.URI.<init>(URI.java:673)
	at org.elasticsearch.client.RequestConverters$EndpointBuilder.encodePart(RequestConverters.java:1530)

Reason:
The problem is in the function encodePart (RequestConverters.java):

    private static String encodePart(String pathPart) {
        try {
            //encode each part (e.g. index, type and id) separately before merging them into the path
            //we prepend "/" to the path part to make this path absolute, otherwise there can be issues with
            //paths that start with `-` or contain `:`
            URI uri = new URI(null, null, null, -1, "/" + pathPart, null, null);  // <---- bug is on this line
            //manually encode any slash that each part may contain
            return uri.getRawPath().substring(1).replaceAll("/", "%2F");
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Path part [" + pathPart + "] couldn't be encoded", e);
        }
    }

A URI has the form [scheme:][//authority][path][?query][#fragment].

When pathPart starts with '/', for example "/abc", encodePart calls new URI(null, null, null, -1, "//abc", null, null). A leading "//" marks the start of the authority component, so "abc" is parsed as the authority and the path is empty. uri.getRawPath() then returns an empty string, and substring(1) throws the StringIndexOutOfBoundsException shown in the first stack trace.

When pathPart is something like "/abc/def", the call becomes new URI(null, null, null, -1, "//abc/def", null, null). Here "abc" is parsed as the authority and "/def" as the path, so uri.getRawPath() returns "/def" and the encoded part is just "def". With "/-//error_id", the authority would be "-", which is not a valid hostname, hence the URISyntaxException.
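
The parsing behavior described above can be checked in isolation with a small sketch. The class and helper names (UriAuthorityDemo, build, rawAuthority, rawPath) are made up for illustration and are not part of the client; only the java.net.URI constructor call is taken from encodePart.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriAuthorityDemo {

    // Builds the URI the same way encodePart does: "/" is prepended to the path part.
    static URI build(String pathPart) {
        try {
            return new URI(null, null, null, -1, "/" + pathPart, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Could not parse [" + pathPart + "]", e);
        }
    }

    // Authority that java.net.URI extracted, or null if there is none.
    static String rawAuthority(String pathPart) {
        return build(pathPart).getRawAuthority();
    }

    // Path that java.net.URI extracted; this is what encodePart calls substring(1) on.
    static String rawPath(String pathPart) {
        return build(pathPart).getRawPath();
    }

    public static void main(String[] args) {
        // Hypothetical ids: one without a leading slash and two with one.
        for (String id : new String[] {"abc", "/abc", "/abc/def"}) {
            System.out.println("id=" + id
                    + " authority=" + rawAuthority(id)
                    + " rawPath=[" + rawPath(id) + "]");
        }
    }
}
```

For an id with a leading slash, the first segment ends up in the authority rather than in the path, which is why it disappears from the encoded endpoint.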

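One solution is to prepend "///" instead of "/" when pathPart starts with '/'. The resulting string begins with an empty authority, so the whole of pathPart stays in the path: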

new URI(null, null, null, -1, pathPart.startsWith("/") ? "///"+pathPart : "/" + pathPart, null, null)
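
A minimal sketch of encodePart with this change applied, to show what it would produce for the ids from the reproduction steps. The class name EncodePartSketch and the method name encodePartFixed are hypothetical; this illustrates the proposal in this issue and is not necessarily the fix the project would adopt.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EncodePartSketch {

    // Hypothetical variant of encodePart using the prefix change proposed in this issue.
    static String encodePartFixed(String pathPart) {
        try {
            // "///" gives the URI an empty authority, so a leading "/" in pathPart
            // is kept as part of the path instead of starting an authority.
            String path = pathPart.startsWith("/") ? "///" + pathPart : "/" + pathPart;
            URI uri = new URI(null, null, null, -1, path, null, null);
            // Drop the one extra leading "/", then manually encode the remaining slashes.
            return uri.getRawPath().substring(1).replaceAll("/", "%2F");
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Path part [" + pathPart + "] couldn't be encoded", e);
        }
    }

    public static void main(String[] args) {
        // The ids used in the reproduction steps, plus one without a leading slash.
        String[] ids = {"abc", "/id", "/abc/error_id", "///error_id", "/-//error_id"};
        for (String id : ids) {
            System.out.println(id + " -> " + encodePartFixed(id));
        }
    }
}
```

Each leading slash is preserved and percent-encoded as %2F, and ids without a leading slash go through the same path as before.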

If that sounds OK, I'm glad to provide a PR.
