recovery issues when using multiple entries in path.data #18217

Closed
djschny opened this issue May 9, 2016 · 2 comments
Labels: discuss, :Distributed Indexing/Recovery (anything around constructing a new shard, either from a local or a remote source)

Comments

djschny commented May 9, 2016

Elasticsearch version: 2.3.2

JVM version: 1.8.0_74

OS version: OSX 10.11.4

Description of the problem including expected versus actual behavior:
When a node uses multiple entries for path.data and one drive is removed (simulated here by deleting a data directory), the node does not mark the shards that were on that drive as unavailable, which is what would allow reassignment to happen. I would expect to see some initial errors thrown, but after a short time the shards that were assigned to the missing or timed-out filesystem should be allocated elsewhere in the cluster (or on the same node but a different data path).

Steps to reproduce:

  1. Create four empty directories
  2. Start up two nodes of ES, each one pointing to two different data directories. For example:
       bin/elasticsearch --node.name=node1 --path.data=/data/node1-drive1,/data/node1-drive2
       bin/elasticsearch --node.name=node2 --path.data=/data/node2-drive1,/data/node2-drive2
  3. Create a new index (using the default of 5 shards and 1 replica) and index some data
  4. Remove one of the directories completely
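
The filesystem side of the steps above can be sketched in Python. This is only an illustration and not part of Elasticsearch: the helper names (`create_data_dirs`, `startup_command`, `missing_data_paths`) are hypothetical, the directories are created under a temporary root rather than /data, and the startup commands are built but not executed.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical layout mirroring the issue: two nodes, two data paths each.
LAYOUT = {
    "node1": ["node1-drive1", "node1-drive2"],
    "node2": ["node2-drive1", "node2-drive2"],
}


def create_data_dirs(root):
    """Create the four empty data directories and return them per node."""
    paths = {}
    for node, drives in LAYOUT.items():
        paths[node] = [Path(root) / drive for drive in drives]
        for path in paths[node]:
            path.mkdir(parents=True, exist_ok=True)
    return paths


def startup_command(node, data_paths):
    """Build the command line quoted in the issue (it is not run here)."""
    joined = ",".join(str(path) for path in data_paths)
    return ["bin/elasticsearch", f"--node.name={node}", f"--path.data={joined}"]


def missing_data_paths(data_paths):
    """Return the configured data paths that no longer exist on disk."""
    return [path for path in data_paths if not Path(path).is_dir()]


if __name__ == "__main__":
    root = tempfile.mkdtemp(prefix="es-repro-")
    paths = create_data_dirs(root)
    for node, dirs in paths.items():
        print(" ".join(startup_command(node, dirs)))
    # Simulate pulling a drive: remove one data directory completely.
    shutil.rmtree(paths["node1"][0])
    print("missing:", missing_data_paths(paths["node1"]))
    shutil.rmtree(root)
```

A check like `missing_data_paths` only shows which configured paths have disappeared; whether and how the node should react to that is the question this issue raises.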

Provide logs (if relevant):

{
  "_index": "index1",
  "_type": "docs",
  "_id": "AVSWUC2X9feFc7L66p0Q",
  "_version": 1,
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 1,
    "failures": [
      {
        "_index": "index1",
        "_shard": 4,
        "_node": "X4nxoaPZTMqX-RCWBM79Ew",
        "reason": {
          "type": "create_failed_engine_exception",
          "reason": "Create failed for [docs#AVSWUC2X9feFc7L66p0Q]",
          "shard": "4",
          "index": "index1",
          "caused_by": {
            "type": "no_such_file_exception",
            "reason": "/data/node1-drive1/elasticsearch/nodes/0/indices/index1/4/index/write.lock"
          }
        },
        "status": "INTERNAL_SERVER_ERROR",
        "primary": false
      }
    ]
  },
  "created": true
}
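
The response above reports "created": true even though one of the two shard copies failed, so a client only notices the problem by inspecting _shards.failures. A minimal sketch of that inspection, assuming the response shape shown above; the function name `failed_shard_copies` and the trimmed `SAMPLE` body are illustrative and not part of any Elasticsearch client:

```python
import json

# Trimmed copy of the index response quoted in this issue.
SAMPLE = """
{
  "_index": "index1",
  "_shards": {
    "total": 2, "successful": 1, "failed": 1,
    "failures": [
      {
        "_index": "index1", "_shard": 4, "_node": "X4nxoaPZTMqX-RCWBM79Ew",
        "reason": {
          "type": "create_failed_engine_exception",
          "caused_by": {"type": "no_such_file_exception"}
        },
        "primary": false
      }
    ]
  },
  "created": true
}
"""


def failed_shard_copies(response):
    """Yield (index, shard, node, failure type, root-cause type) per failed copy."""
    for failure in response.get("_shards", {}).get("failures", []):
        reason = failure.get("reason", {})
        # Follow the caused_by chain down to the innermost exception.
        cause = reason
        while "caused_by" in cause:
            cause = cause["caused_by"]
        yield (
            failure.get("_index"),
            failure.get("_shard"),
            failure.get("_node"),
            reason.get("type"),
            cause.get("type"),
        )


if __name__ == "__main__":
    for row in failed_shard_copies(json.loads(SAMPLE)):
        print(row)
```

For the sample, the root cause resolves to no_such_file_exception on shard 4, which matches the removed data directory.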
[2016-05-09 11:40:24,812][DEBUG][action.admin.indices.stats] [node1] [indices:monitor/stats] failed to execute operation for shard [[.kibana][0], node[X4nxoaPZTMqX-RCWBM79Ew], [P], v[4], s[STARTED], a[id=hTbB0MAbTXCZuOSwh_fArA]]
ElasticsearchException[failed to refresh store stats]; nested: NoSuchFileException[/data/node1-drive1/elasticsearch/nodes/0/indices/.kibana/0/index];
    at org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1532)
    at org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1517)
    at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
    at org.elasticsearch.index.store.Store.stats(Store.java:293)
    at org.elasticsearch.index.shard.IndexShard.storeStats(IndexShard.java:702)
    at org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:134)
    at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:165)
    at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:47)
    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.onShardOperation(TransportBroadcastByNodeAction.java:420)
    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:399)
    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:386)
    at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
    at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.file.NoSuchFileException: /data/node1-drive1/elasticsearch/nodes/0/indices/.kibana/0/index
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:407)
    at java.nio.file.Files.newDirectoryStream(Files.java:457)
    at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:191)
    at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:203)
    at org.elasticsearch.index.store.FsDirectoryService$1.listAll(FsDirectoryService.java:127)
    at org.apache.lucene.store.FilterDirectory.listAll(FilterDirectory.java:57)
    at org.apache.lucene.store.FilterDirectory.listAll(FilterDirectory.java:57)
    at org.elasticsearch.index.store.Store$StoreStatsCache.estimateSize(Store.java:1538)
    at org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1530)
    ... 17 more

[2016-05-09 12:22:03,174][WARN ][cluster.action.shard     ] [node1] [index1][1] received shard failed for target shard [[index1][1], node[X4nxoaPZTMqX-RCWBM79Ew], [R], v[17860], s[INITIALIZING], a[id=Og_FzyceRPK2Dt-U0w8SJw], unassigned_info[[reason=ALLOCATION_FAILED], at[2016-05-09T16:22:03.137Z], details[failed to create shard, failure ElasticsearchException[failed to create shard]; nested: AccessControlException[access denied ("java.io.FilePermission" "/data" "read")]; ]]], indexUUID [OlAoAb1cRHydJ-P8CNZFHw], message [failed to create shard], failure [ElasticsearchException[failed to create shard]; nested: AccessControlException[access denied ("java.io.FilePermission" "/data" "read")]; ]
[index1][[index1][1]] ElasticsearchException[failed to create shard]; nested: AccessControlException[access denied ("java.io.FilePermission" "/data" "read")];
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:371)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:601)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:501)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:166)
    at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:610)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:772)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.security.AccessControlException: access denied ("java.io.FilePermission" "/data" "read")
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
    at java.security.AccessController.checkPermission(AccessController.java:884)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
    at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
    at sun.nio.fs.UnixPath.checkRead(UnixPath.java:795)
    at sun.nio.fs.UnixFileSystemProvider.checkAccess(UnixFileSystemProvider.java:290)
    at java.nio.file.Files.createDirectories(Files.java:746)
    at org.elasticsearch.index.store.FsDirectoryService.newDirectory(FsDirectoryService.java:85)
    at org.elasticsearch.index.store.Store.<init>(Store.java:123)
    at org.elasticsearch.index.store.Store.<init>(Store.java:118)
    at sun.reflect.GeneratedConstructorAccessor9.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:50)
    at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86)
    at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:104)
    at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:47)
    at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:887)
    at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:43)
    at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:59)
    at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:46)
    at org.elasticsearch.common.inject.SingleParameterInjector.inject(SingleParameterInjector.java:42)
    at org.elasticsearch.common.inject.SingleParameterInjector.getAll(SingleParameterInjector.java:66)
    at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:85)
    at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:104)
    at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:47)
    at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:887)
    at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:43)
    at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:59)
    at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:46)
    at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:201)
    at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:193)
    at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:880)
    at org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:193)
    at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:175)
    at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:110)
    at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:162)
    at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:55)
    at org.elasticsearch.index.IndexService.createShard(IndexService.java:369)
    ... 10 more

clintongormley added the discuss and :Distributed Indexing/Recovery labels May 9, 2016

clintongormley commented

Until we get rid of guice, I'm not sure there is anything we can do to fix this.

clintongormley commented

Closing in favour of #18279
