Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] java.lang.RuntimeException: Failed to find latest snapshot id #3678

Open
1 of 2 tasks
skdfeitian opened this issue Jul 5, 2024 · 1 comment
Open
1 of 2 tasks
Labels
bug Something isn't working

Comments

@skdfeitian
Copy link

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

0.8

Compute Engine

flink 1.17.1

Minimal reproduce step

java.lang.RuntimeException: Failed to find latest snapshot id
at org.apache.paimon.utils.SnapshotManager.latestSnapshotId(SnapshotManager.java:143)
at org.apache.paimon.utils.SnapshotManager.latestSnapshot(SnapshotManager.java:131)
at org.apache.paimon.operation.FileStoreCommitImpl.tryCommit(FileStoreCommitImpl.java:623)
at org.apache.paimon.operation.FileStoreCommitImpl.commit(FileStoreCommitImpl.java:294)
at org.apache.paimon.table.sink.TableCommitImpl.commitMultiple(TableCommitImpl.java:192)
at org.apache.paimon.flink.sink.StoreCommitter.commit(StoreCommitter.java:100)
at org.apache.paimon.flink.sink.CommitterOperator.commitUpToCheckpoint(CommitterOperator.java:159)
at org.apache.paimon.flink.sink.CommitterOperator.notifyCheckpointComplete(CommitterOperator.java:152)
at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:104)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.notifyCheckpointComplete(RegularOperatorChain.java:145)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpoint(SubtaskCheckpointCoordinatorImpl.java:467)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:400)
at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:1430)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$16(StreamTask.java:1371)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$19(StreamTask.java:1410)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:494)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1737)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1753)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1750)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1765)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1760)
at org.apache.paimon.fs.hadoop.HadoopFileIO.exists(HadoopFileIO.java:110)
at org.apache.paimon.utils.SnapshotManager.findLatest(SnapshotManager.java:391)
at org.apache.paimon.utils.SnapshotManager.latestSnapshotId(SnapshotManager.java:141)
... 27 more

What doesn't meet your expectations?

Consuming Kafka data and writing it to HDFS files via Flink SQL. Occasionally, for some high-traffic topics, this error occurs. Please provide guidance on how to troubleshoot and resolve this issue.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@skdfeitian skdfeitian added the bug Something isn't working label Jul 5, 2024
@xuzifu666
Copy link
Contributor

xuzifu666 commented Jul 8, 2024

@skdfeitian From the error stack,this issue due to FileSystem cache expired. Maybe you can set fs.hdfs.impl.disable.cache=true in client or hdfs conf to reslove the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants