Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix shadow topics cannot be consumed when the entry is not cached #33

Closed
wants to merge 4 commits into from

Conversation

BewareMyPower
Copy link
Owner

@BewareMyPower BewareMyPower commented Aug 9, 2024

Motivation

For shadow topics, a ReadOnlyLedgerHandle is created to read messages from the source topic when the entry is not cached. However, it leverages the readAsync API that validates the lastAddConfirmed field (LAC). In ReadOnlyLedgerHandle, this field could never be updated, so readAsync could fail immediately. See LedgerHandle#readAsync:

if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}

This bug is not exposed because:

  1. PulsarMockReadHandle does not maintain a LAC field.
  2. The case for cache miss is never tested.

Modifications

Replace readAsync with readUnconfirmedAsync and compare the entry range with the ManagedLedger#getLastConfirmedEntry. The managed ledger already maintains a lastConfirmedEntry to limit the last entry. See ManagedLedgerImpl#internalReadFromLedger:

Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();

Add ShadowTopicRealBkTest to cover two code changes RangeEntryCacheImpl#readFromStorage and EntryCache#asyncReadEntry.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because ReadOnlyManagedLedgerImpl could read a ledger in another managed ledger.

…not cached

### Motivation

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages
from the source topic when the entry is not cached. However, it
leverages the `readAsync` API that validates the `lastAddConfirmed`
field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated,
so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

### Modifications

Replace `readAsync` with `readUnconfirmedAsync`. The managed ledger
already maintains a `lastConfirmedEntry` to limit the last entry. See
`ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes
`RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.
@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-shadow-topic-read branch from d89fbf2 to 0d143a9 Compare August 12, 2024 10:53
hangc0276 pushed a commit to apache/pulsar that referenced this pull request Aug 15, 2024
…not cached (#23147)

### Motivation

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages from the source topic when the entry is not cached. However, it leverages the `readAsync` API that validates the `lastAddConfirmed` field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated, so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

### Modifications

Replace `readAsync` with `readUnconfirmedAsync` and compare the entry range with the `ManagedLedger#getLastConfirmedEntry`. The managed ledger already maintains a `lastConfirmedEntry` to limit the last entry. See `ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes `RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because `ReadOnlyManagedLedgerImpl` could read a ledger in another managed ledger.

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

### Matching PR in forked repository

PR in forked repository: BewareMyPower#33

<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the 
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in 
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->
hangc0276 pushed a commit to apache/pulsar that referenced this pull request Aug 15, 2024
…not cached (#23147)

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages from the source topic when the entry is not cached. However, it leverages the `readAsync` API that validates the `lastAddConfirmed` field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated, so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

Replace `readAsync` with `readUnconfirmedAsync` and compare the entry range with the `ManagedLedger#getLastConfirmedEntry`. The managed ledger already maintains a `lastConfirmedEntry` to limit the last entry. See `ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes `RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because `ReadOnlyManagedLedgerImpl` could read a ledger in another managed ledger.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#33

<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->

(cherry picked from commit 15b88d2)
lhotari pushed a commit to apache/pulsar that referenced this pull request Aug 15, 2024
…not cached (#23147)

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages from the source topic when the entry is not cached. However, it leverages the `readAsync` API that validates the `lastAddConfirmed` field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated, so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

Replace `readAsync` with `readUnconfirmedAsync` and compare the entry range with the `ManagedLedger#getLastConfirmedEntry`. The managed ledger already maintains a `lastConfirmedEntry` to limit the last entry. See `ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes `RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because `ReadOnlyManagedLedgerImpl` could read a ledger in another managed ledger.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#33

<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->

(cherry picked from commit 15b88d2)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 16, 2024
…not cached (apache#23147)

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages from the source topic when the entry is not cached. However, it leverages the `readAsync` API that validates the `lastAddConfirmed` field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated, so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

Replace `readAsync` with `readUnconfirmedAsync` and compare the entry range with the `ManagedLedger#getLastConfirmedEntry`. The managed ledger already maintains a `lastConfirmedEntry` to limit the last entry. See `ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes `RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because `ReadOnlyManagedLedgerImpl` could read a ledger in another managed ledger.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#33

<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->

(cherry picked from commit 15b88d2)
(cherry picked from commit 14b3672)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 16, 2024
…not cached (apache#23147)

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages from the source topic when the entry is not cached. However, it leverages the `readAsync` API that validates the `lastAddConfirmed` field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated, so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

Replace `readAsync` with `readUnconfirmedAsync` and compare the entry range with the `ManagedLedger#getLastConfirmedEntry`. The managed ledger already maintains a `lastConfirmedEntry` to limit the last entry. See `ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes `RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because `ReadOnlyManagedLedgerImpl` could read a ledger in another managed ledger.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#33

<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->

(cherry picked from commit 15b88d2)
(cherry picked from commit 14b3672)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 20, 2024
…not cached (apache#23147)

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages from the source topic when the entry is not cached. However, it leverages the `readAsync` API that validates the `lastAddConfirmed` field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated, so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

Replace `readAsync` with `readUnconfirmedAsync` and compare the entry range with the `ManagedLedger#getLastConfirmedEntry`. The managed ledger already maintains a `lastConfirmedEntry` to limit the last entry. See `ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes `RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because `ReadOnlyManagedLedgerImpl` could read a ledger in another managed ledger.

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

PR in forked repository: BewareMyPower#33

<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->

(cherry picked from commit 15b88d2)
(cherry picked from commit 14b3672)
grssam pushed a commit to grssam/pulsar that referenced this pull request Sep 4, 2024
…not cached (apache#23147)

### Motivation

For shadow topics, a `ReadOnlyLedgerHandle` is created to read messages from the source topic when the entry is not cached. However, it leverages the `readAsync` API that validates the `lastAddConfirmed` field (LAC). In `ReadOnlyLedgerHandle`, this field could never be updated, so `readAsync` could fail immediately. See `LedgerHandle#readAsync`:

```java
if (lastEntry > lastAddConfirmed) {
    LOG.error("ReadAsync exception on ledgerId:{} firstEntry:{} lastEntry:{} lastAddConfirmed:{}",
            ledgerId, firstEntry, lastEntry, lastAddConfirmed);
    return FutureUtils.exception(new BKReadException());
}
```

This bug is not exposed because:
1. `PulsarMockReadHandle` does not maintain a LAC field.
2. The case for cache miss is never tested.

### Modifications

Replace `readAsync` with `readUnconfirmedAsync` and compare the entry range with the `ManagedLedger#getLastConfirmedEntry`. The managed ledger already maintains a `lastConfirmedEntry` to limit the last entry. See `ManagedLedgerImpl#internalReadFromLedger`:

```java
Position lastPosition = lastConfirmedEntry;

if (ledger.getId() == lastPosition.getLedgerId()) {
    lastEntryInLedger = lastPosition.getEntryId();
```

Add `ShadowTopicRealBkTest` to cover two code changes `RangeEntryCacheImpl#readFromStorage` and `EntryCache#asyncReadEntry`.

Exceptionally, compare the entry range with the LAC of a ledger handle when it does not exist in the managed ledger. It's because `ReadOnlyManagedLedgerImpl` could read a ledger in another managed ledger.

### Documentation

<!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->

- [ ] `doc` <!-- Your PR contains doc changes. -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

### Matching PR in forked repository

PR in forked repository: BewareMyPower#33

<!--
After opening this PR, the build in apache/pulsar will fail and instructions will
be provided for opening a PR in the PR author's forked repository.

apache/pulsar pull requests should be first tested in your own fork since the 
apache/pulsar CI based on GitHub Actions has constrained resources and quota.
GitHub Actions provides separate quota for pull requests that are executed in 
a forked repository.

The tests will be run in the forked repository until all PR review comments have
been handled, the tests pass and the PR is approved by a reviewer.
-->
@BewareMyPower BewareMyPower deleted the bewaremypower/fix-shadow-topic-read branch September 21, 2024 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant