Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gatsby): use lmdb.removeSync so getNode can't return deleted nodes #33554

Merged
merged 6 commits into from
Oct 20, 2021

Conversation

KyleAMathews
Copy link
Contributor

@KyleAMathews KyleAMathews commented Oct 16, 2021

lmdb.remove() is asynchronous so getNode can still return deleted nodes for a time until the delete operation completes.

Switching to removeSync fixes this (at the cost of some amount of perf penalty).

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Oct 16, 2021
@KyleAMathews KyleAMathews added type: bug An issue or pull request relating to a bug in Gatsby and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Oct 16, 2021
@KyleAMathews KyleAMathews changed the title fix(gatsby): use lmdb.removeSync so getNode can't return deleted nodes lmdb.remove() is asynchronous so getNode can still return deleted nodes for a time. fix(gatsby): use lmdb.removeSync so getNode can't return deleted nodes Oct 16, 2021
@vladar
Copy link
Contributor

vladar commented Oct 19, 2021

Hmm, something is not right with this PR - it includes commits that were already merged to master?

@wardpeet
Copy link
Contributor

@vladar fixed :)

Comment on lines 137 to 143
.map(nodeId => {
if (preSyncDeletedNodeIdsCache.has(nodeId)) {
return undefined
}

return getNode(nodeId)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant as getNode has the same check now.

Suggested change
.map(nodeId => {
if (preSyncDeletedNodeIdsCache.has(nodeId)) {
return undefined
}
return getNode(nodeId)
})
.map(nodeId =>getNode(nodeId)!)

const stats = getDatabases().nodes.getStats()
// @ts-ignore
return Number(stats.entryCount || 0) // FIXME: add -1 when restoring shared structures key
const stats = getDatabases().nodes.getStats() as { entryCount: number }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines 225 to 227
lastOperationPromise.then(() => {
preSyncDeletedNodeIdsCache.clear()
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can lead to a race condition when some DELETE_NODE is called while the transaction is being submitted by lmdb in another thread. Ideally, if clear is pending and DELETE_NODE happens we should cancel the previous clear attempt.

Pretty edge case, but it will be very hard to debug if we actually hit it.

vladar
vladar previously approved these changes Oct 19, 2021
Copy link
Contributor

@vladar vladar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢


if (action.type === `DELETE_NODE`) {
preSyncDeletedNodeIdsCache.add(
((action as IDeleteNodeAction).payload as IGatsbyNode).id
Copy link
Contributor

@pieh pieh Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://app.circleci.com/pipelines/github/gatsbyjs/gatsby/76288/workflows/e843318b-e1c5-4cdf-b155-b82bf562e44e/jobs/898140 shows cases where DELETE_NODE payload might be falsy:

error Cannot read property 'id' of undefined


  TypeError: Cannot read property 'id' of undefined
  
  - lmdb-datastore.ts:218 updateDataStore
    [artifacts]/[gatsby]/src/datastore/lmdb/lmdb-datastore.ts:218:66
  
  - lmdb-datastore.ts:259 callback
    [artifacts]/[gatsby]/src/datastore/lmdb/lmdb-datastore.ts:259:7
  
  - mett.ts:50 forEach
    [artifacts]/[gatsby]/src/utils/mett.ts:50:11
  
  - Set.forEach
  
  - mett.ts:49 Object.emit
    [artifacts]/[gatsby]/src/utils/mett.ts:49:17
  
  - index.ts:153 
    [artifacts]/[gatsby]/src/redux/index.ts:153:11

so we should check if payload is truthy before reaching to id. We do checks like that in other places handling DELETE_NODE action - for example in

case `DELETE_NODE`: {
if (action.payload) {
return nodesDb.remove(action.payload.id)
}
return false
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also casting payload as IGatsbyNode pretty much removes information that this might be falsy as our action type does mention it:

export interface IDeleteNodeAction {
type: `DELETE_NODE`
// FIXME: figure out why payload can be undefined here
payload: IGatsbyNode | void
}

I really think we should avoid casts like this. In fact none of casts in this line are needed because after checking for type action already become that specific type in closure below

@wardpeet wardpeet merged commit 98a843c into master Oct 20, 2021
@wardpeet wardpeet deleted the fix-delete branch October 20, 2021 11:34
vladar pushed a commit that referenced this pull request Oct 22, 2021
#33554)

Co-authored-by: Ward Peeters <[email protected]>
(cherry picked from commit 98a843c)

# Conflicts:
#	packages/gatsby/src/datastore/lmdb/lmdb-datastore.ts
vladar added a commit that referenced this pull request Oct 22, 2021
#33554) (#33633)

Co-authored-by: Ward Peeters <[email protected]>
(cherry picked from commit 98a843c)

# Conflicts:
#	packages/gatsby/src/datastore/lmdb/lmdb-datastore.ts

Co-authored-by: Kyle Mathews <[email protected]>
wardpeet added a commit to herecydev/gatsby that referenced this pull request Oct 29, 2021
axe312ger pushed a commit that referenced this pull request Nov 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug An issue or pull request relating to a bug in Gatsby
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants