@sanity/block-tools — support async deserializer rules #7697

Open
silvertech-daniel opened this issue Oct 29, 2024 · 3 comments

@silvertech-daniel
Contributor

With htmlToBlocks, I have a few deserializer rules where I want to do something async e.g. upload or query something to make a reference.

I'd like deserialize to be allowed to be async. I know it's a little tricky (especially with TypeScript), but it should be doable to overload htmlToBlocks so that it returns a promise if any deserializer rule returns a promise.
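
To make the overload idea concrete, here is a minimal toy sketch. It is not the @sanity/block-tools API: `runRules`, `SyncRule`, `AsyncRule` and the simplified `TypedObject` are hypothetical names, and every rule is simply run against the whole input string. The point is only that the overloads pick the return type from the kind of rules passed in, and that the implementation stays synchronous unless some rule actually returned a promise.

```typescript
// Hypothetical, simplified stand-ins (not the real block-tools types).
type TypedObject = {_type: string}
type SyncRule = (html: string) => TypedObject | undefined
type AsyncRule = (html: string) => Promise<TypedObject | undefined>

// Overloads: only sync rules give an array, any async rule gives a promise.
function runRules(html: string, rules: SyncRule[]): TypedObject[]
function runRules(html: string, rules: Array<SyncRule | AsyncRule>): Promise<TypedObject[]>
function runRules(
  html: string,
  rules: Array<SyncRule | AsyncRule>,
): TypedObject[] | Promise<TypedObject[]> {
  const isDefined = (value: TypedObject | undefined): value is TypedObject =>
    value !== undefined
  const results = rules.map((rule) => rule(html))

  // Go async only if at least one rule returned a promise.
  if (results.some((result) => result instanceof Promise)) {
    return Promise.all(results).then((resolved) => resolved.filter(isDefined))
  }
  return (results as Array<TypedObject | undefined>).filter(isDefined)
}

// Example rules (also hypothetical).
const headingRule: SyncRule = (html) =>
  html.includes('<h1') ? {_type: 'heading'} : undefined
const imageRule: AsyncRule = async (html) =>
  html.includes('<img') ? {_type: 'image'} : undefined

const syncBlocks = runRules('<h1>Hi</h1>', [headingRule]) // typed as TypedObject[]
const asyncBlocks = runRules('<h1><img></h1>', [headingRule, imageRule]) // typed as Promise<TypedObject[]>
```

Existing synchronous callers would keep the array return type, and only callers that pass an async rule would need to await.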

I have a workaround where I return a placeholder __promise block and resolve it after the fact, but I also have to recursively resolve __decoration/__annotation blocks myself.

@christianhg
Contributor

Hi there! Would you be up for providing a code example so I can get a better idea of the use case that requires this?

@silvertech-daniel
Contributor Author

This is incomplete, but it's a workaround that I've been using. The idea is that I have certain deserializer rules that need to run async, e.g. uploading an image and doing some other work or looking up a document to reference for an internal link. My workaround is to do something similar to the existing __decoration/__annotation return values, where I return a __promise and recursively resolve all of them after the fact.

// existing
export declare function htmlToBlocks(
  html: string,
  blockContentType: ArraySchemaType,
  options?: HtmlDeserializerOptions,
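): (TypedObject | PortableTextTextBlock)[]

// workaround (linkTo, processImage, baseUrl, scrapeTime, htmlString, url and blockType are my own and not shown)
import { JSDOM } from 'jsdom'
import { htmlToBlocks, normalizeBlock, randomKey } from '@sanity/block-tools'
import type { PortableTextBlock, TypedObject } from '@sanity/types'

// Resolve '__promise' placeholders one at a time, recursing into block children and markDefs.
const recursivelyResolvePromises = async (unprocessedItems: ReadonlyArray<TypedObject>) => {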
  const items: Array<TypedObject> = []
  
  for (let item of unprocessedItems) {
    if (item._type === '__promise') {
      const awaited = await (
        item as { _key?: string; _type: '__promise'; promise: Promise<TypedObject> }
      ).promise
      if (!awaited) {
        continue
      }
      
      item = {
        ...awaited,
        _key: item._key ?? awaited._key ?? randomKey()
      }
    }
    
    if (item._type === 'block') {
      const block = item as PortableTextBlock
      if (block.children?.length) {
        block.children = await recursivelyResolvePromises(block.children)
      }
      if (block.markDefs?.length) {
        block.markDefs = (await recursivelyResolvePromises(block.markDefs)) as any
      }
    }
    
    items.push(item)
  }
  
  return items
}

const parseHtml = (html: string, url?: string | URL) =>
  new JSDOM(html, { url: url?.toString() }).window.document

const cleanedDocument = parseHtml(htmlString, url)
// ... preprocessing of cleanedDocument ...

const blocks = htmlToBlocks('', blockType, {
  parseHtml: () => cleanedDocument,
  rules: [
    {
      deserialize: (e, next, block) => {
        if (e.localName === 'a') {
          const a = e as HTMLAnchorElement
          
          return {
            _type: '__annotation',
            markDef: {
              _key: randomKey(),
              _type: '__promise',
              promise: linkTo(a, {
                baseUrl,
                scrapeTime
              })
            },
            children: next(a.childNodes)
          }
        }
        if (e.localName === 'img') {
          return block({
            _type: '__promise',
            promise: processImage(e as HTMLImageElement)
          })
        }
        return undefined
      }
    }
  ]
})

const withResolvedPromises = await recursivelyResolvePromises(blocks)

return withResolvedPromises
  .map(block =>
    normalizeBlock(block, {
      allowedDecorators: [
        'strong',
        'emphasis',
        'underline',
        'strikeThrough',
        'highlight',
        'code'
      ]
    })
  )
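
The resolver above awaits placeholders one at a time. If the uploads and lookups are independent, a concurrent variant is possible. The sketch below is only an illustration under simplified assumptions: `TypedObject`, `PromisePlaceholder`, `Block`, `makeKey`, `resolveItem` and `resolveItems` are hypothetical stand-ins rather than the real Sanity types or helpers, and unlike the loop above it returns copies instead of mutating blocks in place.

```typescript
// Simplified stand-ins for the Portable Text shapes used in this thread.
interface TypedObject {
  _type: string
  _key?: string
}
interface PromisePlaceholder extends TypedObject {
  _type: '__promise'
  promise: Promise<TypedObject | undefined>
}
interface Block extends TypedObject {
  _type: 'block'
  children?: TypedObject[]
  markDefs?: TypedObject[]
}

const isPlaceholder = (item: TypedObject): item is PromisePlaceholder =>
  item._type === '__promise'
const isBlock = (item: TypedObject): item is Block => item._type === 'block'

// Hypothetical key generator standing in for randomKey().
let keyCounter = 0
const makeKey = (): string => `key-${keyCounter++}`

// Resolve a single item: unwrap a placeholder, then recurse into block contents.
async function resolveItem(item: TypedObject): Promise<TypedObject | undefined> {
  let current = item
  if (isPlaceholder(item)) {
    const awaited = await item.promise
    if (!awaited) return undefined // the rule produced nothing, drop the placeholder
    current = {...awaited, _key: item._key ?? awaited._key ?? makeKey()}
  }
  if (!isBlock(current)) return current

  // children and markDefs are resolved concurrently.
  const [children, markDefs] = await Promise.all([
    current.children && resolveItems(current.children),
    current.markDefs && resolveItems(current.markDefs),
  ])
  const block: Block = {...current, children, markDefs}
  return block
}

// Resolve all siblings concurrently; Promise.all keeps the input order.
async function resolveItems(items: ReadonlyArray<TypedObject>): Promise<TypedObject[]> {
  const resolved = await Promise.all(items.map(resolveItem))
  return resolved.filter((item): item is TypedObject => item !== undefined)
}
```

Whether this is a win depends on the side effects involved: the promises are already running by the time htmlToBlocks returns, so the main difference is that one slow or rejected placeholder no longer holds up the handling of its siblings.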

If deserialize were allowed to be async, I could just return a promise. While I'm at it, I'd also like the ability to pass in pre-parsed HTML. What I want is roughly:

export declare function htmlToBlocks(
  html: string | Document, // edit to existing, parseHtml only used if html is a string
  blockContentType: ArraySchemaType,
  options?: HtmlDeserializerOptions,
): (TypedObject | PortableTextTextBlock)[]

// new async overload (declarations can't carry the `async` modifier; the Promise return type is what matters)
export declare function htmlToBlocks(
  html: string | Document,
  blockContentType: ArraySchemaType,
  options?: AsyncHtmlDeserializerOptions,
): Promise<(TypedObject | PortableTextTextBlock)[]>

// Omit is used because the async signatures are not assignable to the sync ones
export declare interface AsyncHtmlDeserializerOptions extends Omit<HtmlDeserializerOptions, 'rules'> {
  rules?: AsyncDeserializerRule[]
}

export declare interface AsyncDeserializerRule extends Omit<DeserializerRule, 'deserialize'> {
  deserialize: (
    el: Node,
    next: (elements: Node | Node[] | NodeList) => Promise<TypedObject | TypedObject[] | undefined>,
    createBlock: (props: ArbitraryTypedObject) => {
      _type: string
      block: ArbitraryTypedObject
    },
  ) => Promise<TypedObject | TypedObject[] | undefined>
}

// usage with the proposed API
const parseHtml = (html: string, url?: string | URL) =>
  new JSDOM(html, { url: url?.toString() }).window.document

const cleanedDocument = parseHtml(htmlString, url)
// ... preprocessing of cleanedDocument ...

return htmlToBlocks(cleanedDocument, blockType, {
  rules: [
    {
      deserialize: async (e, next, block) => {
        if (e.localName === 'a') {
          const a = e as HTMLAnchorElement
          
          return {
            _type: '__annotation',
            markDef: await linkTo(a, {
              baseUrl,
              scrapeTime
            }),
            children: await next(a.childNodes)
          }
        }
        if (e.localName === 'img') {
          return processImage(e as HTMLImageElement)
        }
        return undefined
      }
    }
  ]
})
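
For what an async `next` might look like on the library side, here is a rough sketch. It is not how @sanity/block-tools is implemented. It assumes rules are tried in order with the first non-undefined result winning, uses a tiny `SimpleNode` tree instead of DOM nodes, and falls through to a node's children when no rule matches. All names (`createAsyncDeserializer`, `AsyncNext`, `lookupHref`, and so on) are hypothetical.

```typescript
// Simplified stand-ins: a small tree instead of DOM nodes, a loose TypedObject.
interface SimpleNode {
  name: string
  text?: string
  href?: string
  children?: SimpleNode[]
}
interface TypedObject {
  _type: string
  text?: string
  href?: string
  children?: TypedObject[]
}

type AsyncNext = (nodes: SimpleNode[]) => Promise<TypedObject[]>
type AsyncRule = (
  node: SimpleNode,
  next: AsyncNext,
) => Promise<TypedObject | TypedObject[] | undefined>

function createAsyncDeserializer(rules: AsyncRule[]): AsyncNext {
  // Try each rule in order; the first non-undefined result wins.
  async function deserializeNode(node: SimpleNode): Promise<TypedObject[]> {
    for (const rule of rules) {
      const result = await rule(node, next)
      if (result !== undefined) return Array.isArray(result) ? result : [result]
    }
    // Assumption for this toy: unmatched nodes fall through to their children.
    return next(node.children ?? [])
  }

  // Children are awaited one after another, in document order.
  async function next(nodes: SimpleNode[]): Promise<TypedObject[]> {
    const output: TypedObject[] = []
    for (const node of nodes) {
      output.push(...(await deserializeNode(node)))
    }
    return output
  }

  return next
}

// Hypothetical async lookup standing in for something like linkTo().
const lookupHref = async (node: SimpleNode): Promise<string> =>
  `resolved:${node.href ?? ''}`

const demoRules: AsyncRule[] = [
  async (node, next) =>
    node.name === 'a'
      ? {_type: 'link', href: await lookupHref(node), children: await next(node.children ?? [])}
      : undefined,
  async (node) => (node.name === '#text' ? {_type: 'span', text: node.text} : undefined),
]

const demoTree: SimpleNode = {
  name: 'p',
  children: [
    {name: 'a', href: '/about', children: [{name: '#text', text: 'About'}]},
    {name: '#text', text: ' us'},
  ],
}

const demoResult = createAsyncDeserializer(demoRules)([demoTree])
```

Awaiting nodes sequentially keeps side effects such as uploads in a predictable order. Mapping over the nodes with Promise.all would also preserve output order and could be faster, at the cost of running rules concurrently.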

@christianhg
Contributor

christianhg commented Nov 7, 2024

This is great. Thanks a lot. Your workaround looks sensible, so I'm happy you at least have something working. I'll ask around if async deserialisers is something we'd like to support.
