Add a field to the API endpoint /api/pages/{id} to get the raw html #4310

deleyva · 2023-06-15T10:36:35Z

API Endpoint or Feature

Add a field (raw_html) to the API endpoint /api/pages/{id} that returns the raw HTML, with the include tags and their IDs intact rather than replaced by the included content.

Use-Case

I am creating a database to keep track of reused content so that anyone about to delete that content is notified that it is reused in a given page.

As this tracking is not on the roadmap, I'm building it in an external Django system, but I need that field in the endpoint to get the data I need.
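
A minimal sketch of how such tracking could read include references out of a raw_html value. It assumes include tags take the form {{@<page_id>}} or {{@<page_id>#<element_id>}}; the function name find_includes and the sample markup are hypothetical and only for illustration.

```python
import re
from typing import List, Optional, Tuple

# Assumed include-tag syntax: {{@<page_id>}} or {{@<page_id>#<element_id>}}.
INCLUDE_TAG = re.compile(r"\{\{@(\d+)(?:#([^}\s]+))?\}\}")


def find_includes(raw_html: str) -> List[Tuple[int, Optional[str]]]:
    """Return (page_id, element_id) for every include tag found in raw_html.

    element_id is None when the tag includes a whole page.
    """
    return [(int(m.group(1)), m.group(2)) for m in INCLUDE_TAG.finditer(raw_html)]


# Hypothetical raw_html value containing two include tags.
sample = '<p id="bkmrk-a">Intro</p><p>{{@12#bkmrk-shared-note}}</p><p>{{@40}}</p>'
print(find_includes(sample))
```

Each (page_id, element_id) pair could then be stored as a "page X reuses content from page Y" row in the external database.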

Additional context

No response

ssddanbrown · 2023-06-15T11:38:26Z

This is something we should support; otherwise it's not possible to reliably create an external page editor, or even a proper fetch + update flow, without messing up include tags.

riton · 2023-06-19T06:23:51Z

Same need here, with a different use case.

At IN2P3-CC, we're planning to manage part of our documentation from GitLab.
CI/CD jobs would update BookStack according to the GitLab repository content, which acts as our source of truth.
Our tool produces HTML that is sent to the BookStack page API.

Since the HTML returned by the get-page API is slightly modified, our tool is unable to detect (without heavy HTML introspection) whether a page should be updated. The remote page HTML always differs from the locally generated HTML.

The solution we're planning to use updates the page on each call. As you may expect, this is not ideal and pollutes the Activity Log.

If we are given access to the exact HTML that was initially sent to BookStack, the problem vanishes.


ssddanbrown · 2023-06-20T16:26:05Z

I have now added this within 8b935e7, and it will be part of the next release.
Thanks @deleyva for the original request here.

@riton Just a note on your use-case: this new property will provide the raw HTML stored in the BookStack database.
BookStack also does some pre-storage processing of HTML content, meaning this won't provide the exact HTML that was originally sent to BookStack. Specifically supporting that use-case would be a more substantial request that I would not be sure about including.
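
As an illustration only, reading the new property from a client could look like the sketch below. It assumes token authentication with an "Authorization: Token <id>:<secret>" header and a JSON response that carries raw_html; the helper names, base URL and credentials are placeholders, not part of BookStack.

```python
import json
import urllib.request


def build_page_request(base_url: str, page_id: int,
                       token_id: str, token_secret: str) -> urllib.request.Request:
    """Build a GET request for /api/pages/{id} (auth header format is assumed)."""
    url = f"{base_url.rstrip('/')}/api/pages/{page_id}"
    headers = {
        "Authorization": f"Token {token_id}:{token_secret}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_raw_html(base_url: str, page_id: int,
                   token_id: str, token_secret: str) -> str:
    """Fetch a page and return the HTML as stored in the database."""
    req = build_page_request(base_url, page_id, token_id, token_secret)
    with urllib.request.urlopen(req) as resp:
        page = json.load(resp)
    # raw_html keeps include tags intact, unlike the rendered HTML.
    return page["raw_html"]
```

Calling fetch_raw_html("https://wiki.example.com", 7, token_id, token_secret) against a release that includes this change would return the stored HTML for page 7.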

Since it sounds like you just need to check whether the content matches your GitLab side of things (a one-direction check), here's a potential creative workaround:

Create a hash for the incoming GitLab content on change. Store that hash (locally to the API system, or you could sneak it into the BookStack content or store it as a page tag), then on the next update compare the existing hash (if it exists) to the new GitLab content hash, and update only if the hashes differ.
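
A rough sketch of that hash comparison, assuming SHA-256 and a hash kept wherever the sync tool stores its state. The function names are hypothetical; how the stored hash is persisted is left out.

```python
import hashlib
from typing import Optional


def content_hash(html: str) -> str:
    """Hash the locally generated HTML (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def needs_update(new_html: str, stored_hash: Optional[str]) -> bool:
    """Update when no hash was stored yet, or when the content hash changed."""
    return stored_hash is None or stored_hash != content_hash(new_html)


# First run: nothing stored, so the page is pushed and the hash recorded.
first = content_hash("<p>Hello</p>")
print(needs_update("<p>Hello</p>", None))    # no stored hash yet
print(needs_update("<p>Hello</p>", first))   # same content as last push
print(needs_update("<p>Changed</p>", first)) # content differs from last push
```

Because only locally generated content is hashed, any processing BookStack applies before storage does not affect the comparison.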

deleyva · 2023-06-22T10:25:52Z

Thank you very much for developing this endpoint! It helps a lot.

riton · 2023-06-24T14:00:57Z

As suggested by @ssddanbrown:

Create a hash for the incoming GitLab content on change. Store that hash (locally to the API system, or you could sneak it into the BookStack content or store it as a page tag), then on the next update compare the existing hash (if it exists) to the new GitLab content hash, and update only if the hashes differ.

We're appending a <meta name="page-cksum" content="XXXXX"> tag (not in the <head> section of the HTML page, but 🤷 ) to each page generated through our CI/CD system. Works like a charm 👍
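
For illustration, the embedded-checksum approach could be sketched as follows. This is not the actual IN2P3-CC tool: the SHA-256 choice and the function names are assumptions, and it presumes the meta tag comes back from BookStack with its attributes unchanged.

```python
import hashlib
import re
from typing import Optional

# Matches the checksum tag in the exact form this sketch writes it.
META_RE = re.compile(r'<meta name="page-cksum" content="([0-9a-f]+)">')


def checksum(html: str) -> str:
    """Checksum of the generated HTML, before the meta tag is appended."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def with_checksum(html: str) -> str:
    """Append a page-cksum meta tag carrying the checksum of html."""
    return f'{html}\n<meta name="page-cksum" content="{checksum(html)}">'


def read_checksum(remote_html: str) -> Optional[str]:
    """Extract the checksum from a page fetched back from the API, if any."""
    match = META_RE.search(remote_html)
    return match.group(1) if match else None


def page_is_current(local_html: str, remote_html: str) -> bool:
    """True when the remote page already carries the local content's checksum."""
    return read_checksum(remote_html) == checksum(local_html)
```

A CI/CD job would push with_checksum(local_html) and skip the update whenever page_is_current(local_html, remote_html) holds for the fetched page.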

deleyva added the 🔩 API Request label Jun 15, 2023

ssddanbrown added this to the Next Feature Release milestone Jun 20, 2023

ssddanbrown added a commit that referenced this issue Jun 20, 2023

Pages API: Made raw_html available on page responses

8b935e7

To provide a way to see the original un-pre-processed database HTML content. For #4310

ssddanbrown closed this as completed Jun 20, 2023
