Is there a way to create an offline index in PHP or Python? #442

Open
ddofborg opened this issue Jun 11, 2024 · 1 comment

Comments

@ddofborg

I would like to use FlexSearch on the client side but precompute the index offline, so the client can start fast. Is there a way to do so?

@donbowman

Here's a WIP of mine with WordPress, doing it offline. It's not finished.

npm install flexsearch
npm install html-to-text
const { convert } = require('html-to-text');
const flexsearch = require('flexsearch');

function h2t(body) {
    const options = {
        wordwrap: 130,
    };
    return convert(body, options);
}

// Fetch one page of FAQ posts and add each post to the index.
// Returns the number of posts on the page (0 if the request failed).
async function getFAQ(index, url) {
    const response = await fetch(url);
    if (!response.ok) {
        return 0;
    }
    const faqs = await response.json();
    for (const faq of faqs) {
        console.log(h2t(faq.title.rendered));
        const doc = {
            // The index is configured with id: "id", so every document needs one.
            "id": faq.id,
            "title": h2t(faq.title.rendered),
            "excerpt": h2t(faq.excerpt.rendered),
            "content": h2t(faq.content.rendered),
        };
        index.add(doc);
    }
    return faqs.length;
}

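async function getFAQS(index, url) {
    // The requested page size must match the loop condition:
    // a page shorter than perPage means the last page was reached.
    const perPage = 50;
    let num = perPage;
    for (let page = 1; num === perPage; page++) {
        console.log("do page ", page, "num = ", num);
        const _url = `${url}?per_page=${perPage}&page=${page}`;
        num = await getFAQ(index, _url);
    }
}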

async function createIndex() {
    const index = new flexsearch.Document({
        tokenize: "forward",
        optimize: true,
        resolution: 9,
        cache: 100,
        worker: true,
        document: {
            id: "id",
            tag: "tag",
            store: [
                "title", "excerpt", "content"
            ],
            index: [
            {
                field: "title",
                tokenize: "forward",
                optimize: true,
                resolution: 9
            },
            {
                field:  "excerpt",
                tokenize: "strict",
                optimize: true,
                resolution: 9,
                minlength: 3,
                context: {
                    depth: 1,
                    resolution: 3
                }
            },
            {
                field:  "content",
                tokenize: "strict",
                optimize: true,
                resolution: 9,
                minlength: 3,
                context: {
                    depth: 1,
                    resolution: 3
                }
            }
            ]
        }
    });

    const faq = "https://www.agilicus.com/wp-json/wp/v2/ufaq";
//    ?per_page=10"
    await getFAQS(index, faq);
    return index;
}

//    https://www.agilicus.com/wp-json/wp/v2/ufaq?per_page=100
//const index = await createIndex();

(async() => {
    console.log('before start');
    const index = await createIndex();
    console.log('after start');
    // `exp` is not defined anywhere yet; the index export step is still to be written.
    const result = await index.search('connector', 10);
    console.log(result);
})();
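
The missing piece for the offline part is getting the index out of the Node process and onto disk. A minimal sketch of that step follows, assuming the callback form `index.export((key, data) => ...)` from the FlexSearch README, where each call hands over one key and one serialized string. `saveChunk` and `loadChunks` are hypothetical helper names, and the demo data is a stand-in rather than real FlexSearch output.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: write one exported chunk to <dir>/<key>.json.
// Meant to be used as the export callback, e.g.
//   index.export((key, data) => saveChunk(dir, key, data));
// Assumes keys are safe to use as file names.
function saveChunk(dir, key, data) {
    if (data === undefined || data === null) {
        return false; // nothing to store for this key
    }
    fs.mkdirSync(dir, { recursive: true });
    fs.writeFileSync(path.join(dir, `${key}.json`), String(data), "utf8");
    return true;
}

// Hypothetical helper: read every chunk back as [key, data] pairs.
function loadChunks(dir) {
    return fs.readdirSync(dir)
        .filter(name => name.endsWith(".json"))
        .sort()
        .map(name => [
            path.basename(name, ".json"),
            fs.readFileSync(path.join(dir, name), "utf8"),
        ]);
}

// Demo with stand-in chunks (not real FlexSearch output).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "fs-index-"));
saveChunk(demoDir, "reg", '{"1":1}');
saveChunk(demoDir, "title.map", '[{"conn":[1]}]');
saveChunk(demoDir, "store", undefined); // skipped, no data
const chunks = loadChunks(demoDir);
console.log(chunks.map(([key]) => key));
```

Writing each chunk inside the callback avoids having to know when the export has finished. On the client, each pair would presumably be fed to `index.import(key, data)` on an index created with the same options. Whether export works together with `worker: true` would need checking.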
