Commit 8e9c461

niebles authored and ajaynarayanan committed
fixed indexing of external posts (alshedivat#2983)
This should fix several issues with indexing external posts, including alshedivat#1828. In short, the index builder was receiving 'empty' documents. To fix that, I'm setting the document content to the post content retrieved from the RSS feed, or to the text extracted from the external page. I've tested with various blog sources and it seems to work as expected now.
1 parent 760e9c4 commit 8e9c461
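
To illustrate why a document with no content ends up invisible to search, here is a minimal sketch. It does not use Jekyll or the real index builder: `FakeDoc` and `build_index` are hypothetical stand-ins invented for this example, and the feed entry is made-up data shaped like the `content` hash used in the plugin.

```ruby
# Hypothetical stand-in for a Jekyll document; NOT the real Jekyll::Document API.
FakeDoc = Struct.new(:data, :content)

# Toy index builder (an assumption, not the theme's actual indexer):
# maps each word in a document's content to the titles that contain it.
def build_index(docs)
  index = Hash.new { |hash, word| hash[word] = [] }
  docs.each do |doc|
    doc.content.to_s.downcase.scan(/\w+/).uniq.each do |word|
      index[word] << doc.data['title']
    end
  end
  index
end

# Made-up feed entry with the same keys the plugin reads.
entry = { title: 'Hello', summary: 'A short post', content: 'Indexing external posts works' }
metadata = { 'title' => entry[:title], 'description' => entry[:summary] }

# Before the fix: metadata is set, but content is never assigned.
INDEX_BEFORE = build_index([FakeDoc.new(metadata, nil)])

# After the fix: content is copied from the fetched entry.
INDEX_AFTER = build_index([FakeDoc.new(metadata, entry[:content])])

p INDEX_BEFORE.size  # 0
p INDEX_AFTER.keys   # ["indexing", "external", "posts", "works"]
```

In this toy model the metadata alone contributes nothing to the index, which matches the symptom described in the commit message.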

1 file changed: +7 -2 lines changed

_plugins/external-posts.rb

Lines changed: 7 additions & 2 deletions
@@ -62,6 +62,7 @@ def create_document(site, source_name, url, content)
   doc.data['description'] = content[:summary]
   doc.data['date'] = content[:published]
   doc.data['redirect'] = url
+  doc.content = content[:content]
   site.collections['posts'].docs << doc
 end
 
@@ -90,8 +91,12 @@ def fetch_content_from_url(url)
   parsed_html = Nokogiri::HTML(html)
 
   title = parsed_html.at('head title')&.text.strip || ''
-  description = parsed_html.at('head meta[name="description"]')&.attr('content') || ''
-  body_content = parsed_html.at('body')&.inner_html || ''
+  description = parsed_html.at('head meta[name="description"]')&.attr('content')
+  description ||= parsed_html.at('head meta[name="og:description"]')&.attr('content')
+  description ||= parsed_html.at('head meta[property="og:description"]')&.attr('content')
+
+  body_content = parsed_html.search('p').map { |e| e.text }
+  body_content = body_content.join() || ''
 
   {
     title: title,
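
The fallback order in this hunk can be sketched without Nokogiri. Everything below is a stand-in under stated assumptions: the hashes map selectors to `content` attribute values in place of a parsed page, `Para` is a hypothetical element that only models `text`, and `pick_description` and `join_paragraphs` are illustrative names that do not appear in the plugin.

```ruby
# Hypothetical stand-in for a parsed <p> element (only #text is modeled).
Para = Struct.new(:text)

# Try each selector in the same order as the diff; the first non-nil value wins.
def pick_description(meta)
  description = meta['head meta[name="description"]']
  description ||= meta['head meta[name="og:description"]']
  description ||= meta['head meta[property="og:description"]']
  description # still nil when none of the three tags is present
end

# Same map + join step as the diff: paragraph text is concatenated with no separator.
def join_paragraphs(paragraphs)
  paragraphs.map { |e| e.text }.join
end

# Made-up pages: one with only an Open Graph tag, one with both tags.
OG_ONLY = { 'head meta[property="og:description"]' => 'From Open Graph' }.freeze
BOTH = {
  'head meta[name="description"]' => 'From the standard tag',
  'head meta[property="og:description"]' => 'From Open Graph'
}.freeze

puts pick_description(OG_ONLY)  # From Open Graph
puts pick_description(BOTH)     # From the standard tag
p pick_description({})          # nil
p join_paragraphs([Para.new('First.'), Para.new('Second.')])  # "First.Second."
p join_paragraphs([])           # ""
```

Two details follow from plain Ruby semantics rather than from anything specific to the plugin. `Array#join` returns an empty string for an empty array, so the trailing `|| ''` in the diff never takes effect. And because `join` is called without a separator, text from adjacent paragraphs runs together. The new `description` lines also no longer default to `''`, so the value can be `nil` when no tag matches; whether downstream code handles that is not visible in this diff.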
