2.5x faster #139

vprelovac · 2020-02-20T00:46:18Z

Performance optimization, about 2.5x faster

miso-belica

Thanks for the changes. Good job 👍

I added some comments, so please respond to them (in code or with another comment :) and, if possible, could you add a test for the method _create_matrix so we are sure it behaves the same as before?

Oh, and I almost forgot. If you have any measurements or a benchmark showing how the changes produce the 2.5x speedup, it would be great if you could provide them too.
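A measurement of this kind could be sketched roughly as follows. This is only an illustration under stated assumptions: `rate_edge` is a hypothetical stand-in for `_rate_sentences_edge` (not sumy's actual implementation), the matrices are plain Python lists rather than NumPy arrays, and `SENTENCES` is synthetic input. Counting rating calls gives a deterministic comparison, and `timeit` gives a wall-clock one.

```python
import timeit


def rate_edge(words_a, words_b):
    # Hypothetical stand-in for _rate_sentences_edge: shared-word count.
    return float(len(set(words_a) & set(words_b)))


def full_matrix(sentences, rate):
    # Original approach: rate every ordered pair, n * n calls.
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i, words_i in enumerate(sentences):
        for j, words_j in enumerate(sentences):
            weights[i][j] = rate(words_i, words_j)
    return weights


def symmetric_matrix(sentences, rate):
    # Optimized approach: rate each unordered pair once and mirror it.
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i, words_i in enumerate(sentences):
        for j in range(i + 1, n):
            rating = rate(words_i, sentences[j])
            weights[i][j] = rating
            weights[j][i] = rating
    return weights


def count_calls(builder, sentences):
    # Wrap the rating function so every invocation is counted.
    calls = [0]

    def counting_rate(words_a, words_b):
        calls[0] += 1
        return rate_edge(words_a, words_b)

    builder(sentences, counting_rate)
    return calls[0]


# Synthetic input (an assumption): 60 "sentences" of 8 words each.
SENTENCES = [["w%d" % ((i * 7 + k) % 50) for k in range(8)] for i in range(60)]

if __name__ == "__main__":
    print("calls, full:", count_calls(full_matrix, SENTENCES))
    print("calls, symmetric:", count_calls(symmetric_matrix, SENTENCES))
    for builder in (full_matrix, symmetric_matrix):
        seconds = timeit.timeit(lambda: builder(SENTENCES, rate_edge), number=20)
        print(builder.__name__, "seconds for 20 runs:", round(seconds, 4))
```

The call counts alone suggest roughly half as many rating calls for the symmetric version; whether that translates into 2.5x in practice is what the timing on real documents would have to show.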

sumy/summarizers/text_rank.py

miso-belica · 2020-02-21T19:03:42Z

-        for i, words_i in enumerate(sentences_as_words):
-            for j, words_j in enumerate(sentences_as_words):
-                weights[i, j] = self._rate_sentences_edge(words_i, words_j)
+        for i in range(0, sentences_count-1):


I believe you should use range(0, sentences_count) here, because list(range(0, 3)) == [0, 1, 2]. But I would prefer that you return to enumerate(sentences_as_words). I think the one missing last word may be a reason why the test fails on CI. So the code after the changes would be:

    for i, words_i in enumerate(sentences_as_words):
        for j in range(i+1, sentences_count):
            rating = self._rate_sentences_edge(words_i, sentences_as_words[j])
            weights[i, j] = rating
            weights[j, i] = rating

EDIT: I now realize this is not the case: by the time you reach the end, the last row has already been filled by weights[j, i] = rating. Still, the test fails, so please check why.
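The point about the loop bounds can be checked with a small sketch. It assumes a symmetric rating function (`rate_edge` here is a hypothetical stand-in for `_rate_sentences_edge`) and uses plain lists instead of a NumPy array. It compares an outer loop over range(0, n-1) with one over range(0, n), and both against the original double loop.

```python
def rate_edge(words_a, words_b):
    # Hypothetical, symmetric stand-in for _rate_sentences_edge.
    return float(len(set(words_a) & set(words_b)))


def fill_upper_and_mirror(sentences, outer_stop):
    # Fill the cells above the diagonal and mirror each one below it.
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(0, outer_stop):
        for j in range(i + 1, n):
            rating = rate_edge(sentences[i], sentences[j])
            weights[i][j] = rating
            weights[j][i] = rating
    return weights


sentences = [["a", "b"], ["b", "c"], ["c", "a", "b"]]
n = len(sentences)

short_outer = fill_upper_and_mirror(sentences, n - 1)  # range(0, n-1)
full_outer = fill_upper_and_mirror(sentences, n)       # range(0, n)

# Reference: the original double loop over every ordered pair.
reference = [[rate_edge(a, b) for b in sentences] for a in sentences]

# The last outer iteration has an empty inner range, so both bounds agree.
same_for_both_bounds = short_outer == full_outer

# Every off-diagonal cell matches the original double loop.
off_diagonal_equal = all(
    short_outer[i][j] == reference[i][j]
    for i in range(n)
    for j in range(n)
    if i != j
)

# The diagonal is never written by the mirrored loops, so it stays 0.0.
diagonal_differs = any(short_outer[i][i] != reference[i][i] for i in range(n))
```

Under these assumptions the two loop bounds produce identical matrices, and the only difference from the original double loop is the diagonal.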

miso-belica · 2020-02-23T16:32:21Z

Don't have time to do these, feel free to reject and implement changes yourself.

@vprelovac Thanks for the inspiration. I implemented the optimizations in the related PR, but less aggressively, so the tests pass and it works the same as before.

Still, the problem was in the diagonal of the matrix, which should not affect the score, so I will dig deeper to find out why we can't skip computing the diagonal.
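One possible explanation, offered only as a hypothesis and not confirmed anywhere in this thread: if the matrix rows are normalized to sum to 1 after it is built, then zeroing the diagonal changes each row's sum and therefore every normalized value in that row. The sketch below illustrates just that arithmetic; `normalize_rows` and the example values are assumptions, not code taken from sumy.

```python
def normalize_rows(matrix):
    # Assumed post-processing step: scale each row so it sums to 1.
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Made-up symmetric weights with non-zero self-similarity on the diagonal.
with_diagonal = [
    [2.0, 1.0, 1.0],
    [1.0, 2.0, 2.0],
    [1.0, 2.0, 4.0],
]

# The same weights with the diagonal skipped (left at zero).
without_diagonal = [
    [0.0 if i == j else value for j, value in enumerate(row)]
    for i, row in enumerate(with_diagonal)
]

normalized_with = normalize_rows(with_diagonal)
normalized_without = normalize_rows(without_diagonal)

# The off-diagonal cell (0, 1) held 1.0 in both inputs, but its
# normalized value depends on whether the diagonal was included.
print(normalized_with[0][1], normalized_without[0][1])
```

If something like this normalization is applied, the off-diagonal entries would no longer match between the two versions even though the raw ratings are identical.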

Have a nice day :)

vprelovac added 2 commits February 19, 2020 16:45

2.5x faster

3cff49f

Update text_rank.py

984de98

miso-belica requested changes Feb 21, 2020

miso-belica assigned vprelovac Feb 21, 2020

miso-belica added the enhancement label Feb 21, 2020

miso-belica mentioned this pull request Feb 23, 2020

Speedup of the TextRank algorithm #140

Merged

miso-belica closed this Feb 23, 2020
