The analytical algorithm performance doesn't meet the expectation #2898

Closed
vegetableysm opened this issue Jun 15, 2023 · 4 comments · Fixed by #2908
Labels
component:gae, good first issue, performance

Comments

@vegetableysm
Collaborator

vegetableysm commented Jun 15, 2023

I tested the PageRank and SSSP algorithms on GraphScope and Gemini, and found that Gemini performed much faster than GraphScope. Here is the result of the PageRank test on the soc-LiveJournal1 dataset, with 20 iterations:

Platform        Time usage
GraphScope      1.4s
Gemini          0.37s
libgrape-lite   0.15s

The GraphScope test script:

import graphscope
import time

# Single-worker session on local hosts
sess = graphscope.session(num_workers=1, cluster_type='hosts')

graph = sess.g()

# Time graph loading
start = time.time()
graph = graph.add_edges('../data_set/live_journal/soc-livejournal.csv',
                        src_label='v', dst_label='v', properties=[])
end = time.time()
print("Loading time: %f" % (end - start))

# Time the PageRank query (20 iterations)
start = time.time()
ret1 = graphscope.pagerank(graph, max_round=20)
end = time.time()
print("Running time: %f" % (end - start))

print(ret1.to_dataframe(selector={'id': 'v.id', 'label': 'r'}))

sess.close()

The command for running libgrape-lite:
mpirun -n 1 ./run_app --vfile ../../data_set/live_journal/soc-livejournal.vertex.csv --efile ../../data_set/live_journal/soc-livejournal.mtx --application pagerank --out_prefix ./output_pagerank --directed -pr_mr 20

The command for running Gemini:
./toolkits/pagerank ./data_set/live_journal/soc-livejournal.binarye 4033137 20

Each of the above three tests uses a single partition.

The problem is that GraphScope is roughly 10 times slower than libgrape-lite. I don't know whether my test script is wrong; please advise. Thanks!

@welcome

welcome bot commented Jun 15, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! A maintainer will get back to you shortly!
Please feel free to contact us on DingTalk, the WeChat account (graphscope), or Slack. We are happy to answer your questions promptly.

@siyuan0322
Collaborator

One reason is that the measured time may include the application compilation time and the graph projection time.
Other reasons are under investigation.
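
If compilation and projection indeed dominate, one client-side sanity check is to issue the same query twice in one session and compare the timings: the first run pays the one-time costs, and the second should be closer to the pure query time, assuming the compiled application is reused within the session (an assumption worth verifying). A minimal sketch based on the script above:

import graphscope
import time

sess = graphscope.session(num_workers=1, cluster_type='hosts')
graph = sess.g()
graph = graph.add_edges('../data_set/live_journal/soc-livejournal.csv',
                        src_label='v', dst_label='v', properties=[])

# Cold run: may include one-time costs such as app compilation and graph projection.
start = time.time()
graphscope.pagerank(graph, max_round=20)
print("cold run: %f s" % (time.time() - start))

# Warm run: if the compiled app is reused, the gap between the two timings
# approximates the one-time overhead.
start = time.time()
graphscope.pagerank(graph, max_round=20)
print("warm run: %f s" % (time.time() - start))

sess.close()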

@sighingnow added the good first issue, component:gae, and performance labels Jun 15, 2023
@sighingnow changed the title The algorithm performance is abnormal to The analytical algorithm performance doesn't meet the expectation Jun 15, 2023
@vegetableysm
Collaborator Author

The first ten lines of the edge file:

src,dst
2,1
3,1
5,1
6,1
7,1
8,1
9,1
10,1
16,1

@siyuan0322
Collaborator

Found that the timing method includes the Python code that assembles the op, the round-trip time of the RPC, and the dynamic loading of libraries. These add significant overhead to the query time (which is less than 1 second in this experiment), so the relative overhead is huge.

A new log will be added to print the actual evaluation time of the application in the grape_engine, which should serve as the reliable metric.
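
Until that engine-side log is available, a rough client-side way to gauge how much of the measured time is fixed per-query overhead (op assembly, RPC round trips, dynamic library loading) versus actual evaluation is to repeat the query and compare the cold run against the warm runs. This is only an illustrative sketch reusing the calls from the script above; the helper name is made up, and it is not the change that closed this issue:

import time
import graphscope

def time_pagerank(graph, runs=5, max_round=20):
    # Hypothetical helper: assumes `graph` was built as in the script above.
    timings = []
    for _ in range(runs):
        start = time.time()
        graphscope.pagerank(graph, max_round=max_round)
        timings.append(time.time() - start)
    # The first run typically pays the one-time costs; later runs are dominated
    # by the per-query overhead plus the actual evaluation inside the engine.
    print("per-run timings:", ", ".join("%.3f s" % t for t in timings))
    print("cold: %.3f s, warm average: %.3f s"
          % (timings[0], sum(timings[1:]) / (len(timings) - 1)))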
