add zh docs for GraphScope #67

Merged
merged 37 commits into from
Jan 7, 2021
14fedf7
add zh-cn.
yecol Jan 6, 2021
26d256f
Initial version of the Chinese docs for GIE (#68)
youyangy Jan 6, 2021
0915a56
[BUGFIX] Expose the correct external IP of GIE. (#69)
lidongze0629 Jan 7, 2021
e4b58f4
Always show "languages" selector.
sighingnow Jan 7, 2021
86aa823
Chinese docs for learning engine.
sighingnow Jan 7, 2021
05b257b
Chinese docs for install and deployment.
lidongze0629 Jan 7, 2021
3681c79
update zh/analytics_engine
acezen Jan 7, 2021
e05540d
update.
yecol Jan 7, 2021
3aa4673
Merge branch 'docs' of https://github.com/yecol/GraphScope into docs
yecol Jan 7, 2021
3b464c7
add zh loading graph doc
siyuan0322 Jan 7, 2021
b5e16ca
Merge pull request #1 from yecol/zsy/docs
siyuan0322 Jan 7, 2021
f7e7bd2
update.
yecol Jan 7, 2021
5778a1e
Merge branch 'docs' of https://github.com/yecol/GraphScope into docs
yecol Jan 7, 2021
dcd081d
Fix cross-reference links in GIE docs (#74)
youyangy Jan 7, 2021
0804657
update.
yecol Jan 7, 2021
678e8b4
update.
yecol Jan 7, 2021
a4f4ab1
update.
yecol Jan 7, 2021
e9288e5
update.
yecol Jan 7, 2021
ab91c58
add zh-cn.
yecol Jan 6, 2021
548aa6c
update.
yecol Jan 7, 2021
8cdd531
Always show "languages" selector.
sighingnow Jan 7, 2021
87685a3
Chinese docs for learning engine.
sighingnow Jan 7, 2021
257eddc
Chinese docs for install and deployment.
lidongze0629 Jan 7, 2021
4b159aa
update zh/analytics_engine
acezen Jan 7, 2021
4206ca0
update.
yecol Jan 7, 2021
beb1b29
add zh loading graph doc
siyuan0322 Jan 7, 2021
2b9eac4
update.
yecol Jan 7, 2021
e28bf1c
update.
yecol Jan 7, 2021
38ccbb4
update.
yecol Jan 7, 2021
0c0da25
update.
yecol Jan 7, 2021
05c566f
update.
yecol Jan 7, 2021
cffc7d5
Merge branch 'docs' of https://github.com/yecol/GraphScope into docs
yecol Jan 7, 2021
a8d9838
update link.
yecol Jan 7, 2021
4626f3c
udpate.
yecol Jan 7, 2021
b8a25a7
updte.
yecol Jan 7, 2021
f1091ac
Fixes content collapse for chinese docs.
sighingnow Jan 7, 2021
c376f7e
Constraint vineyard version as <0.1.5.
sighingnow Jan 7, 2021
369 changes: 369 additions & 0 deletions README-zh.md

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
[![GraphScope CI](https://github.com/alibaba/GraphScope/workflows/GraphScope%20CI/badge.svg)](https://github.com/alibaba/GraphScope/actions?workflow=GraphScope+CI)
[![Docs](https://github.com/alibaba/GraphScope/workflows/Docs/badge.svg)](https://graphscope.io/docs)
[![Coverage](https://codecov.io/gh/alibaba/GraphScope/branch/main/graph/badge.svg)](https://codecov.io/gh/alibaba/GraphScope)
[![Translation](https://img.shields.io/badge/Translation-%E4%B8%AD%E6%96%87%E7%89%88-success)](README-zh.md)

GraphScope is a unified distributed graph computing platform that provides a one-stop environment for performing diverse graph operations on a cluster of computers through a user-friendly Python interface. GraphScope makes multi-staged processing of large-scale graph data on compute clusters simple by combining several important pieces of Alibaba technology: including [GRAPE](https://github.com/alibaba/libgrape-lite), [MaxGraph](interactive_engine/), and [Graph-Learn](https://github.com/alibaba/graph-learn) (GL) for analytics, interactive, and graph neural networks (GNN) computation, respectively, and the [vineyard](https://github.com/alibaba/libvineyard) store that offers efficient in-memory data transfers.

Expand All @@ -31,7 +32,6 @@ For Linux distributions, we provide a script to install the above dependencies a
```bash
# run the environment preparing script.
./scripts/prepare_env.sh

```

### Installation
Expand All @@ -51,19 +51,19 @@ Please note that we have not hardened this release for production use and it lac

[`ogbn-mag`](https://ogb.stanford.edu/docs/nodeprop/#ogbn-mag) is a heterogeneous network composed of a subset of the Microsoft Academic Graph. It contains four types of entities (i.e., papers, authors, institutions, and fields of study), as well as four types of directed relations connecting two entities.

Given the heterogeneous ogbn-mag data, the task is to predict the class of each paper. Node classification can identify papers in multiple venues, which represent different groups of scientific work on different topics. We apply both the attribute and structural information to classify papers. In the graph, each paper node contains a 128-dimensional word2vec vector representing its content, which is obtained by averaging the embeddings of words in its title and abstract. The embeddings of individual words are pre-trained. The structural information is computed on-the-fly.
Given the heterogeneous `ogbn-mag` data, the task is to predict the class of each paper. Node classification can identify papers in multiple venues, which represent different groups of scientific work on different topics. We apply both the attribute and structural information to classify papers. In the graph, each paper node contains a 128-dimensional word2vec vector representing its content, which is obtained by averaging the embeddings of words in its title and abstract. The embeddings of individual words are pre-trained. The structural information is computed on-the-fly.

<div align="center">
<img src="https://graphscope.io/docs/_images/how-it-works.png" width="600" alt="how-it-works" />
</div>

The figure shows the flow of execution when a client Python program is executed..
The figure shows the flow of execution when a client Python program is executed.

- *Step 1*. Create a session or workspace in GraphScope.
- *Step 2*. Define schema and load the graph.
- *Step 3*. Query graph data.
- *Step 4*. Run graph algorithms.
- *Step 5*. Run graph-based machine learing tasks.
- *Step 5*. Run graph-based machine learning tasks.
- *Step 6*. Close the session.

### Creating a session
Expand Down Expand Up @@ -97,7 +97,7 @@ Taking `ogbn-mag` as example, the figure below shows the model of the property g
<img src="https://graphscope.io/docs/_images/sample_pg.png" width="600" alt="sample-of-property-graph" />
</div>

This graph has fours kinds of vertices, labeled as `paper`, `author`, `institution` and `field_of_study`. There are four kinds of edges connecting them, each kind of edges has a label and specifies the vertex labels for its two ends. For example, `cites` edges connect two vertices labeled `paper`. Another example is `writes`, it requires the source vertex is labeled `author` and the destination is a `paper` vertex. All the vertices and edges may have properties. e.g., `paper` vertices have properties like features, publish year, subject label, etc.
This graph has four kinds of vertices, labeled as `paper`, `author`, `institution` and `field_of_study`. There are four kinds of edges connecting them, each kind of edges has a label and specifies the vertex labels for its two ends. For example, `cites` edges connect two vertices labeled `paper`. Another example is `writes`, it requires the source vertex is labeled `author` and the destination is a `paper` vertex. All the vertices and edges may have properties. e.g., `paper` vertices have properties like features, publish year, subject label, etc.

To load this graph into GraphScope, one may use the code below with the [data files](https://graphscope.oss-accelerate.aliyuncs.com/ogbn_mag_small.tar.gz). Please download and extract them to the locally mounted directory (in this case, `~/test_data`).

Expand Down
18 changes: 18 additions & 0 deletions analytical_engine/README-zh.md
@@ -0,0 +1,18 @@
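# GraphScope Graph Analytical Engine - GRAPE

[![Translation](https://img.shields.io/badge/Translation-English-success)](https://github.com/alibaba/GraphScope/tree/main/analytical_engine)


The graph analytical engine in GraphScope originated from **GRAPE**, a system that implements the fixed-point computation model proposed in the paper [Parallelizing Sequential Graph Computations](https://dl.acm.org/doi/10.1145/3282488).

Unlike existing systems, GRAPE automatically parallelizes sequential single-machine graph algorithms as a whole, so existing graph algorithm programs can be [plugged in](https://github.com/alibaba/libgrape-lite/blob/master/examples/analytical_apps/sssp/sssp_auto.h) and easily run in a distributed environment to process large-scale graphs efficiently. Besides being easy to program, **GRAPE** is designed to be [efficient](https://github.com/alibaba/libgrape-lite/blob/master/Performance.md) and [highly extensible](https://github.com/alibaba/libgrape-lite/blob/master/examples/gnn_sampler), to cope with the varying scale, variety and complexity of real-life graph applications.

A lightweight version of the GRAPE core is open-sourced as [libgrape-lite](https://github.com/alibaba/libgrape-lite/). The analytical engine in GraphScope extends libgrape-lite with support for mutable subgraphs, [vineyard](https://github.com/alibaba/libvineyard/) integration, and a service mode for the engine.

## Publications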

- Wenfei Fan, Jingbo Xu, Wenyuan Yu, Jingren Zhou, Xiaojian Luo, Ping Lu, Qiang Yin, Yang Cao, and Ruiqi Xu. [Parallelizing Sequential Graph Computations](https://dl.acm.org/doi/10.1145/3282488). ACM Transactions on Database Systems (TODS) 43(4): 18:1-18:39.

- Wenfei Fan, Jingbo Xu, Yinghui Wu, Wenyuan Yu, Jiaxin Jiang. [GRAPE: Parallelizing Sequential Graph Computations](http://www.vldb.org/pvldb/vol10/p1889-fan.pdf). The 43rd International Conference on Very Large Data Bases (VLDB), demo, 2017 (the Best Demo Award).

- Wenfei Fan, Jingbo Xu, Yinghui Wu, Wenyuan Yu, Jiaxin Jiang, Zeyu Zheng, Bohan Zhang, Yang Cao, and Chao Tian. [Parallelizing Sequential Graph Computations](https://dl.acm.org/doi/10.1145/3035918.3035942). ACM SIG Conference on Management of Data (SIGMOD), 2017 (the Best Paper Award).
2 changes: 2 additions & 0 deletions analytical_engine/README.md
@@ -1,5 +1,7 @@
# GraphScope Analytical Engine - GRAPE

[![Translation](https://img.shields.io/badge/translation-%E4%B8%AD%E6%96%87%E7%89%88-success)](README-zh.md)

The analytical engine in GraphScope originated from **GRAPE**, a system that implemented the fix-point model proposed in the paper [Parallelizing Sequential Graph Computations](https://dl.acm.org/doi/10.1145/3282488).

GRAPE differs from prior systems in its ability to parallelize sequential graph algorithms as a whole by following the PIE programming model from the paper. Sequential algorithms can be easily ["plugged into"](https://github.com/alibaba/libgrape-lite/blob/master/examples/analytical_apps/sssp/sssp_auto.h) **GRAPE** with only minor changes and get parallelized to handle large graphs efficiently. In addition to the ease of programming, **GRAPE** is designed to be [highly efficient](https://github.com/alibaba/libgrape-lite/blob/master/Performance.md) and [flexible](https://github.com/alibaba/libgrape-lite/blob/master/examples/gnn_sampler), to cope with the scale, variety and complexity of real-life graph applications.
Expand Down
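
As a rough illustration of the fix-point idea behind the PIE model mentioned above, the sketch below runs single-source shortest paths over a partitioned graph in plain Python. It is not the libgrape-lite or GraphScope API: the function names `dijkstra` and `pie_sssp`, the edge-list input, and the `owner` mapping are all assumptions made for this example. A sequential Dijkstra is run per fragment (partial evaluation), border updates are exchanged and re-evaluated incrementally until nothing changes, and the per-fragment results are then assembled.

```python
import heapq
from collections import defaultdict

INF = float("inf")


def dijkstra(adj, dist, seeds):
    """Sequential Dijkstra on one fragment, relaxing outward from `seeds`.

    Returns the set of vertices whose distance was lowered.
    """
    heap = [(dist[v], v) for v in seeds]
    heapq.heapify(heap)
    changed = set()
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            if d + w < dist[v]:
                dist[v] = d + w
                changed.add(v)
                heapq.heappush(heap, (dist[v], v))
    return changed


def pie_sssp(edges, owner, source):
    """Hypothetical PIE-style driver: partial eval, incremental eval, assemble."""
    # Each fragment holds the out-edges of the vertices it owns.
    frags = defaultdict(lambda: defaultdict(list))
    for u, v, w in edges:
        frags[owner[u]][u].append((v, w))
    fids = set(owner.values())
    dist = {f: defaultdict(lambda: INF) for f in fids}

    # Partial evaluation: run the sequential algorithm on every fragment.
    dist[owner[source]][source] = 0.0
    updated = {
        f: dijkstra(frags[f], dist[f], [source] if f == owner[source] else [])
        for f in fids
    }

    # Incremental evaluation: repeat until no border value changes (fix-point).
    while True:
        inbox = defaultdict(dict)
        for f, changed in updated.items():
            for v in changed:
                if owner[v] != f:  # v lives elsewhere, notify its owner
                    best = inbox[owner[v]].get(v, INF)
                    inbox[owner[v]][v] = min(best, dist[f][v])
        if not inbox:
            break
        updated = {}
        for f, msgs in inbox.items():
            seeds = [v for v, d in msgs.items() if d < dist[f][v]]
            for v in seeds:
                dist[f][v] = msgs[v]
            updated[f] = dijkstra(frags[f], dist[f], seeds)

    # Assemble: every vertex reports the value held by its owning fragment.
    return {v: dist[f][v] for v, f in owner.items()}


edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0), (2, 3, 1.0)]
owner = {0: 0, 1: 1, 2: 0, 3: 1}  # vertex -> fragment id
result = pie_sssp(edges, owner, source=0)
```

The point of the sketch is that `dijkstra` stays an ordinary sequential routine; only the small driver around it knows about fragments and messages.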
23 changes: 15 additions & 8 deletions coordinator/gscoordinator/coordinator.py
Expand Up @@ -157,15 +157,22 @@ def __del__(self):
self._cleanup()

def _config_logging(self, log_level):
if log_level:
log_level = getattr(logging, log_level.upper(), logging.INFO)
else:
log_level = logging.INFO
logging.getLogger("graphscope").setLevel(log_level)
logging.basicConfig(
format="%(asctime)s [%(levelname)s][%(module)s:%(lineno)d]: %(message)s",
stream=sys.stdout,
"""Set log level based on config.
Args:
log_level (str): Log level of stdout handler
"""
logger = logging.getLogger("graphscope")
logger.setLevel(logging.DEBUG)

stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setLevel(log_level)

formatter = logging.Formatter(
"%(asctime)s [%(levelname)s][%(module)s:%(lineno)d]: %(message)s"
)
stdout_handler.setFormatter(formatter)

logger.addHandler(stdout_handler)

def ConnectSession(self, request, context):
# A session is already connected.
Expand Down
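
The logging change above moves the level filter from the logger to a dedicated stdout handler. A minimal standalone sketch of that pattern follows; the helper name `config_logging`, its `stream` parameter, and the fallback to `INFO` for unknown level names are assumptions for illustration and are not part of the coordinator code.

```python
import io
import logging
import sys

LOG_FORMAT = "%(asctime)s [%(levelname)s][%(module)s:%(lineno)d]: %(message)s"


def config_logging(log_level="INFO", stream=None, name="graphscope"):
    """Attach one stream handler that filters by `log_level` (hypothetical helper)."""
    logger = logging.getLogger(name)
    # The logger itself lets everything through; the handler does the filtering.
    logger.setLevel(logging.DEBUG)

    # Resolve a level name such as "warning"; fall back to INFO (assumption).
    level = getattr(logging, str(log_level or "INFO").upper(), None)
    if not isinstance(level, int):
        level = logging.INFO

    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return handler


# Usage: write to an in-memory buffer so the effect is easy to inspect.
buf = io.StringIO()
config_logging("warning", stream=buf)
log = logging.getLogger("graphscope")
log.info("hidden by the handler level")
log.warning("shown")
```

Keeping the logger at `DEBUG` leaves room to add further handlers (for example a file handler) with their own thresholds later.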
2 changes: 1 addition & 1 deletion coordinator/gscoordinator/template/CMakeLists.template
Expand Up @@ -3,7 +3,7 @@
###############################################################################

project(example C CXX)
cmake_minimum_required(VERSION 2.8)
cmake_minimum_required(VERSION 3.5)

option(CYTHON_PREGEL_APP "Whether to build cython pregel app." False)
option(CYTHON_PIE_APP "Whether to build cython pie app" False)
Expand Down
2 changes: 1 addition & 1 deletion docs/_templates/layout.html
Expand Up @@ -41,7 +41,7 @@
{% block footer %}
{{ footer }}

<div class="rst-versions" data-toggle="rst-versions" role="note" aria-label="versions">
<div class="rst-versions shift-up" data-toggle="rst-versions" role="note" aria-label="versions">
<span class="rst-current-version" data-toggle="rst-current-version">
<span class="fa fa-book"> Read the Docs</span>
v: latest
Expand Down
22 changes: 11 additions & 11 deletions docs/analytics_engine.rst
Expand Up @@ -21,7 +21,7 @@ analysis, community detection, centrality computations.

Built-in algorithms can be easily invoked over loaded graphs. For example,

.. code:: ipython
.. code:: python

from graphscope import pagerank
from graphscope import lpa
Expand Down Expand Up @@ -68,7 +68,7 @@ Users may want to fetch the results to the client, or write to cloud or distribu

The following methods are supported for retrieving the results.

.. code:: iPython
.. code:: python

# fetch to data structures
result_pr.to_numpy()
Expand All @@ -89,7 +89,7 @@ There is a list of supported method to retrieve the results.
In addition, as shown in Getting_Started, computation results can be added back to
the graph as a new property (column) of the vertices (edges).

.. code:: iPython
.. code:: python

simple_g = sub_graph.project_to_simple(vlabel="paper", elabel="cites")
ret = graphscope.kcore(simple_g, k=5)
Expand All @@ -104,7 +104,7 @@ also be a part of the result, e.g., the vertex id. We reserve three keywords for
`r` represents the result, `v` and `e` for vertices and edges, respectively.
Here are some examples for selectors on result processing.

.. code:: iPython
.. code:: python

# get the results on the vertex
result_pr.to_numpy('r')
Expand Down Expand Up @@ -141,7 +141,7 @@ programming model in a pure Python mode.

To implement this, a user just needs to implement this class.

.. code:: ipython
.. code:: python

@graphscope.analytical.udf.pie
class YourAlgorithm(AppAssets):
Expand All @@ -166,7 +166,7 @@ can be found in :ref:`Cython SDK API`.

Taking SSSP as an example, a user-defined SSSP in the PIE model may look like this.

.. code:: ipython
.. code:: python

@graphscope.analytical.udf.pie
class SSSP:
Expand Down Expand Up @@ -238,7 +238,7 @@ In addition to the sub-graph based PIE model,
`Pregel` model as well.
You may develop an algorithm in the `Pregel` model by implementing this.

.. code:: ipython
.. code:: python

@pregel(vd_type='double', md_type='double')
class YourPregelAlgorithm(AppAssets):
Expand All @@ -259,7 +259,7 @@ Differ from the PIE model, the decorator for this class is @graphscope.analytica
The functions to be implemented are defined on vertices rather than on fragments.
Taking SSSP as an example, the algorithm in the Pregel model looks like this.

.. code:: ipython
.. code:: python

# decorator, and assign the types for vertex data, message data.
@pregel(vd_type='double', md_type='double')
Expand Down Expand Up @@ -299,7 +299,7 @@ Run Your Own Algorithms

To run your own algorithm, you may trigger it right where you defined it.

.. code:: ipython
.. code:: python

import graphscope

Expand All @@ -315,14 +315,14 @@ To run your own algorithms, you may trigger it in place where you defined it.

After developing and testing, you may want to save it for future use.

.. code:: ipython
.. code:: python

SSSP_Pregel.to_gar("file:///var/graphscope/udf/my_sssp_pregel.gar")


Later, you can load your own algorithm from the gar package.

.. code:: ipython
.. code:: python

import graphscope

Expand Down
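
To make the vertex-centric Pregel model discussed in this file concrete without requiring a running GraphScope session, here is a small plain-Python simulation of SSSP with supersteps and message passing. It deliberately does not use the `@pregel` decorator or any GraphScope API; the function name `pregel_sssp` and the adjacency-dict input are assumptions for this sketch only.

```python
INF = float("inf")


def pregel_sssp(edges, source):
    """Simulate Pregel-style SSSP; `edges` maps vertex -> [(neighbor, weight)]."""
    vertices = set(edges) | {v for nbrs in edges.values() for v, _ in nbrs}
    vertices.add(source)
    dist = {v: INF for v in vertices}

    # Superstep 0 is seeded with a single message: distance 0 to the source.
    inbox = {source: [0.0]}
    while inbox:
        outbox = {}
        # "Compute" runs once per vertex that received messages.
        for v, msgs in inbox.items():
            best = min(msgs)
            if best < dist[v]:
                dist[v] = best
                # Tell neighbors about the improved distance.
                for nbr, w in edges.get(v, ()):
                    outbox.setdefault(nbr, []).append(best + w)
            # Otherwise the vertex stays inactive until new messages arrive.
        inbox = outbox  # barrier between supersteps
    return dist


graph = {0: [(1, 1.0), (2, 5.0)], 1: [(2, 2.0)], 2: [(3, 1.0)]}
distances = pregel_sssp(graph, source=0)
```

The loop ends when a superstep produces no messages, which corresponds to every vertex having voted to halt.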
2 changes: 1 addition & 1 deletion docs/deployment.rst
Expand Up @@ -10,7 +10,7 @@ A session encapsulates the control and state of the GraphScope engines.
It serves as the entrance in the python client to GraphScope. A session
allows users to deploy and connect GraphScope on a k8s cluster.

.. code:: ipython
.. code:: python

import graphscope

Expand Down
2 changes: 1 addition & 1 deletion docs/developer_guide.rst
Expand Up @@ -43,7 +43,7 @@ You may want to re-install the python client on local.

To test the newly built binaries, manually open a session and assign your image:

.. code:: ipython
.. code:: python

import graphscope

Expand Down
14 changes: 7 additions & 7 deletions docs/getting_started.rst
Expand Up @@ -51,7 +51,7 @@ Launching Session

To use GraphScope, we need to establish a :ref:`Session` in a python interpreter.

.. code:: ipython
.. code:: python

import graphscope
sess = graphscope.session()
Expand Down Expand Up @@ -88,7 +88,7 @@ Paper vertices have properties like features, publish year, subject label, etc.

To load this graph to GraphScope, one may use the code below.

.. code:: ipython
.. code:: python

g = sess.load_from(
vertices={
Expand Down Expand Up @@ -144,7 +144,7 @@ In this example, we use graph queries to find citation counts
for a particular author, and to derive a subgraph by
extracting publications in specific time out of the entire graph.

.. code:: ipython
.. code:: python

# get the entrypoint for submitting Gremlin queries on graph g.
interactive = sess.gremlin(g)
Expand Down Expand Up @@ -173,7 +173,7 @@ you may want to project the property graph to a simple graph at first.
Continuing our example, we run k-core decomposition and triangle counting
to generate the structural features of each paper node.

.. code:: ipython
.. code:: python

# extract a subgraph of publications within a time range
sub_graph = interactive.subgraph("g.V().has('year', inside(2014, 2020)).outE('cites')")
Expand Down Expand Up @@ -207,7 +207,7 @@ each of which represents a venue (e.g. pre-print and conference).
To achieve this, first we launch a learning engine and build
a graph with features following the last step.

.. code:: ipython
.. code:: python

# define the features for learning
paper_features = []
Expand All @@ -227,7 +227,7 @@ a graph with features following the last step.

Then we define the training and testing process, and run it.

.. code:: ipython
.. code:: python

from graphscope.learning.examples import GCN
from graphscope.learning.graphlearn.python.model.tf.trainer import LocalTFTrainer
Expand Down Expand Up @@ -287,7 +287,7 @@ Closing Session

At last, we close the session after processing all graph tasks.

.. code:: ipython
.. code:: python

sess.close()

Expand Down
10 changes: 6 additions & 4 deletions docs/index.rst
Expand Up @@ -36,14 +36,16 @@ and the vineyard store that offers efficient in-memory data transfers.
reference/python_index
reference/analytical_engine_index

.. toctree::
:maxdepth: 2
:hidden:

zh/index

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

.. toctree::
:hidden:

zh/index.rst
3 changes: 0 additions & 3 deletions docs/installation.rst
Expand Up @@ -21,9 +21,6 @@ Alternatively, You may want to install `WSL2 <https://docs.microsoft.com/zh-cn/w

.. code:: bash

# if on WSL2, we need to enable systemd first. Otherwise, skip this step.
./script/wsl/enable_systemd.sh

# run the environment preparing script.
./scripts/prepare_env.sh

Expand Down
2 changes: 1 addition & 1 deletion docs/interactive_engine.rst
Expand Up @@ -137,7 +137,7 @@ Nested traversal is also critical to the support for loops, which are expressed
An example
~~~~~~~~~~

Below shows a Gremlin query for the above example that tries to find cyclic paths of length ``k``, starting from a given account.
Below shows a Gremlin query for cycle detection, which tries to find cyclic paths of length ``k`` starting from a given account.

.. code:: java

Expand Down
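
For readers who want to check what the cycle-detection query described above is expected to return, the following plain-Python sketch enumerates cyclic paths of length ``k`` from a given start vertex. It is an illustration only and does not reproduce the Gremlin query or the interactive engine's execution; the function name `cyclic_paths` and the adjacency-dict input are assumptions.

```python
def cyclic_paths(adj, start, k):
    """Return all simple cycles with exactly k edges that begin and end at `start`.

    `adj` maps each vertex to an iterable of its out-neighbors.
    """
    found = []

    def walk(path):
        last = path[-1]
        if len(path) == k:
            # k vertices means k - 1 edges so far; the closing edge makes k.
            if start in adj.get(last, ()):
                found.append(path + [start])
            return
        for nxt in adj.get(last, ()):
            if nxt not in path:  # keep the path simple (no repeated vertices)
                walk(path + [nxt])

    walk([start])
    return found


transfers = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["a"]}
cycles_3 = cyclic_paths(transfers, "a", 3)
cycles_4 = cyclic_paths(transfers, "a", 4)
```

The search visits simple paths of ``k - 1`` hops and then checks for one closing edge back to the start vertex.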