diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index cb63eeebc..a59e2cffe 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -1,4 +1,4 @@ -name: GraphAr CI +name: GraphAr C++ CI on: # Trigger the workflow on push or pull request, @@ -53,7 +53,7 @@ jobs: run: | mkdir build pushd build - cmake .. -DBUILD_TESTS=ON -DBUILD_EXAMPLES=ON + cmake ../cpp -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON -DBUILD_EXAMPLES=ON popd - name: Cpp Format and lint @@ -67,7 +67,7 @@ jobs: # validate format function prepend() { while read line; do echo "${1}${line}"; done; } - make clformat + make gar-clformat GIT_DIFF=$(git diff --ignore-submodules) if [[ -n $GIT_DIFF ]]; then echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" @@ -91,7 +91,7 @@ jobs: function ec() { [[ "$1" == "-h" ]] && { shift && eval $* > /dev/null 2>&1; ec=$?; echo $ec; } || eval $*; ec=$?; } - ec make cpplint + ec make gar-cpplint if [[ "$ec" != "0" ]]; then echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" echo "| cpplint failures found! Run: " @@ -114,5 +114,7 @@ jobs: - name: Test run: | - cd build && make test + cd build + export GAR_TEST_DATA=$PWD/../testing/ + make test diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 3821a63c2..6cfb20eaa 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -58,14 +58,12 @@ jobs: run: | sudo apt-get update -y sudo apt-get install -y ca-certificates cmake doxygen python3-pip - sudo pip3 install -r requirements-dev.txt + sudo pip3 install -r docs/requirements.txt - name: Generate Doc run: | - mkdir build - pushd build - cmake .. 
- make doc + pushd docs + make html popd - name: Preview using surge diff --git a/.gitignore b/.gitignore index 2aea7c86e..1acb721c4 100644 --- a/.gitignore +++ b/.gitignore @@ -1,15 +1,6 @@ /build/ -spark/target/ .vscode .idea .DS_store -.DS_Store .cache -# docs -/docs/_build/ - -# examples -/examples/*/build - - diff --git a/.gitmodules b/.gitmodules index 24e8e980f..90bbe0715 100644 --- a/.gitmodules +++ b/.gitmodules @@ -1,9 +1,9 @@ -[submodule "test/gar-test"] - path = test/gar-test +[submodule "testing"] + path = testing url = https://github.com/GraphScope/gar-test.git -[submodule "thirdparty/yaml-cpp"] - path = thirdparty/yaml-cpp +[submodule "cpp/thirdparty/yaml-cpp"] + path = cpp/thirdparty/yaml-cpp url = https://github.com/jbeder/yaml-cpp.git -[submodule "thirdparty/Catch2"] - path = thirdparty/Catch2 +[submodule "cpp/thirdparty/Catch2"] + path = cpp/thirdparty/Catch2 url = https://github.com/catchorg/Catch2.git diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst index 6e6b804ce..ad93be01e 100644 --- a/CONTRIBUTING.rst +++ b/CONTRIBUTING.rst @@ -55,13 +55,13 @@ into the Git repository, you could first install `pre-commit`_ by .. code:: bash - pip3 install pre-commit + $ pip3 install pre-commit The configure the necessary pre-commit hooks with .. code:: bash - pre-commit install --install-hooks + $ pre-commit install --install-hooks Minor Fixes ^^^^^^^^^^^^ @@ -86,7 +86,7 @@ A good branch name would be (where issue #42 is the ticket you're working on): .. code:: shell - git checkout -b 42-add-chinese-translations + $ git checkout -b 42-add-chinese-translations Get the test suite running ^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -97,22 +97,33 @@ Now initialize the submodules of GraphAr: .. code:: shell - git submodule update --init + $ git submodule update --init + +For the C++ Library, Check that the system has the `GraphAr C++ Dependencies`_. Then you can do an out-of-source build using CMake and build the test suite: .. code:: shell - mkdir build - cd build - cmake .. 
-DBUILD_TESTS=ON - make -j$(nproc) + $ mkdir build + $ cd build + $ cmake ../cpp -DBUILD_TESTS=ON + $ make -j$(nproc) Now you should be able to run the test suite: .. code:: shell - make test + $ make test + +For the Spark Library, Check that the system has the `GraphAr Spark Dependencies`_. + +Then you build and run test suite using Maven: + +.. code:: shell + + $ cd spark + $ mvn test How to generate the document ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -123,9 +134,10 @@ The documentation is generated using Doxygen and sphinx. You can build GraphAr's .. code:: shell - make doc + $ cd docs + $ make html -The HTML documentation will be available under `docs/_build/html`. +The HTML documentation will be available under ``docs/_build/html``. Implement your fix or feature ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -143,13 +155,15 @@ You can format your code by the command: .. code:: shell - make clformat + $ cd build + $ make gar-clformat You can check & fix style issues by running the *cpplint* linter with the command: .. code:: shell - make cpplint + $ cd build + $ make gar-cpplint Submitting your changes ^^^^^^^^^^^^^^^^^^^^^^^ @@ -173,17 +187,17 @@ up to date with GraphAr's main branch: .. code:: shell - git remote add upstream https://github.com/alibaba/GraphAr.git - git checkout main - git pull upstream main + $ git remote add upstream https://github.com/alibaba/GraphAr.git + $ git checkout main + $ git pull upstream main Then update your feature branch from your local copy of main, and push it! .. code:: shell - git checkout 42-add-chinese-translations - git rebase main - git push --set-upstream origin 42-add-chinese-translations + $ git checkout 42-add-chinese-translations + $ git rebase main + $ git push --set-upstream origin 42-add-chinese-translations Finally, go to GitHub and `make a Pull Request`_ :D @@ -206,9 +220,9 @@ To learn more about rebasing in Git, there are a lot of `good `_. e.g.: v0.1.0. .. 
code:: shell - git tag -a v0.1.0 -m "GraphAr v0.1.0" - git push upstream v0.1.0 + $ git tag -a v0.1.0 -m "GraphAr v0.1.0" + $ git push upstream v0.1.0 3. The release draft will be automatically built to GitHub by GitHub Actions. You can edit the release notes draft on `GitHub `_ to add more details. 4. Publish the release. @@ -419,7 +433,9 @@ to determine whether the failure was caused by the changes in the pull request. .. _interactive rebase: https://help.github.com/en/github/using-git/about-git-rebase -.. _GraphAr Dependencies: https://github.com/alibaba/GraphAr#dependencies +.. _GraphAr C++ Dependencies: https://github.com/alibaba/GraphAr/tree/main/cpp#system-setup + +.. _GraphAr Spark Dependencies: https://github.com/alibaba/GraphAr/tree/main/spark#system-setup .. _Contributor License Agreement: https://cla-assistant.io/alibaba/GraphAr diff --git a/README.rst b/README.rst index 936958f8f..2c07af384 100644 --- a/README.rst +++ b/README.rst @@ -111,77 +111,19 @@ Take the "person knows person" edges to illustrate. Suppose the vertex chunk siz |Edge Physical Table2| -Building the Libraries ----------------------- +Libraries +---------- -Libraries are available for C++ and Spark. - -Prerequisites -^^^^^^^^^^^^^^ - -Basic dependencies: - -- A modern C++ compiler compliant with C++17 standard (g++ >= 7.1 or clang++ >= 5). -- `CMake `_ (>=2.8) - -Dependencies for optional features: - -- `Doxygen `_ (>= 1.8) for generating documentation; -- `sphinx `_ for generating documentation. - -Extra dependencies are required by examples: - -- `BGL `_ (>= 1.58). - - -Building -^^^^^^^^^ - -Once the required dependencies have been installed, go to the root directory of GraphAr and do an out-of-source build using CMake. - -.. code-block:: shell - - git submodule update --init - mkdir build && cd build - cmake .. - make -j$(nproc) - -**Optional**: Using a Custom Namespace - -The :code:`namespace` is configurable. 
By default, -it is defined in :code:`namespace GraphArchive`; however this can be toggled by -setting :code:`NAMESPACE` option with cmake: - -.. code:: shell - - mkdir build - cd build - cmake .. -DNAMESPACE=MyNamespace - make -j$(nproc) - -Run the test with command: - -.. code-block:: shell - - make test - -Install the GraphAr library: - -.. code-block:: shell - - sudo make install - -Optionally, you can build the documentation for GraphAr library: - -.. code-block:: shell - - # assume doxygen and sphinx has been installed. - pip3 install -r ../requirements-dev.txt --user - make doc +Libraries are provided for reading, writing and transforming files in GraphAr, +now the C++ library and the Spark library are available. And we are going to +provide libraries for more programming languages. +The C++ Library +^^^^^^^^^^^^^^^ +See `GraphAr C++ Library`_ for details about the building of the C++ library. The Spark Library ------------------ +^^^^^^^^^^^^^^^^^ See `GraphAr Spark Library`_ for details about the Spark library. @@ -239,7 +181,9 @@ third-party libraries may not have the same license as GraphAr. .. _GraphAr File Format: https://alibaba.github.io/GraphAr/user-guide/file-format.html -.. _GraphAr Spark Library: https://alibaba.github.io/GraphAr/user-guide/spark-lib.html +.. _GraphAr Spark Library: https://github.com/alibaba/GraphAr/tree/main/spark + +.. _GraphAr C++ Library: https://github.com/alibaba/GraphAr/tree/main/cpp .. 
_example files: https://github.com/GraphScope/gar-test/blob/main/ldbc_sample/ diff --git a/cpp/.gitignore b/cpp/.gitignore new file mode 100644 index 000000000..4f3221962 --- /dev/null +++ b/cpp/.gitignore @@ -0,0 +1,5 @@ +/build/ +/examples/*/build/ + +apidoc/html +apidoc/xml diff --git a/CMakeLists.txt b/cpp/CMakeLists.txt similarity index 94% rename from CMakeLists.txt rename to cpp/CMakeLists.txt index a9e8deff4..e174e63ea 100644 --- a/CMakeLists.txt +++ b/cpp/CMakeLists.txt @@ -24,7 +24,7 @@ project(graph-archive LANGUAGES C CXX VERSION ${GAR_VERSION}) # ------------------------------------------------------------------------------ option(NAMESPACE "User specific namespace, default if GraphArchive" OFF) -option(BUILD_TESTS "Build unit test" OFF) +option(BUILD_TESTS "Build unit tests" OFF) option(BUILD_EXAMPLES "Build examples" OFF) if (NAMESPACE) @@ -329,36 +329,28 @@ file(GLOB_RECURSE FILES_NEED_LINT "include/gar/*.h" "src/*.cc" "test/*.h" "test/*.cc" ) -add_custom_target(clformat +add_custom_target(gar-clformat COMMAND clang-format --style=file -i ${FILES_NEED_FORMAT} COMMENT "Running clang-format." VERBATIM) -add_custom_target(cpplint - COMMAND ${PROJECT_SOURCE_DIR}/misc/cpplint.py --root=include ${FILES_NEED_LINT} +add_custom_target(gar-cpplint + COMMAND ${PROJECT_SOURCE_DIR}/misc/cpplint.py --root=${PROJECT_SOURCE_DIR}/include ${FILES_NEED_LINT} COMMENT "Running cpplint check." 
VERBATIM) # ------------------------------------------------------------------------------ -# build docs +# build cpp api doc # ------------------------------------------------------------------------------ find_program(doxygen_EXECUTABLE doxygen NO_CMAKE_SYSTEM_PATH) -find_program(sphinx_build_EXECUTABLE sphinx-build NO_CMAKE_SYSTEM_PATH) -if(doxygen_EXECUTABLE AND sphinx_build_EXECUTABLE) - add_custom_target(doc - COMMAND ${CMAKE_COMMAND} -E make_directory _build +if(doxygen_EXECUTABLE) + add_custom_target(gar-cpp-doc COMMAND ${doxygen_EXECUTABLE} - COMMAND ${sphinx_build_EXECUTABLE} . _build/html - COMMAND cd ../spark && mvn scala:doc && rm -fr ${PROJECT_SOURCE_DIR}/docs/_build/html/reference/spark-api && - cp -fr target/site/scaladocs ${PROJECT_SOURCE_DIR}/docs/_build/html/reference/spark-api - WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/docs + WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/apidoc VERBATIM ) else() if(NOT doxygen_EXECUTABLE) message(STATUS "Cannot find the doxygen executable.") endif() - if(NOT sphinx_build_EXECUTABLE) - message(STATUS "Cannot find the sphinx-build executable.") - endif() endif() diff --git a/cpp/README.rst b/cpp/README.rst new file mode 100644 index 000000000..950e86b32 --- /dev/null +++ b/cpp/README.rst @@ -0,0 +1,114 @@ +GraphAr C++ +============ +This directory contains the code and build system for the GraphAr C++ library. + + +Building GraphAr C++ +-------------------- + +System setup +^^^^^^^^^^^^ + +GraphAr C++ uses CMake as a build configuration system. We recommend building +out-of-source. If you are not familiar with this terminology: + +* **In-source build**: ``cmake`` is invoked directly from the ``cpp`` + directory. This can be inflexible when you wish to maintain multiple build + environments (e.g. one for debug builds and another for release builds) +* **Out-of-source build**: ``cmake`` is invoked from another directory, + creating an isolated build environment that does not interact with any other + build environment. 
For example, you could create ``cpp/build-debug`` and + invoke ``cmake $CMAKE_ARGS ..`` from this directory + +Building requires: + +* A C++17-enabled compiler. On Linux, gcc 7.1 and higher should be + sufficient. For MacOS, at least clang 5 is required +* CMake 3.5 or higher +* On Linux and macOS, ``make`` build utilities + +Dependencies for optional features: + +* `Doxygen `_ (>= 1.8) for generating documentation + +Extra dependencies are required by examples: + +* `BGL `_ (>= 1.58) + + +Building +^^^^^^^^^ + +All the instructions below assume that you have cloned the GraphAr git +repository and navigated to the ``cpp`` subdirectory: + +.. code-block:: + + $ git clone https://github.com/alibaba/GraphAr.git + $ git submodule update --init + $ cd GraphAr/cpp + +Release build: + +.. code-block:: + + $ mkdir build-release + $ cd build-release + $ cmake .. + $ make -j8 # if you have 8 CPU cores, otherwise adjust + +Build with a custom namespace: + +The :code:`namespace` is configurable. By default, +it is defined in :code:`namespace GraphArchive`; however this can be toggled by +setting :code:`NAMESPACE` option with cmake: + +.. code:: shell + + $ mkdir build + $ cd build + $ cmake .. -DNAMESPACE=MyNamespace + $ make -j8 # if you have 8 CPU cores, otherwise adjust + +Debug build with unit tests: + +.. code-block:: shell + + $ export GAR_TEST_DATA=$PWD/../testing/ + $ mkdir build-debug + $ cd build-debug + $ cmake -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON .. + $ make -j8 # if you have 8 CPU cores, otherwise adjust + $ make test # to run the tests + +Build with examples: + +.. code-block:: shell + + $ export GAR_TEST_DATA=$PWD/../testing/ + $ mkdir build-examples + $ cd build-examples + $ cmake -DBUILD_EXAMPLES=ON .. + $ make -j8 # if you have 8 CPU cores, otherwise adjust + +Install +^^^^^^^^^ + +After the building, you can install the GraphAr C++ library with: + +.. 
code-block:: shell + + $ sudo make install + +Generate API document +^^^^^^^^^^^^^^^^^^^^^ + +Building the API document with Doxygen: + +.. code-block:: shell + + $ pushd apidoc + $ doxygen + $ popd + +The API document is generated in the directory ``cpp/apidoc/html``. diff --git a/cpp/apidoc/.gitignore b/cpp/apidoc/.gitignore new file mode 100644 index 000000000..cebc3fb97 --- /dev/null +++ b/cpp/apidoc/.gitignore @@ -0,0 +1,2 @@ +html +xml diff --git a/docs/Doxyfile b/cpp/apidoc/Doxyfile similarity index 99% rename from docs/Doxyfile rename to cpp/apidoc/Doxyfile index f099e8368..e515f33f9 100644 --- a/docs/Doxyfile +++ b/cpp/apidoc/Doxyfile @@ -58,7 +58,7 @@ PROJECT_LOGO = # entered, it will be relative to the location where doxygen was started. If # left blank the current directory will be used. -OUTPUT_DIRECTORY = _build/doxygen +OUTPUT_DIRECTORY = $(OUTPUT_DIRECTORY) # If the CREATE_SUBDIRS tag is set to YES then doxygen will create 4096 sub- # directories (in 2 levels) under the output directory of each output format and diff --git a/cmake/apache-arrow.cmake b/cpp/cmake/apache-arrow.cmake similarity index 98% rename from cmake/apache-arrow.cmake rename to cpp/cmake/apache-arrow.cmake index b297f1dc5..48ace5f70 100644 --- a/cmake/apache-arrow.cmake +++ b/cpp/cmake/apache-arrow.cmake @@ -84,7 +84,7 @@ function(build_arrow) include(ExternalProject) externalproject_add(arrow_ep - URL https://www.apache.org/dyn/closer.lua?action=download&filename=arrow/arrow-9.0.0/apache-arrow-9.0.0.tar.gz + URL https://www.apache.org/dyn/closer.lua?action=download&filename=arrow/arrow-11.0.0/apache-arrow-11.0.0.tar.gz SOURCE_SUBDIR cpp BINARY_DIR "${GAR_ARROW_BINARY_DIR}" CMAKE_ARGS "${GAR_ARROW_CMAKE_ARGS}" diff --git a/examples/bfs_father_example.cc b/cpp/examples/bfs_father_example.cc similarity index 100% rename from examples/bfs_father_example.cc rename to cpp/examples/bfs_father_example.cc diff --git a/examples/bfs_pull_example.cc b/cpp/examples/bfs_pull_example.cc 
similarity index 100% rename from examples/bfs_pull_example.cc rename to cpp/examples/bfs_pull_example.cc diff --git a/examples/bfs_push_example.cc b/cpp/examples/bfs_push_example.cc similarity index 100% rename from examples/bfs_push_example.cc rename to cpp/examples/bfs_push_example.cc diff --git a/examples/bfs_stream_example.cc b/cpp/examples/bfs_stream_example.cc similarity index 100% rename from examples/bfs_stream_example.cc rename to cpp/examples/bfs_stream_example.cc diff --git a/examples/bgl_example.cc b/cpp/examples/bgl_example.cc similarity index 100% rename from examples/bgl_example.cc rename to cpp/examples/bgl_example.cc diff --git a/examples/cc_push_example.cc b/cpp/examples/cc_push_example.cc similarity index 100% rename from examples/cc_push_example.cc rename to cpp/examples/cc_push_example.cc diff --git a/examples/cc_stream_example.cc b/cpp/examples/cc_stream_example.cc similarity index 100% rename from examples/cc_stream_example.cc rename to cpp/examples/cc_stream_example.cc diff --git a/examples/config.h b/cpp/examples/config.h similarity index 100% rename from examples/config.h rename to cpp/examples/config.h diff --git a/examples/construct_info_example.cc b/cpp/examples/construct_info_example.cc similarity index 100% rename from examples/construct_info_example.cc rename to cpp/examples/construct_info_example.cc diff --git a/examples/pagerank_example.cc b/cpp/examples/pagerank_example.cc similarity index 100% rename from examples/pagerank_example.cc rename to cpp/examples/pagerank_example.cc diff --git a/gar-config-version.in.cmake b/cpp/gar-config-version.in.cmake similarity index 100% rename from gar-config-version.in.cmake rename to cpp/gar-config-version.in.cmake diff --git a/gar-config.in.cmake b/cpp/gar-config.in.cmake similarity index 100% rename from gar-config.in.cmake rename to cpp/gar-config.in.cmake diff --git a/include/gar/external/result.hpp b/cpp/include/gar/external/result.hpp similarity index 100% rename from 
include/gar/external/result.hpp rename to cpp/include/gar/external/result.hpp diff --git a/include/gar/graph.h b/cpp/include/gar/graph.h similarity index 100% rename from include/gar/graph.h rename to cpp/include/gar/graph.h diff --git a/include/gar/graph_info.h b/cpp/include/gar/graph_info.h similarity index 100% rename from include/gar/graph_info.h rename to cpp/include/gar/graph_info.h diff --git a/include/gar/reader/arrow_chunk_reader.h b/cpp/include/gar/reader/arrow_chunk_reader.h similarity index 100% rename from include/gar/reader/arrow_chunk_reader.h rename to cpp/include/gar/reader/arrow_chunk_reader.h diff --git a/include/gar/reader/chunk_info_reader.h b/cpp/include/gar/reader/chunk_info_reader.h similarity index 100% rename from include/gar/reader/chunk_info_reader.h rename to cpp/include/gar/reader/chunk_info_reader.h diff --git a/include/gar/utils/adj_list_type.h b/cpp/include/gar/utils/adj_list_type.h similarity index 100% rename from include/gar/utils/adj_list_type.h rename to cpp/include/gar/utils/adj_list_type.h diff --git a/include/gar/utils/convert_to_arrow_type.h b/cpp/include/gar/utils/convert_to_arrow_type.h similarity index 100% rename from include/gar/utils/convert_to_arrow_type.h rename to cpp/include/gar/utils/convert_to_arrow_type.h diff --git a/include/gar/utils/data_type.h b/cpp/include/gar/utils/data_type.h similarity index 100% rename from include/gar/utils/data_type.h rename to cpp/include/gar/utils/data_type.h diff --git a/include/gar/utils/file_type.h b/cpp/include/gar/utils/file_type.h similarity index 100% rename from include/gar/utils/file_type.h rename to cpp/include/gar/utils/file_type.h diff --git a/include/gar/utils/filesystem.h b/cpp/include/gar/utils/filesystem.h similarity index 100% rename from include/gar/utils/filesystem.h rename to cpp/include/gar/utils/filesystem.h diff --git a/include/gar/utils/general_params.h b/cpp/include/gar/utils/general_params.h similarity index 100% rename from 
include/gar/utils/general_params.h rename to cpp/include/gar/utils/general_params.h diff --git a/include/gar/utils/macros.h b/cpp/include/gar/utils/macros.h similarity index 100% rename from include/gar/utils/macros.h rename to cpp/include/gar/utils/macros.h diff --git a/include/gar/utils/reader_utils.h b/cpp/include/gar/utils/reader_utils.h similarity index 100% rename from include/gar/utils/reader_utils.h rename to cpp/include/gar/utils/reader_utils.h diff --git a/include/gar/utils/result.h b/cpp/include/gar/utils/result.h similarity index 100% rename from include/gar/utils/result.h rename to cpp/include/gar/utils/result.h diff --git a/include/gar/utils/status.h b/cpp/include/gar/utils/status.h similarity index 100% rename from include/gar/utils/status.h rename to cpp/include/gar/utils/status.h diff --git a/include/gar/utils/utils.h b/cpp/include/gar/utils/utils.h similarity index 100% rename from include/gar/utils/utils.h rename to cpp/include/gar/utils/utils.h diff --git a/include/gar/utils/version_parser.h b/cpp/include/gar/utils/version_parser.h similarity index 100% rename from include/gar/utils/version_parser.h rename to cpp/include/gar/utils/version_parser.h diff --git a/include/gar/utils/yaml.h b/cpp/include/gar/utils/yaml.h similarity index 100% rename from include/gar/utils/yaml.h rename to cpp/include/gar/utils/yaml.h diff --git a/include/gar/writer/arrow_chunk_writer.h b/cpp/include/gar/writer/arrow_chunk_writer.h similarity index 100% rename from include/gar/writer/arrow_chunk_writer.h rename to cpp/include/gar/writer/arrow_chunk_writer.h diff --git a/include/gar/writer/edges_builder.h b/cpp/include/gar/writer/edges_builder.h similarity index 100% rename from include/gar/writer/edges_builder.h rename to cpp/include/gar/writer/edges_builder.h diff --git a/include/gar/writer/vertices_builder.h b/cpp/include/gar/writer/vertices_builder.h similarity index 100% rename from include/gar/writer/vertices_builder.h rename to 
cpp/include/gar/writer/vertices_builder.h diff --git a/misc/cpplint.py b/cpp/misc/cpplint.py similarity index 100% rename from misc/cpplint.py rename to cpp/misc/cpplint.py diff --git a/src/arrow_chunk_reader.cc b/cpp/src/arrow_chunk_reader.cc similarity index 100% rename from src/arrow_chunk_reader.cc rename to cpp/src/arrow_chunk_reader.cc diff --git a/src/arrow_chunk_writer.cc b/cpp/src/arrow_chunk_writer.cc similarity index 100% rename from src/arrow_chunk_writer.cc rename to cpp/src/arrow_chunk_writer.cc diff --git a/src/chunk_info_reader.cc b/cpp/src/chunk_info_reader.cc similarity index 100% rename from src/chunk_info_reader.cc rename to cpp/src/chunk_info_reader.cc diff --git a/src/data_type.cc b/cpp/src/data_type.cc similarity index 100% rename from src/data_type.cc rename to cpp/src/data_type.cc diff --git a/src/edges_builder.cc b/cpp/src/edges_builder.cc similarity index 100% rename from src/edges_builder.cc rename to cpp/src/edges_builder.cc diff --git a/src/filesystem.cc b/cpp/src/filesystem.cc similarity index 100% rename from src/filesystem.cc rename to cpp/src/filesystem.cc diff --git a/src/graph.cc b/cpp/src/graph.cc similarity index 100% rename from src/graph.cc rename to cpp/src/graph.cc diff --git a/src/graph_info.cc b/cpp/src/graph_info.cc similarity index 100% rename from src/graph_info.cc rename to cpp/src/graph_info.cc diff --git a/src/reader_utils.cc b/cpp/src/reader_utils.cc similarity index 100% rename from src/reader_utils.cc rename to cpp/src/reader_utils.cc diff --git a/src/utils.cc b/cpp/src/utils.cc similarity index 100% rename from src/utils.cc rename to cpp/src/utils.cc diff --git a/src/version_parser.cc b/cpp/src/version_parser.cc similarity index 100% rename from src/version_parser.cc rename to cpp/src/version_parser.cc diff --git a/src/vertices_builder.cc b/cpp/src/vertices_builder.cc similarity index 100% rename from src/vertices_builder.cc rename to cpp/src/vertices_builder.cc diff --git a/src/yaml.cc b/cpp/src/yaml.cc 
similarity index 100% rename from src/yaml.cc rename to cpp/src/yaml.cc diff --git a/test/test_arrow_chunk_reader.cc b/cpp/test/test_arrow_chunk_reader.cc similarity index 92% rename from test/test_arrow_chunk_reader.cc rename to cpp/test/test_arrow_chunk_reader.cc index 723dd7646..39f21f1ef 100644 --- a/test/test_arrow_chunk_reader.cc +++ b/cpp/test/test_arrow_chunk_reader.cc @@ -24,7 +24,7 @@ limitations under the License. #include "arrow/util/uri.h" #include "parquet/arrow/writer.h" -#include "./config.h" +#include "./util.h" #include "gar/reader/arrow_chunk_reader.h" #include "gar/writer/arrow_chunk_writer.h" @@ -32,9 +32,11 @@ limitations under the License. #include TEST_CASE("test_vertex_property_arrow_chunk_reader") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + // read file and construct graph info - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); @@ -89,9 +91,11 @@ TEST_CASE("test_vertex_property_arrow_chunk_reader") { } TEST_CASE("test_adj_list_arrow_chunk_reader") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + // read file and construct graph info - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); @@ -142,8 +146,10 @@ TEST_CASE("test_adj_list_arrow_chunk_reader") { } TEST_CASE("test_adj_list_property_arrow_chunk_reader") { - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + + std::string path = root + 
"/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); @@ -196,9 +202,11 @@ TEST_CASE("test_adj_list_property_arrow_chunk_reader") { } TEST_CASE("test_read_adj_list_offset_chunk_example") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + // read file and construct graph info - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); diff --git a/test/test_arrow_chunk_writer.cc b/cpp/test/test_arrow_chunk_writer.cc similarity index 91% rename from test/test_arrow_chunk_writer.cc rename to cpp/test/test_arrow_chunk_writer.cc index 9e0295ff4..088b7bd73 100644 --- a/test/test_arrow_chunk_writer.cc +++ b/cpp/test/test_arrow_chunk_writer.cc @@ -27,7 +27,7 @@ limitations under the License. #include "parquet/arrow/reader.h" #include "parquet/arrow/writer.h" -#include "./config.h" +#include "./util.h" #include "gar/graph_info.h" #include "gar/writer/arrow_chunk_writer.h" @@ -35,8 +35,11 @@ limitations under the License. 
#include TEST_CASE("test_vertex_property_wrtier_from_file") { - REQUIRE(!TEST_DATA_DIR.empty()); - std::string path = TEST_DATA_DIR + "/ldbc_sample/person_0_0.csv"; + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + + REQUIRE(!root.empty()); + std::string path = root + "/ldbc_sample/person_0_0.csv"; arrow::io::IOContext io_context = arrow::io::default_io_context(); auto fs = arrow::fs::FileSystemFromUriOrPath(path).ValueOrDie(); @@ -61,7 +64,7 @@ TEST_CASE("test_vertex_property_wrtier_from_file") { std::cout << table->num_rows() << ' ' << table->num_columns() << std::endl; std::string vertex_meta_file = - TEST_DATA_DIR + "/ldbc_sample/parquet/" + "person.vertex.yml"; + root + "/ldbc_sample/parquet/" + "person.vertex.yml"; auto vertex_meta = GAR_NAMESPACE::Yaml::LoadFile(vertex_meta_file).value(); auto vertex_info = GAR_NAMESPACE::VertexInfo::Load(vertex_meta).value(); REQUIRE(vertex_info.GetLabel() == "person"); @@ -76,11 +79,14 @@ TEST_CASE("test_vertex_property_wrtier_from_file") { } TEST_CASE("test_orc_and_parquet_reader") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + arrow::Status st; arrow::MemoryPool* pool = arrow::default_memory_pool(); - std::string path1 = TEST_DATA_DIR + "/ldbc_sample/orc" + + std::string path1 = root + "/ldbc_sample/orc" + "/vertex/person/firstName_lastName_gender/chunk1"; - std::string path2 = TEST_DATA_DIR + "/ldbc_sample/parquet" + + std::string path2 = root + "/ldbc_sample/parquet" + "/vertex/person/firstName_lastName_gender/chunk1"; arrow::io::IOContext io_context = arrow::io::default_io_context(); @@ -115,9 +121,12 @@ TEST_CASE("test_orc_and_parquet_reader") { } TEST_CASE("test_edge_chunk_writer") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + arrow::Status st; arrow::MemoryPool* pool = arrow::default_memory_pool(); - std::string path = TEST_DATA_DIR + + std::string path = root + "/ldbc_sample/parquet/edge/person_knows_person/" "unordered_by_source/adj_list/part0/chunk0"; 
auto fs = arrow::fs::FileSystemFromUriOrPath(path).ValueOrDie(); @@ -140,7 +149,7 @@ TEST_CASE("test_edge_chunk_writer") { // Write edges of vertex chunk 0 to files std::string edge_meta_file = - TEST_DATA_DIR + "/ldbc_sample/csv/" + "person_knows_person.edge.yml"; + root + "/ldbc_sample/csv/" + "person_knows_person.edge.yml"; auto edge_meta = GAR_NAMESPACE::Yaml::LoadFile(edge_meta_file).value(); auto edge_info = GAR_NAMESPACE::EdgeInfo::Load(edge_meta).value(); GAR_NAMESPACE::EdgeChunkWriter writer( diff --git a/test/test_builder.cc b/cpp/test/test_builder.cc similarity index 88% rename from test/test_builder.cc rename to cpp/test/test_builder.cc index 296429740..ac33e83e7 100644 --- a/test/test_builder.cc +++ b/cpp/test/test_builder.cc @@ -28,7 +28,7 @@ limitations under the License. #include "arrow/util/uri.h" #include "parquet/arrow/writer.h" -#include "./config.h" +#include "./util.h" #include "gar/graph_info.h" #include "gar/writer/arrow_chunk_writer.h" #include "gar/writer/edges_builder.h" @@ -38,15 +38,18 @@ limitations under the License. 
#include TEST_CASE("test_vertices_builder") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + std::string vertex_meta_file = - TEST_DATA_DIR + "/ldbc_sample/parquet/" + "person.vertex.yml"; + root + "/ldbc_sample/parquet/" + "person.vertex.yml"; auto vertex_meta = GAR_NAMESPACE::Yaml::LoadFile(vertex_meta_file).value(); auto vertex_info = GAR_NAMESPACE::VertexInfo::Load(vertex_meta).value(); GAR_NAMESPACE::IdType start_index = 0; GAR_NAMESPACE::builder::VerticesBuilder builder(vertex_info, "/tmp/", start_index); - std::ifstream fp(TEST_DATA_DIR + "/ldbc_sample/person_0_0.csv"); + std::ifstream fp(root + "/ldbc_sample/person_0_0.csv"); std::string line; getline(fp, line); int m = 4; @@ -78,7 +81,7 @@ TEST_CASE("test_vertices_builder") { } REQUIRE(builder.Dump().ok()); - auto fs = arrow::fs::FileSystemFromUriOrPath(TEST_DATA_DIR).ValueOrDie(); + auto fs = arrow::fs::FileSystemFromUriOrPath(root).ValueOrDie(); auto input = fs->OpenInputStream("/tmp/vertex/person/vertex_count").ValueOrDie(); auto num = input->Read(sizeof(GAR_NAMESPACE::IdType)).ValueOrDie(); @@ -87,14 +90,17 @@ TEST_CASE("test_vertices_builder") { } TEST_CASE("test_edges_builder") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + std::string edge_meta_file = - TEST_DATA_DIR + "/ldbc_sample/parquet/" + "person_knows_person.edge.yml"; + root + "/ldbc_sample/parquet/" + "person_knows_person.edge.yml"; auto edge_meta = GAR_NAMESPACE::Yaml::LoadFile(edge_meta_file).value(); auto edge_info = GAR_NAMESPACE::EdgeInfo::Load(edge_meta).value(); GAR_NAMESPACE::builder::EdgesBuilder builder( edge_info, "/tmp/", GraphArchive::AdjListType::ordered_by_dest, 903); - std::ifstream fp(TEST_DATA_DIR + "/ldbc_sample/person_knows_person_0_0.csv"); + std::ifstream fp(root + "/ldbc_sample/person_knows_person_0_0.csv"); std::string line; getline(fp, line); std::vector names; diff --git a/test/test_chunk_info_reader.cc b/cpp/test/test_chunk_info_reader.cc similarity index 89% rename from 
test/test_chunk_info_reader.cc rename to cpp/test/test_chunk_info_reader.cc index a01373c08..7762e4abc 100644 --- a/test/test_chunk_info_reader.cc +++ b/cpp/test/test_chunk_info_reader.cc @@ -15,16 +15,18 @@ limitations under the License. #include -#include "./config.h" +#include "./util.h" #include "gar/reader/chunk_info_reader.h" #define CATCH_CONFIG_MAIN #include TEST_CASE("test_vertex_property_chunk_info_reader") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + // read file and construct graph info - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); @@ -46,25 +48,21 @@ TEST_CASE("test_vertex_property_chunk_info_reader") { auto maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); std::string chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == - TEST_DATA_DIR + "/ldbc_sample/parquet/vertex/person/id/chunk0"); + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/vertex/person/id/chunk0"); REQUIRE(reader.seek(520).ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == - TEST_DATA_DIR + "/ldbc_sample/parquet/vertex/person/id/chunk5"); + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/vertex/person/id/chunk5"); REQUIRE(reader.next_chunk().ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == - TEST_DATA_DIR + "/ldbc_sample/parquet/vertex/person/id/chunk6"); + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/vertex/person/id/chunk6"); REQUIRE(reader.seek(900).ok()); maybe_chunk_path = reader.GetChunk(); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == - 
TEST_DATA_DIR + "/ldbc_sample/parquet/vertex/person/id/chunk9"); + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/vertex/person/id/chunk9"); // now is end of the chunks REQUIRE(reader.next_chunk().IsOutOfRange()); @@ -77,9 +75,11 @@ TEST_CASE("test_vertex_property_chunk_info_reader") { } TEST_CASE("test_adj_list_chunk_info_reader") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + // read file and construct graph info - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); @@ -98,21 +98,21 @@ TEST_CASE("test_adj_list_chunk_info_reader") { auto maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); auto chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/adj_list/part0/chunk0"); REQUIRE(reader.seek(100).ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/adj_list/part0/chunk0"); REQUIRE(reader.next_chunk().ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/adj_list/part1/chunk0"); @@ -121,14 +121,14 @@ TEST_CASE("test_adj_list_chunk_info_reader") { maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + 
"/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/adj_list/part1/chunk0"); REQUIRE(reader.seek_src(900).ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/adj_list/part9/chunk0"); REQUIRE(reader.next_chunk().IsOutOfRange()); @@ -147,7 +147,7 @@ TEST_CASE("test_adj_list_chunk_info_reader") { maybe_chunk_path = dst_reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_dest/adj_list/part1/chunk0"); @@ -157,9 +157,11 @@ TEST_CASE("test_adj_list_chunk_info_reader") { } TEST_CASE("test_adj_list_property_chunk_info_reader") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + // read file and construct graph info - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); @@ -183,21 +185,21 @@ TEST_CASE("test_adj_list_property_chunk_info_reader") { auto maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); auto chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/creationDate/part0/chunk0"); REQUIRE(reader.seek(100).ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" 
"ordered_by_source/creationDate/part0/chunk0"); REQUIRE(reader.next_chunk().ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/creationDate/part1/chunk0"); @@ -206,14 +208,14 @@ TEST_CASE("test_adj_list_property_chunk_info_reader") { maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/creationDate/part1/chunk0"); REQUIRE(reader.seek_src(900).ok()); maybe_chunk_path = reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_source/creationDate/part9/chunk0"); REQUIRE(reader.next_chunk().IsOutOfRange()); @@ -238,7 +240,7 @@ TEST_CASE("test_adj_list_property_chunk_info_reader") { maybe_chunk_path = dst_reader.GetChunk(); REQUIRE(maybe_chunk_path.status().ok()); chunk_path = maybe_chunk_path.value(); - REQUIRE(chunk_path == TEST_DATA_DIR + + REQUIRE(chunk_path == root + "/ldbc_sample/parquet/edge/person_knows_person/" "ordered_by_dest/creationDate/part1/chunk0"); diff --git a/test/test_graph.cc b/cpp/test/test_graph.cc similarity index 91% rename from test/test_graph.cc rename to cpp/test/test_graph.cc index b96475af7..52e95a6ce 100644 --- a/test/test_graph.cc +++ b/cpp/test/test_graph.cc @@ -15,16 +15,18 @@ limitations under the License. 
#include -#include "./config.h" +#include "./util.h" #include "gar/graph.h" #define CATCH_CONFIG_MAIN #include TEST_CASE("test_vertices_collection") { + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + // read file and construct graph info - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; auto maybe_graph_info = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(maybe_graph_info.status().ok()); auto graph_info = maybe_graph_info.value(); @@ -50,8 +52,10 @@ TEST_CASE("test_vertices_collection") { } TEST_CASE("test_edges_collection", "[Slow]") { - std::string path = - TEST_DATA_DIR + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + + std::string path = root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"; std::string src_label = "person", edge_label = "knows", dst_label = "person"; auto graph_info = GAR_NAMESPACE::GraphInfo::Load(path).value(); diff --git a/test/test_info.cc b/cpp/test/test_info.cc similarity index 98% rename from test/test_info.cc rename to cpp/test/test_info.cc index 95ee5dca2..fcb27c365 100644 --- a/test/test_info.cc +++ b/cpp/test/test_info.cc @@ -18,7 +18,7 @@ limitations under the License. 
#include #include -#include "./config.h" +#include "./util.h" #include "gar/graph_info.h" #include "gar/utils/version_parser.h" @@ -318,12 +318,15 @@ TEST_CASE("test_info_version") { } TEST_CASE("test_graph_info_load_from_file") { - std::string path = TEST_DATA_DIR + "/ldbc_sample/csv/ldbc_sample.graph.yml"; + std::string root; + REQUIRE(GetTestResourceRoot(&root).ok()); + + std::string path = root + "/ldbc_sample/csv/ldbc_sample.graph.yml"; auto graph_info_result = GAR_NAMESPACE::GraphInfo::Load(path); REQUIRE(!graph_info_result.has_error()); auto graph_info = graph_info_result.value(); REQUIRE(graph_info.GetName() == "ldbc_sample"); - REQUIRE(graph_info.GetPrefix() == TEST_DATA_DIR + "/ldbc_sample/csv/"); + REQUIRE(graph_info.GetPrefix() == root + "/ldbc_sample/csv/"); const auto& vertex_infos = graph_info.GetVertexInfos(); const auto& edge_infos = graph_info.GetEdgeInfos(); REQUIRE(vertex_infos.size() == 1); diff --git a/test/config.h b/cpp/test/util.h similarity index 54% rename from test/config.h rename to cpp/test/util.h index 2c00d6de7..a4c407e1d 100644 --- a/test/config.h +++ b/cpp/test/util.h @@ -16,10 +16,21 @@ limitations under the License. 
#include #include -#ifndef TEST_CONFIG_H_ -#define TEST_CONFIG_H_ +#include "gar/utils/status.h" -static const std::string TEST_DATA_DIR = // NOLINT - std::filesystem::path(__FILE__).parent_path().string() + "/gar-test"; +#ifndef CPP_TEST_UTIL_H_ +#define CPP_TEST_UTIL_H_ -#endif // TEST_CONFIG_H_ +// Return the value of the GAR_TEST_DATA environment variable or return error +// Status +GAR_NAMESPACE::Status GetTestResourceRoot(std::string* out) { + const char* c_root = std::getenv("GAR_TEST_DATA"); + if (!c_root) { + return GAR_NAMESPACE::Status::IOError( + "Test resources not found, set GAR_TEST_DATA to /testing"); + } + *out = std::string(c_root); + return GAR_NAMESPACE::Status::OK(); +} + +#endif // CPP_TEST_UTIL_H_ diff --git a/thirdparty/Catch2 b/cpp/thirdparty/Catch2 similarity index 100% rename from thirdparty/Catch2 rename to cpp/thirdparty/Catch2 diff --git a/thirdparty/result.hpp b/cpp/thirdparty/result.hpp similarity index 100% rename from thirdparty/result.hpp rename to cpp/thirdparty/result.hpp diff --git a/thirdparty/yaml-cpp b/cpp/thirdparty/yaml-cpp similarity index 100% rename from thirdparty/yaml-cpp rename to cpp/thirdparty/yaml-cpp diff --git a/docs/.gitignore b/docs/.gitignore new file mode 100644 index 000000000..e35d8850c --- /dev/null +++ b/docs/.gitignore @@ -0,0 +1 @@ +_build diff --git a/docs/Makefile b/docs/Makefile index 746b49bc1..e4d6e7445 100644 --- a/docs/Makefile +++ b/docs/Makefile @@ -3,11 +3,17 @@ # You can set these variables from the command line, and also # from the environment for the first two. -SPHINXOPTS ?= -SPHINXBUILD ?= sphinx-build +# Do not fail the build if there are warnings +SPHINXOPTS = -j8 +SPHINXBUILD ?= sphinx-build SOURCEDIR = . BUILDDIR = _build +# Internal variables. +ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(SPHINXOPTS) $(SOURCEDIR) +DOXYGEN = doxygen +ROOTDIR = .. + # Put it first so that "make" without argument is like "make help". 
help: @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) @@ -19,6 +25,26 @@ help: %: Makefile @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) -doxygen: - @mkdir -p _build - @doxygen Doxyfile +.PHONY: clean +clean: + rm -rf $(BUILDDIR)/* + +.PHONY: cpp-apidoc +cpp-apidoc: + pushd $(ROOTDIR)/cpp/apidoc && \ + $(DOXYGEN) Doxyfile && \ + popd + +.PHONY: spark-apidoc +spark-apidoc: + pushd $(ROOTDIR)/spark && \ + mvn scala:doc && \ + popd + +.PHONY: html +html: cpp-apidoc spark-apidoc + $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html + rm -fr $(BUILDDIR)/html/reference/spark-api + cp -fr $(ROOTDIR)/spark/target/site/scaladocs $(BUILDDIR)/html/reference/spark-api + @echo + @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." diff --git a/docs/applications/bgl.rst b/docs/applications/bgl.rst index a06134c44..0658521b1 100644 --- a/docs/applications/bgl.rst +++ b/docs/applications/bgl.rst @@ -90,4 +90,4 @@ Finally, we could use a **VerticesBuilder** of GraphAr to write the results to n builder.Dump(); -.. _bgl_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/bgl_example.cc +.. _bgl_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bgl_example.cc diff --git a/docs/applications/out-of-core.rst b/docs/applications/out-of-core.rst index 91dda27c4..bd96ef8b9 100644 --- a/docs/applications/out-of-core.rst +++ b/docs/applications/out-of-core.rst @@ -16,7 +16,7 @@ PageRank `PageRank (PR) `_ is an algorithm used by Google Search to rank web pages in their search engine results. The source code of PageRank based on GraphAr located at `pagerank_example.cc`_, and the explanations can be found in the `Getting Started <../user-guide/getting-started.html#a-pagerank-example>`_ page. 
-Connected Components +Connected Components ------------------------ A weakly connected component is a maximal subgraph of a graph such that for every pair of vertices in it, there is an undirected path connecting them. And `Connected Components (CC) `_ is an algorithm to identify all weakly connected components in a graph. `CC based on BGL `_ is provided in GraphAr, also, we implement out-of-core algorithms for this application. @@ -35,7 +35,7 @@ This algorithm can be implemented based on streaming the edges via GraphAr's rea std::vector component(num_vertices); for (GraphArchive::IdType i = 0; i < num_vertices; i++) component[i] = i; - + // stream all edges for each iteration for (int iter = 0; ; iter++) { bool flag = false; @@ -56,9 +56,9 @@ This algorithm can be implemented based on streaming the edges via GraphAr's rea The file `cc_stream_example.cc`_ located inside the source tree contains the complete implementation for this algorithm. Also, we can only process active vertices (the vertices which are updated in the last iteration) and the corresponding edges for each iteration, since an inactive vertex does not need to update its neighbors. Please refer to `cc_push_example.cc`_ for the complete code. -.. tip:: +.. tip:: - In this example, two kinds of edges are used. The **ordered_by_source** edges are used to access all outgoing edges of an active vertex, and **ordered_by_dest** edges are used to access the incoming edges. In this way, all the neighbors of an active vertex can be accessed and processed. + In this example, two kinds of edges are used. The **ordered_by_source** edges are used to access all outgoing edges of an active vertex, and **ordered_by_dest** edges are used to access the incoming edges. In this way, all the neighbors of an active vertex can be accessed and processed. 
Although GraphAr supports to get the outgoing (incoming) edges of a single vertex for all adjList types, it is most efficient when the type is **ordered_by_source** (**ordered_by_dest**) since it can avoid to read redundant data. @@ -100,23 +100,23 @@ The above algorithm is implemented based on streaming all edges for each iterati Meanwhile, BFS could be implemented in a **push**-style which only traverses the edges that from active vertices for each iteration, which is typically more efficient on real-world graphs. This implementation can be found at `bfs_push_example.cc`_. Similarly, we provide a BFS implementation in a **pull**-style which only traverses the edges that lead to non-visited vertices (i.e., the vertices that have not been traversed), as shown in `bfs_pull_example.cc`_. -.. tip:: +.. tip:: In common cases of graph processing, the **push**-style is more efficient when the set of active vertices is very sparse, while the **pull**-style fits when it is dense. In some cases, it is required to record the path of BFS, that is, to maintain each vertex's predecessor (also called *father*) in the traversing tree rather than only recording the distance. The implementation of BFS with recording fathers can be found at `bfs_father_example.cc`_. -.. _pagerank_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/pagerank_example.cc +.. _pagerank_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/pagerank_example.cc -.. _cc_stream_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/cc_stream_example.cc +.. _cc_stream_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/cc_stream_example.cc -.. _cc_push_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/cc_push_example.cc +.. _cc_push_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/cc_push_example.cc -.. _bfs_stream_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/bfs_stream_example.cc +.. 
_bfs_stream_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_stream_example.cc -.. _bfs_push_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/bfs_push_example.cc +.. _bfs_push_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_push_example.cc -.. _bfs_pull_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/bfs_pull_example.cc +.. _bfs_pull_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_pull_example.cc -.. _bfs_father_example.cc: https://github.com/alibaba/GraphAr/blob/main/examples/bfs_father_example.cc +.. _bfs_father_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/bfs_father_example.cc diff --git a/docs/conf.py b/docs/conf.py index f607453e4..c70792671 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -45,7 +45,7 @@ # breathe breathe_projects = { - 'GraphAr': os.path.abspath('./_build/doxygen/xml'), + 'GraphAr': os.path.abspath('../cpp/apidoc/xml'), } breathe_default_project = 'GraphAr' breathe_debug_trace_directives = True diff --git a/requirements-dev.txt b/docs/requirements.txt similarity index 100% rename from requirements-dev.txt rename to docs/requirements.txt diff --git a/docs/user-guide/getting-started.rst b/docs/user-guide/getting-started.rst index 1878f5918..2d338a90a 100644 --- a/docs/user-guide/getting-started.rst +++ b/docs/user-guide/getting-started.rst @@ -194,6 +194,6 @@ Please refer to `more examples <../applications/out-of-core.html>`_ to learn abo .. _./edge/person_knows_person/ordered_by_source/offset/chunk0: https://github.com/GraphScope/gar-test/blob/main/ldbc_sample/csv/edge/person_knows_person/ordered_by_source/offset/chunk0 -.. _example program: https://github.com/alibaba/GraphAr/blob/main/examples/construct_info_example.cc +.. _example program: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/construct_info_example.cc -.. 
_pagerank_example.cc: https://github.com/alibaba/GraphAr/blob/main/cpp/examples/pagerank_example.cc diff --git a/spark/.gitignore b/spark/.gitignore new file mode 100644 index 000000000..eb5a316cb --- /dev/null +++ b/spark/.gitignore @@ -0,0 +1 @@ +target diff --git a/spark/README.rst b/spark/README.rst new file mode 100644 index 000000000..c5bdafe18 --- /dev/null +++ b/spark/README.rst @@ -0,0 +1,76 @@ +GraphAr Spark +============= +This directory contains the code and build system for the GraphAr Spark library. + + +Building GraphAr Spark +---------------------- + +System setup +^^^^^^^^^^^^^ + +GraphAr Spark uses Maven as its package build system. + +Building requires: + +* JDK 8 or higher +* Maven 3.2.0 or higher + +Building +^^^^^^^^^ + +All the instructions below assume that you have cloned the GraphAr git +repository and navigated to the ``spark`` subdirectory: + +.. code-block:: + + $ git clone https://github.com/alibaba/GraphAr.git + $ cd GraphAr/spark + $ git submodule update --init + +Build the package: + +.. code-block:: + + $ mvn clean package -DskipTests + +After compilation, the package file graphar-x.x.x-SNAPSHOT-shaded.jar is generated in the directory ``spark/target/``. + + +Build the package and run the unit tests: + +.. code-block:: + + $ mvn clean package + +Build and run the unit tests: + +.. code-block:: + + $ mvn clean test + +Build and run a specific unit test: + +.. code-block:: + + $ mvn clean test -Dsuites='com.alibaba.graphar.GraphInfoSuite' # run the GraphInfo test suite + $ mvn clean test -Dsuites='com.alibaba.graphar.GraphInfoSuite load graph info' # run the `load graph info` test of test suite + + +Generate API document +^^^^^^^^^^^^^^^^^^^^^ + +Building the API document with Maven: + +.. code-block:: shell + + $ mvn scala:doc + +The API document is generated in the directory ``spark/target/site/scaladocs``.
+ +How to use +^^^^^^^^^^^ + +Please refer to our `GraphAr Spark Library Documentation`_. + +.. _GraphAr Spark Library Documentation: https://alibaba.github.io/GraphAr/user-guide/spark-lib.html diff --git a/spark/src/test/resources/gar-test b/spark/src/test/resources/gar-test index 952da9d1f..1166084dd 120000 --- a/spark/src/test/resources/gar-test +++ b/spark/src/test/resources/gar-test @@ -1 +1 @@ -../../../../test/gar-test \ No newline at end of file +../../../../testing \ No newline at end of file diff --git a/test/gar-test b/testing similarity index 100% rename from test/gar-test rename to testing
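
Review note on cpp/test/util.h: the new GetTestResourceRoot helper is defined in a header without `inline`, which is fine while each test binary includes it exactly once but would cause duplicate-symbol link errors if two translation units in one binary included it. The block below is a minimal, self-contained sketch of the same environment-variable lookup, not the code in this patch. `SketchStatus` and `GetTestResourceRootSketch` are hypothetical names that stand in for `GAR_NAMESPACE::Status` and the real helper; treating an empty `GAR_TEST_DATA` as unset is an added assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-in for GAR_NAMESPACE::Status so the sketch compiles
// without the GraphAr headers.
struct SketchStatus {
  bool is_ok;
  std::string message;
  bool ok() const { return is_ok; }
};

// Reads GAR_TEST_DATA into *out. Marked `inline` so the definition can live
// in a header that several translation units include.
inline SketchStatus GetTestResourceRootSketch(std::string* out) {
  const char* c_root = std::getenv("GAR_TEST_DATA");
  // Assumption: an empty value is treated the same as an unset variable.
  if (c_root == nullptr || *c_root == '\0') {
    return {false, "Test resources not found, set GAR_TEST_DATA"};
  }
  *out = std::string(c_root);
  return {true, ""};
}
```

A test would call it the same way the patched tests call the real helper: declare `std::string root;`, require that the returned status is ok, then build paths such as `root + "/ldbc_sample/parquet/ldbc_sample.graph.yml"`.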