Commit 35f8a34

xhochy authored and wesm committed
ARROW-3834: [Doc] Merge C++ and Python documentation
Author: Uwe L. Korn <[email protected]>
Author: Korn, Uwe <[email protected]>
Author: Wes McKinney <[email protected]>

Closes apache#2856 from xhochy/doc-merge and squashes the following commits:

5b687ff <Wes McKinney> Add simple README for the format/ directory
071d16a <Uwe L. Korn> Move format specifications back to /format/
337088b <Uwe L. Korn> Review comments
fbe99c9 <Uwe L. Korn> Add more C++ docs
78a5eaf <Uwe L. Korn> Fix Python docs build
0b4dd33 <Uwe L. Korn> Rename doc to docs
918e762 <Uwe L. Korn> Convert format docs to reST
7aeff65 <Uwe L. Korn> Add doc generation to docker-compose
185cba8 <Uwe L. Korn> Add pre-commit check for RAT
671d244 <Uwe L. Korn> Fix references to format documents
bdd824c <Uwe L. Korn> Move doc to top-level
985d428 <Uwe L. Korn> Move Sphinx doc to top-level directory
f7d5e92 <Uwe L. Korn> Build C++ API docs
7850db8 <Uwe L. Korn> Add breathe as a requirement
d4cf542 <Uwe L. Korn> Fix linter issues
fd75660 <Korn, Uwe> Fix Sphinx build for sphinx>=1.8
9be6fbe <Korn, Uwe> Merge C++ and Python documentation
1 parent 187b98e commit 35f8a34

55 files changed (+1543 -1202 lines changed)

.dockerignore

+1 -1
@@ -17,6 +17,7 @@
 
 .git
 docker_cache
+docs/_build
 
 # IDE
 .vscode
@@ -49,7 +50,6 @@ python/dist
 python/*.egg-info
 python/*.egg
 python/*.pyc
-python/doc/_build
 __pycache__/
 */__pycache__/
 */*/__pycache__/

.gitignore

+5
@@ -15,6 +15,9 @@
 # specific language governing permissions and limitations
 # under the License.
 
+apache-rat-*.jar
+arrow-src.tar
+
 # Compiled source
 *.a
 *.dll
@@ -34,7 +37,9 @@ MANIFEST
 *.iml
 
 cpp/.idea/
+cpp/apidoc/xml/
 python/.eggs/
+python/doc/
 .vscode
 .idea/
 .pytest_cache/

.pre-commit-config.yaml

+8
@@ -21,6 +21,14 @@
 # To run all hooks on all files use `pre-commit run -a`
 
 repos:
+  - repo: local
+    hooks:
+      - id: rat
+        name: rat
+        language: system
+        entry: bash -c "git archive HEAD --prefix=apache-arrow/ --output=arrow-src.tar && ./dev/release/run-rat.sh arrow-src.tar"
+        always_run: true
+        pass_filenames: false
   - repo: git://github.com/pre-commit/pre-commit-hooks
     sha: v1.2.3
     hooks:

ci/conda_env_python.yml

+1
@@ -21,5 +21,6 @@ numpy
 pandas
 pytest
 python
+rsync
 setuptools
 setuptools_scm

ci/conda_env_sphinx.yml

+23
@@ -0,0 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Requirements for building the documentation
+breathe
+doxygen
+ipython
+sphinx
+sphinx_rtd_theme

ci/docker_build_sphinx.sh

+30
@@ -0,0 +1,30 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+set -ex
+
+pushd /arrow/cpp/apidoc
+doxygen
+popd
+
+pushd /arrow/python
+python setup.py build_sphinx -s ../docs/source --build-dir ../docs/_build
+popd
+
+mkdir -p /arrow/site/asf-site/docs/latest
+rsync -r /arrow/docs/_build/html/ /arrow/site/asf-site/docs/latest/

ci/travis_script_python.sh

+5 -6
@@ -61,11 +61,7 @@ conda install -y -q pip \
 
 if [ "$ARROW_TRAVIS_PYTHON_DOCS" == "1" ] && [ "$PYTHON_VERSION" == "3.6" ]; then
   # Install documentation dependencies
-  conda install -y -q \
-        ipython \
-        numpydoc \
-        sphinx=1.7.9 \
-        sphinx_rtd_theme
+  conda install -y -c conda-forge --file ci/conda_env_sphinx.yml
 fi
 
 # ARROW-2093: PyTorch increases the size of our conda dependency stack
@@ -190,7 +186,10 @@ if [ "$ARROW_TRAVIS_COVERAGE" == "1" ]; then
 fi
 
 if [ "$ARROW_TRAVIS_PYTHON_DOCS" == "1" ] && [ "$PYTHON_VERSION" == "3.6" ]; then
-  cd doc
+  pushd ../cpp/apidoc
+  doxygen
+  popd
+  cd ../docs
   sphinx-build -q -b html -d _build/doctrees -W source _build/html
 fi
 

cpp/apidoc/Doxyfile

+1 -1
@@ -1919,7 +1919,7 @@ MAN_LINKS = NO
 # captures the structure of the code including all documentation.
 # The default value is: NO.
 
-GENERATE_XML = NO
+GENERATE_XML = YES
 
 # The XML_OUTPUT tag is used to specify where the XML pages will be put. If a
 # relative path is entered the value of OUTPUT_DIRECTORY will be put in front of

cpp/apidoc/index.md

-57
@@ -41,60 +41,3 @@ Table of Contents
 * [Convert a vector of row-wise data into an Arrow table](tutorials/row_wise_conversion.md)
 * [Using the Plasma In-Memory Object Store](tutorials/plasma.md)
 * [Use Plasma to Access Tensors from C++ in Python](tutorials/tensor_to_py.md)
-
-Getting Started
----------------
-
-The most basic structure in Arrow is an `arrow::Array`. It holds a sequence
-of values with known length all having the same type. It consists of the data
-itself and an additional bitmap that indicates if the corresponding entry of
-array is a null-value. Note that for array with zero null entries, we can omit
-this bitmap.
-
-As Arrow objects are immutable, there are classes provided that should help you
-build these objects. To build an array of `int64_t` elements, we can use the
-`arrow::Int64Builder`. In the following example, we build an array of the range
-1 to 8 where the element that should hold the number 4 is nulled.
-
-    Int64Builder builder;
-    builder.Append(1);
-    builder.Append(2);
-    builder.Append(3);
-    builder.AppendNull();
-    builder.Append(5);
-    builder.Append(6);
-    builder.Append(7);
-    builder.Append(8);
-
-    std::shared_ptr<Array> array;
-    builder.Finish(&array);
-
-The resulting Array (which can be casted to `arrow::Int64Array` if you want
-to access its values) then consists of two `arrow::Buffer`. The first one is
-the null bitmap holding a single byte with the bits `0|0|0|0|1|0|0|0`.
-As we use [least-significant bit (LSB) numbering](https://en.wikipedia.org/wiki/Bit_numbering)
-this indicates that the fourth entry in the array is null. The second
-buffer is simply an `int64_t` array containing all the above values.
-As the fourth entry is null, the value at that position in the buffer is
-undefined.
-
-    // Cast the Array to its actual type to access its data
-    std::shared_ptr<Int64Array> int64_array = std::static_pointer_cast<Int64Array>(array);
-
-    // Get the pointer to the null bitmap.
-    const uint8_t* null_bitmap = int64_array->null_bitmap_data();
-
-    // Get the pointer to the actual data
-    const int64_t* data = int64_array->raw_values();
-
-In the above example, we have yet skipped explaining two things in the code.
-On constructing the builder, we have passed `arrow::int64()` to it. This is
-the type information with which the resulting array will be annotated. In
-this simple form, it is solely a `std::shared_ptr<arrow::Int64Type>`
-instantiation.
-
-Furthermore, we have passed `arrow::default_memory_pool()` to the constructor.
-This `arrow::MemoryPool` is used for the allocations of heap memory. Besides
-tracking the amount of memory allocated, the allocator also ensures that the
-allocated memory regions are 64-byte aligned (as required by the Arrow
-specification).
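
Note on the removed Getting Started text: the least-significant-bit numbering it describes can be illustrated without Arrow itself. The following is a minimal sketch, not Arrow code; `PackValidity` and `IsValid` are hypothetical helper names introduced only for illustration. It assumes the Arrow format's validity convention, in which a set bit marks a non-null slot, so the null fourth entry of the eight-element example appears as the single cleared bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Arrow): packs per-slot validity flags into
// a bitmap using least-significant-bit numbering, so slot i is stored in
// byte i / 8 at bit position i % 8. A set bit marks a valid (non-null) slot.
std::vector<uint8_t> PackValidity(const std::vector<bool>& valid) {
  std::vector<uint8_t> bitmap((valid.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < valid.size(); ++i) {
    if (valid[i]) {
      bitmap[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return bitmap;
}

// Hypothetical helper: reads the validity flag of slot i back out of a bitmap.
bool IsValid(const std::vector<uint8_t>& bitmap, std::size_t i) {
  assert(i / 8 < bitmap.size());
  return ((bitmap[i / 8] >> (i % 8)) & 1u) != 0;
}
```

For the eight values in the removed example, with only the fourth slot null, `PackValidity` produces one byte, and `IsValid(bitmap, 3)` is the only lookup that returns false.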

cpp/src/arrow/array.h

+1
@@ -397,6 +397,7 @@ class ARROW_EXPORT PrimitiveArray : public FlatArray {
   const uint8_t* raw_values_;
 };
 
+/// Concrete Array class for numeric data.
 template <typename TYPE>
 class ARROW_EXPORT NumericArray : public PrimitiveArray {
  public:

dev/gen_apidocs/create_documents.sh

-9
@@ -87,15 +87,6 @@ rsync -r doc/parquet-glib/html/ ../../site/asf-site/docs/c_glib/parquet-glib
 popd
 popd
 
-# Now Python documentation can be built
-pushd arrow/python
-python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \
-    --with-plasma --with-parquet --inplace
-python setup.py build_sphinx -s doc/source
-mkdir -p ../site/asf-site/docs/python
-rsync -r doc/_build/html/ ../site/asf-site/docs/python
-popd
-
 # Make C++ documentation
 pushd arrow/cpp/apidoc
 rm -rf html/*

dev/release/rat_exclude_files.txt

+1 -1
@@ -114,6 +114,7 @@ dev/tasks/linux-packages/debian/plasma-store-server.install
 dev/tasks/linux-packages/debian/rules
 dev/tasks/linux-packages/debian/source/format
 dev/tasks/linux-packages/debian/watch
+docs/requirements.txt
 go/arrow/go.sum
 go/arrow/Gopkg.lock
 go/arrow/internal/cpu/*
@@ -124,7 +125,6 @@ js/.npmignore
 js/closure-compiler-scripts/*
 python/cmake_modules
 python/cmake_modules/*
-python/doc/requirements.txt
 python/MANIFEST.in
 python/pyarrow/includes/__init__.pxd
 python/pyarrow/tests/__init__.py

dev/release/run-rat.sh

+6 -2
@@ -18,10 +18,14 @@
 # under the License.
 #
 
+RAT_VERSION=0.12
+
 # download apache rat
-curl -s https://repo1.maven.org/maven2/org/apache/rat/apache-rat/0.12/apache-rat-0.12.jar > apache-rat-0.12.jar
+if [ ! -f apache-rat-${RAT_VERSION}.jar ]; then
+  curl -s https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar > apache-rat-${RAT_VERSION}.jar
+fi
 
-RAT="java -jar apache-rat-0.12.jar -x "
+RAT="java -jar apache-rat-${RAT_VERSION}.jar -x "
 
 RELEASE_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
 

docker-compose.yml

+15 -1
@@ -152,7 +152,7 @@ services:
   ######################### Tools and Linters #################################
 
   # TODO(kszucs): site
-  # TODO(kszucs): apidoc
+  # TODO(kszucs): {cpp,java,glib,js}-apidoc
 
   lint:
     # Usage:
@@ -178,12 +178,26 @@ services:
 
   clang-format:
     # Usage:
+    #   docker-compose build cpp
+    #   docker-compose build python
     #   docker-compose build lint
     #   docker-compose run clang-format
     image: arrow:lint
     command: arrow/dev/lint/run_clang_format.sh
     volumes: *ubuntu-volumes
 
+  docs:
+    # Usage:
+    #   docker-compose build cpp
+    #   docker-compose build python
+    #   docker-compose build docs
+    #   docker-compose run docs
+    image: arrow:docs
+    build:
+      context: .
+      dockerfile: docs/Dockerfile
+    volumes: *volumes
+
   ######################### Integration Tests #################################
 
   # impala:

python/doc/.gitignore → docs/.gitignore

+1 -1
@@ -16,4 +16,4 @@
 # under the License.
 
 _build
-source/generated
+source/python/generated
File renamed without changes.

docs/Dockerfile

+26
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+FROM arrow:python-3.6
+
+ADD ci/conda_env_sphinx.yml /arrow/ci/
+RUN conda install -c conda-forge \
+        --file arrow/ci/conda_env_sphinx.yml && \
+    conda clean --all
+CMD arrow/ci/docker_build_cpp.sh && \
+    arrow/ci/docker_build_python.sh && \
+    arrow/ci/docker_build_sphinx.sh

python/doc/Makefile → docs/Makefile

File renamed without changes.
File renamed without changes.

python/doc/requirements.txt → docs/requirements.txt

+1
@@ -1,3 +1,4 @@
+breathe
 ipython
 matplotlib
 numpydoc
File renamed without changes.
File renamed without changes.
