@kou
Member

@kou kou commented Jun 22, 2023

Rationale for this change

Our build requires a lot of disk space.

What changes are included in this PR?

Remove unused files.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@kou kou requested review from assignUser and raulcd as code owners June 22, 2023 01:02
@kou
Member Author

kou commented Jun 22, 2023

@github-actions crossbow submit preview-docs -g linux

@github-actions

⚠️ GitHub issue #36200 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting committer review label Jun 22, 2023
@kou kou force-pushed the ci-docs-space branch from 93273d5 to bcf5d0b Compare June 22, 2023 02:39
@kou
Member Author

kou commented Jun 22, 2023

@github-actions crossbow submit preview-docs

@kou
Member Author

kou commented Jun 22, 2023

@github-actions crossbow submit preview-docs

Member

Why twice?

Member Author

Oh, it's garbage.

Member

Is it possible to add some kind of macro for this step instead of repeating it in two different files?

Member Author

Yes. But the debug option approach is better.

@pitrou
Member

pitrou commented Jun 22, 2023

@kou The C++ build directory takes more than 8GB in this build, which is insane (partly due to building bundled gRPC and google-cloud-cpp with static libraries).

This can be trimmed down significantly by reducing the size of debug information (which isn't very useful on CI anyway). If I do:

export ARROW_C_FLAGS_DEBUG=-g1
export ARROW_CXX_FLAGS_DEBUG=-g1

then the size of the build directory goes down from 8GB to 5GB...

We should probably do so on all gcc-based builds.
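
To compare build-directory sizes before and after a flag change like this, a small script can total the file sizes under the build tree. The sketch below is an illustration rather than anything used in this PR; `directory_size_bytes` and `format_gib` are hypothetical helper names, and the result is the apparent file size, which can differ from the block usage that `du` reports.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def format_gib(num_bytes: int) -> str:
    """Format a byte count as GiB with two decimal places."""
    return f"{num_bytes / 1024 ** 3:.2f} GiB"
```

Running `format_gib(directory_size_bytes(build_dir))` once with the default debug flags and once with `-g1` gives a like-for-like comparison, assuming both runs start from a clean build directory.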

Member Author

@kou kou left a comment

Wow! I didn't notice that the big size was caused by the debug option. I'll use that approach.

@github-actions github-actions bot added awaiting changes and removed awaiting committer review labels Jun 22, 2023
@pitrou
Member

pitrou commented Jun 22, 2023

> Wow! I didn't notice that the big size was caused by the debug option. I'll use that approach.

Can we find a way to do that on all debug CI builds (except if MSVC is used, probably)?

@pitrou
Member

pitrou commented Jun 22, 2023

Also, it might make compilation caching more efficient (since the cached files may be smaller)...

@github-actions github-actions bot added Component: C++, awaiting change review and removed awaiting changes labels Jun 23, 2023
@kou
Member Author

kou commented Jun 23, 2023

> Can we find a way to do that on all debug CI builds (except if MSVC is used, probably)?

We can detect whether we are running on GitHub Actions by checking for GITHUB_ACTIONS=true, so we can use -g1 by default on GitHub Actions.

But... we can't use -g1 for GDB plugin tests...
https://github.com/apache/arrow/actions/runs/5353229013/jobs/9708905231?pr=36230#step:6:6529

_______________________________ test_arrays_heap _______________________________

gdb_arrow = <pyarrow.tests.test_gdb.GdbSession object at 0x7f460d9a1910>

    def test_arrays_heap(gdb_arrow):
        # Null
>       check_heap_repr(
            gdb_arrow, "heap_null_array",
            "arrow::NullArray of length 2, offset 0, null count 2")

opt/conda/envs/arrow/lib/python3.9/site-packages/pyarrow/tests/test_gdb.py:770: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

gdb = <pyarrow.tests.test_gdb.GdbSession object at 0x7f460d9a1910>
expr = 'heap_null_array'
expected = 'arrow::NullArray of length 2, offset 0, null count 2'

    def check_heap_repr(gdb, expr, expected):
        """
        Check printing a heap-located value, given its address.
        """
        s = gdb.print_value(f"*{expr}")
        # GDB may prefix the value with an address or type specification
        if s != expected:
>           assert s.endswith(f" {expected}")
E           AssertionError: assert False
E            +  where False = <built-in method endswith of str object at 0x55e0685ce330>(' arrow::NullArray of length 2, offset 0, null count 2')
E            +    where <built-in method endswith of str object at 0x55e0685ce330> = '(std::__shared_ptr_access<arrow::Array, (__gnu_cxx::_Lock_policy)2, false, false>::element_type &) @0x55d33eb43a70: {...ields>}, _M_ptr = 0x55d33eb54bc0, _M_refcount = {_M_pi = 0x55d33eb54bb0}}, <No data fields>}, null_bitmap_data_ = 0x0}'.endswith

opt/conda/envs/arrow/lib/python3.9/site-packages/pyarrow/tests/test_gdb.py:245: AssertionError
----------------------------- Captured stdout call -----------------------------
p *heap_null_array
$36 = (std::__shared_ptr_access<arrow::Array, (__gnu_cxx::_Lock_policy)2, false, false>::element_type &) @0x55d33eb43a70: {_vptr.Array = 0x7fa4501a6ff8 <vtable for arrow::NullArray+16>, data_ = {<std::__shared_ptr<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<arrow::ArrayData, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x55d33eb54bc0, _M_refcount = {_M_pi = 0x55d33eb54bb0}}, <No data fields>}, null_bitmap_data_ = 0x0}
(gdb) 
----------------------------- Captured stderr call -----------------------------
Python Exception <class 'gdb.error'>: There is no member named id_.

Hmm. We may need to use -g1 only for bundled dependencies...
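
The failing assertion can be reproduced in isolation. The sketch below mirrors the comparison in `check_heap_repr` (an exact match, or the expected text preceded by a GDB prefix) and applies it to two strings: a hypothetical pretty-printed value with a type and address prefix, and the tail of the raw struct dump from the captured stdout above. `matches_expected_repr` is a name introduced here for illustration. The raw dump is what GDB prints when the pretty-printer cannot run, and the `There is no member named id_` error suggests that is what happened, presumably because `-g1` leaves out member information the plugin relies on.

```python
def matches_expected_repr(printed: str, expected: str) -> bool:
    """Return True on an exact match, or when expected follows a prefix ending in a space."""
    return printed == expected or printed.endswith(f" {expected}")


expected = "arrow::NullArray of length 2, offset 0, null count 2"

# Hypothetical output when the pretty-printer runs: GDB adds a type/address prefix.
pretty = f"(arrow::Array &) @0x55d33eb43a70: {expected}"

# Tail of the raw dump from the captured stdout (the pretty-printer did not apply).
raw = "_M_refcount = {_M_pi = 0x55d33eb54bb0}}, <No data fields>}, null_bitmap_data_ = 0x0}"

print(matches_expected_repr(pretty, expected))  # True
print(matches_expected_repr(raw, expected))     # False
```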

@kou
Member Author

kou commented Jun 23, 2023

Or we just don't use -g1 for Python jobs. Our GDB plugin's tests are only run in Python jobs.
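
Putting the points in this thread together, the decision could look like the sketch below: apply `-g1` only on GitHub Actions, and skip it for MSVC builds and for jobs that run the GDB plugin tests. This is an illustration of the logic discussed here, not the change that was merged. `debug_flag_overrides`, `uses_msvc`, and `runs_gdb_tests` are hypothetical names, and leaving already-set values untouched is an added assumption; `GITHUB_ACTIONS`, `ARROW_C_FLAGS_DEBUG`, and `ARROW_CXX_FLAGS_DEBUG` come from the comments above.

```python
from typing import Dict, Mapping


def debug_flag_overrides(
    environ: Mapping[str, str], *, uses_msvc: bool, runs_gdb_tests: bool
) -> Dict[str, str]:
    """Return the environment variables to add for a reduced-debug-info CI build."""
    # Only change defaults when running on GitHub Actions.
    if environ.get("GITHUB_ACTIONS") != "true":
        return {}
    # -g1 is a GCC/Clang-style flag, and the GDB plugin tests need full debug info.
    if uses_msvc or runs_gdb_tests:
        return {}
    overrides = {}
    for name in ("ARROW_C_FLAGS_DEBUG", "ARROW_CXX_FLAGS_DEBUG"):
        # Assumption: an explicitly configured value should win over the default.
        if name not in environ:
            overrides[name] = "-g1"
    return overrides
```

A caller would merge the result into the build environment, for example `env = {**os.environ, **debug_flag_overrides(os.environ, uses_msvc=False, runs_gdb_tests=False)}` after importing `os`.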

@kou
Member Author

kou commented Jun 25, 2023

@github-actions crossbow submit -g linux preview-docs

@kou kou force-pushed the ci-docs-space branch from 08b9ffe to 3c78e39 Compare June 25, 2023 05:42
@github-actions

Revision: 3c78e3929a785010354c196d02c8ce7987e66b4a

Submitted crossbow builds: ursacomputing/crossbow @ actions-1328ed330f

Task Status
almalinux-8-amd64 Github Actions
almalinux-8-arm64 Github Actions
almalinux-9-amd64 Github Actions
almalinux-9-arm64 Github Actions
amazon-linux-2-amd64 Github Actions
amazon-linux-2-arm64 Github Actions
amazon-linux-2023-amd64 Github Actions
amazon-linux-2023-arm64 Github Actions
centos-7-amd64 Github Actions
centos-8-stream-amd64 Github Actions
centos-8-stream-arm64 Github Actions
centos-9-stream-amd64 Github Actions
centos-9-stream-arm64 Github Actions
debian-bookworm-amd64 Github Actions
debian-bookworm-arm64 Github Actions
debian-bullseye-amd64 Github Actions
debian-bullseye-arm64 Github Actions
preview-docs Github Actions
ubuntu-focal-amd64 Github Actions
ubuntu-focal-arm64 Github Actions
ubuntu-jammy-amd64 Github Actions
ubuntu-jammy-arm64 Github Actions
ubuntu-lunar-amd64 Github Actions
ubuntu-lunar-arm64 Github Actions

@kou kou force-pushed the ci-docs-space branch from 3c78e39 to 878d615 Compare June 26, 2023 06:06
@kou
Member Author

kou commented Jun 26, 2023

The preview-docs job failure isn't "No space left on device".
It's an R documentation generation failure.
@thisisnic Could you check this failure?

https://github.com/ursacomputing/crossbow/actions/runs/5368074498/jobs/9738676612#step:6:10165

-- Building function reference -------------------------------------------------
Error: 
! in callr subprocess.
Caused by error in `map2(.x, vec_index(.x), .f, ...)`:
! In index: 4.
---
Standard output:
== Building pkgdown site =======================================================
Reading from: '/arrow/r'
Writing to:   '/arrow/r/docs'
-- Initialising site -----------------------------------------------------------
Copying '../../usr/local/lib/R/site-library/pkgdown/BS5/assets/link.svg' to 'link.svg'
Copying '../../usr/local/lib/R/site-library/pkgdown/BS5/assets/pkgdown.js' to 'pkgdown.js'
Copying 'pkgdown/extra.js' to 'extra.js'
Copying 'pkgdown/assets/versions.html' to 'versions.html'
Copying 'pkgdown/assets/versions.json' to 'versions.json'
Copying 'pkgdown/favicon/apple-touch-icon-120x120.png' to 'apple-touch-icon-120x120.png'
Copying 'pkgdown/favicon/apple-touch-icon-152x152.png' to 'apple-touch-icon-152x152.png'
Copying 'pkgdown/favicon/apple-touch-icon-180x180.png' to 'apple-touch-icon-180x180.png'
Copying 'pkgdown/favicon/apple-touch-icon-60x60.png' to 'apple-touch-icon-60x60.png'
Copying 'pkgdown/favicon/apple-touch-icon-76x76.png' to 'apple-touch-icon-76x76.png'
Copying 'pkgdown/favicon/apple-touch-icon.png' to 'apple-touch-icon.png'
Copying 'pkgdown/favicon/favicon-16x16.png' to 'favicon-16x16.png'
Copying 'pkgdown/favicon/favicon-32x32.png' to 'favicon-32x32.png'
Copying 'pkgdown/favicon/favicon.ico' to 'favicon.ico'
-- Building home ---------------------------------------------------------------
Writing 'authors.html'
Reading 'PACKAGING.md'
Writing 'PACKAGING.html'
Reading 'STYLE.md'
Writing 'STYLE.html'
Writing '404.html'
-- Building function reference -------------------------------------------------
---
Backtrace:
1. pkgdown::build_site(install = FALSE)
2. pkgdown:::build_site_external(pkg = pkg, examples = examples, run_dont_run = run_d...
3. callr::r(function(..., cli_colors, pkgdown_internet) { ...
4. callr:::get_result(output = out, options)
5. callr:::throw(callr_remote_error(remerr, output), parent = fix_msg(remerr[[3]]))
---
Subprocess backtrace:
 1. pkgdown::build_site(...)
 2. pkgdown:::build_site_local(pkg = pkg, examples = examples, run_dont_run = run_dont...
 3. pkgdown::build_reference(pkg, lazy = lazy, examples = examples, run_dont_run = ru...
 4. pkgdown::build_reference_index(pkg)
 5. pkgdown::render_page(pkg, "reference-index", data = data_reference_index(pkg), ...
 6. pkgdown:::render_page_html(pkg, name = name, data = data, depth = depth)
 7. utils::modifyList(data_template(pkg, depth = depth), data)
 8. base::stopifnot(is.list(x), is.list(val))
 9. pkgdown:::data_reference_index(pkg)
10. meta %>% purrr::imap(data_reference_index_rows, pkg = pkg) %>% ...
11. base::unlist(., recursive = FALSE)
12. purrr::compact(.)
13. purrr::discard(.x, function(x) is_empty(.f(x)))
14. purrr:::where_if(.x, .p, ...)
15. purrr:::map_(.x, .p, ..., .type = "logical", .purrr_error_call = .purrr_error_call)
16. purrr:::vctrs_vec_compat(.x, .purrr_user_env)
17. purrr::imap(., data_reference_index_rows, pkg = pkg)
18. purrr::map2(.x, vec_index(.x), .f, ...)
19. purrr:::map2_("list", .x, .y, .f, ..., .progress = .progress)
20. purrr:::with_indexed_errors(i = i, names = names, error_call = .purrr_error_call...
21. base::withCallingHandlers(expr, error = function(cnd) { ...
22. purrr:::call_with_cleanup(map2_impl, environment(), .type, .progress, ...
23. local .f(.x[[i]], .y[[i]], ...)
24. pkgdown:::section_topics(section$contents, pkg$topics, pkg$src_path)
25. topics[select_topics(match_strings, topics), , ]
26. `[.tbl_df`(topics, select_topics(match_strings, topics), , )
27. pkgdown:::select_topics(match_strings, topics)
28. purrr::map(match_strings, match_eval, env = match_env(topics))
29. purrr:::map_("list", .x, .f, ..., .progress = .progress)
30. purrr:::with_indexed_errors(i = i, names = names, error_call = .purrr_error_call...
31. base::withCallingHandlers(expr, error = function(cnd) { ...
32. purrr:::call_with_cleanup(map_impl, environment(), .type, .progress, ...
33. local .f(.x[[i]], ...)
34. base::tryCatch(eval(expr, env), error = function(e) { ...
35. base::tryCatchList(expr, classes, parentenv, handlers)
36. base::tryCatchOne(expr, names, parentenv, handlers[[1L]])
37. value[[3L]](cond)
38. pkgdown:::topic_must("be a known selector function", string, parent = e)
39. rlang::abort(c(paste0("In '_pkgdown.yml', topic must ", message), x = paste0("N...
40. | rlang:::signal_abort(cnd, .file)
41. | base::signalCondition(cnd)
42. (function (cnd) ...
43. cli::cli_abort(message, location = i, name = name, parent = cnd, ...
44. | rlang::abort(message, ..., call = call, use_cli_format = TRUE, ...
45. | rlang:::signal_abort(cnd, .file)
46. | base::signalCondition(cnd)
47. (function (cnd) ...
48. cli::cli_abort(message, location = i, name = name, parent = cnd, ...
49. | rlang::abort(message, ..., call = call, use_cli_format = TRUE, ...
50. | rlang:::signal_abort(cnd, .file)
51. | base::signalCondition(cnd)
52. global (function (e) ...
Execution halted
1

@thisisnic
Member

@kou I'm having issues adding a fix commit to your branch; here's a PR: kou#13

@kou
Member Author

kou commented Jun 26, 2023

Thanks!
(You can push to this branch directly. :-)

@kou
Member Author

kou commented Jun 26, 2023

@github-actions crossbow submit preview-docs

@github-actions

Revision: 60bd46610a1732ce838ab91b0a413dcd0a1b31d8

Submitted crossbow builds: ursacomputing/crossbow @ actions-8c79fd8f2e

Task Status
preview-docs Github Actions

@kou
Member Author

kou commented Jun 26, 2023

@thisisnic Sorry. Could you also check this?

https://github.com/apache/arrow/actions/runs/5374927859/jobs/9750784797?pr=36230#step:4:9

Error! Scalar-class
schema-class missing from ./r/_pkgdown.yml

(You can push a fix to this branch directly.)

@thisisnic
Member

thisisnic commented Jun 26, 2023

@kou The failing step is due to a technicality in how we check for missing sections in the docs. Since we implemented the check in 2021, the pkgdown package has started doing this check itself, and its method is better than the one I implemented for us in CI. I've opened #36300 to remove our check, so once that has passed CI and been merged, you'll need to rebase on it. [Edit: merged now]

@kou kou force-pushed the ci-docs-space branch from 60bd466 to e0af2b9 Compare June 26, 2023 21:33
@kou
Member Author

kou commented Jun 26, 2023

Thanks! Rebased.

@kou kou force-pushed the ci-docs-space branch from e0af2b9 to ec45553 Compare June 27, 2023 01:43
@kou kou mentioned this pull request Jun 28, 2023
@kou
Member Author

kou commented Jun 28, 2023

The "R / AMD64 Ubuntu 20.04 R 4.2 Force-Tests true" failure is caused by #36346. So I want to merge this.

If nobody objects, I'll merge this tomorrow.

@kou kou merged commit 63b8091 into apache:main Jun 29, 2023
@kou kou deleted the ci-docs-space branch June 29, 2023 03:53
@kou kou removed the awaiting change review label Jun 29, 2023
@kou
Member Author

kou commented Jun 29, 2023

Merged.

@conbench-apache-arrow

Conbench analyzed the 5 benchmark runs on commit 63b8091d.

There were 7 benchmark results indicating a performance regression:

The full Conbench report has more details.

Development

Successfully merging this pull request may close these issues.

[CI][Docs] Complete Documentation builds fail with No space left on device

3 participants