forked from apache/arrow
Use Symbols for Row properties, Object.create() without a PropertyMap instead of constructor calls #2
Open: trxcllnt wants to merge 2 commits into TheNeuralBit:proxy-bench from trxcllnt:js/proxy-bench-update
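The title names two techniques: keeping a row's internal state under Symbol keys, and creating rows with `Object.create()` called without its second argument (the property-descriptor map) rather than through a constructor. The following is only a minimal sketch of that general pattern, not the code in this PR; every name in it (`kColumns`, `kRowIndex`, `makeRowPrototype`, `bindRow`) is hypothetical, and the `Map`-of-arrays table is an assumed stand-in for Arrow's column types.

```typescript
// Hypothetical sketch; none of these names come from the PR's diff.
// Symbol keys keep internal state from colliding with column names.
const kColumns = Symbol('columns');
const kRowIndex = Symbol('rowIndex');

// Assumed stand-in for a columnar table: column name -> values.
type Columns = Map<string, unknown[]>;

interface Row {
  [kColumns]: Columns;
  [kRowIndex]: number;
  [field: string]: unknown;
}

// Build one shared prototype with a getter per column name.
function makeRowPrototype(columns: Columns): object {
  const proto = {};
  columns.forEach((_values, name) => {
    Object.defineProperty(proto, name, {
      enumerable: true,
      get(this: Row): unknown {
        const column = this[kColumns].get(name);
        return column === undefined ? undefined : column[this[kRowIndex]];
      },
    });
  });
  return proto;
}

// Create a row with Object.create(proto) plus two plain assignments,
// instead of `new SomeRow(...)` or Object.create(proto, descriptorMap).
function bindRow(proto: object, columns: Columns, index: number): Row {
  const row = Object.create(proto) as Row;
  row[kColumns] = columns;
  row[kRowIndex] = index;
  return row;
}

const columns: Columns = new Map<string, unknown[]>([
  ['id', [1, 2, 3]],
  ['name', ['a', 'b', 'c']],
]);
const proto = makeRowPrototype(columns);
const second = bindRow(proto, columns, 1);
console.log(second['id'], second['name']);
console.log(Object.keys(second));
```

In this sketch the column getters live on the shared prototype, so each row carries only its two Symbol-keyed fields, and `Object.keys()` on a row does not report them because it lists string keys only. Whether this layout is faster than constructor calls is what a benchmark would have to show; the sketch makes no claim about that.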
trxcllnt referenced this pull request in vega/vega-loader-arrow (Feb 11, 2019)
TheNeuralBit pushed a commit that referenced this pull request (Mar 21, 2019)
I'm sure I'll need some guidance on this one from @sunchao or @liurenjie1024 but I am keen to get parquet support added for primitive types so that I can actually use DataFusion and Arrow in production at some point.

Author: Andy Grove <[email protected]>
Author: Neville Dipale <[email protected]>
Author: Andy Grove <[email protected]>

Closes apache#3851 from andygrove/ARROW-4466 and squashes the following commits:

3158529 <Andy Grove> add test for reading small batches
549c829 <Andy Grove> Remove hard-coded batch size, fix nits
8d2df06 <Andy Grove> move schema projection function from arrow into datafusion
204db83 <Andy Grove> fix timestamp nano issue
73aa934 <Andy Grove> Remove println from test
25d34ac <Andy Grove> Make INT32/64/96 handling consistent with C++ implementation
9b1308f <Andy Grove> clean up handling of INT96 and DATE/TIME/TIMESTAMP types in schema converter
1ec815b <Andy Grove> Clean up imports
023dc25 <Andy Grove> Merge pull request #2 from nevi-me/ARROW-4466
02b2ed3 <Neville Dipale> fix int96 conversion to read timestamps correctly
2aeea24 <Andy Grove> remove println from tests
9d3047a <Andy Grove> code cleanup
639e13e <Andy Grove> null handling for int96
1503855 <Andy Grove> handle nulls for binary data
80cf303 <Andy Grove> add date support
5a3368c <Andy Grove> Remove unnecessary slice, fix null handling
306d07a <Neville Dipale> fmt
3c711a5 <Neville Dipale> immediately allocate vec
e6cbbaa <Neville Dipale> replace read_column! macro with generic
607a29f <Andy Grove> return result if there are null values
e8aa784 <Andy Grove> revert temp debug change to error messages
6457c36 <Andy Grove> use parquet::reader::schema::parquet_to_arrow_schema
c56510e <Andy Grove> projection takes slice instead of vec
7e1a98f <Andy Grove> remove println and unwrap
dddb7d7 <Andy Grove> update to use partition-aware changes from master
157512e <Andy Grove> Remove invalid TODO comment
debb2fb <Andy Grove> code cleanup
6c3b7e2 <Andy Grove> add support for all primitive parquet types
b4981ed <Andy Grove> implement more parquet column types and tests
5ce3086 <Andy Grove> revert to columnar reads
c3f71d7 <Andy Grove> add integration test
aea9f8a <Andy Grove> convert to use row iter
f46e6f7 <Andy Grove> save
eaddafb <Andy Grove> save
322fc87 <Andy Grove> add test for reading strings from parquet
3a412b1 <Andy Grove> first parquet test passes
ff3e5b7 <Andy Grove> test
10710a2 <Andy Grove> Parquet datasource
TheNeuralBit pushed a commit that referenced this pull request (Jan 22, 2020)
This updates the language in `install_arrow()` to follow the README revision that will land in https://github.com/apache/arrow/pull/4948/files#diff-563b2cb2c8c2d51b2ff6b177e2d84286R33.

The [Jira ticket](https://issues.apache.org/jira/browse/ARROW-6142) requested three things; this is `#2` in the list. On `#1`, I defer to the C++ installation docs, which are already included in the install_arrow message, rather than duplicating content here. `#3` is out of scope.

Closes apache#5027 from nealrichardson/no-ppa and squashes the following commits:

80b142e <Neal Richardson> s/arrow/Arrow/
44c9659 <Neal Richardson> Tweak language again
36cfe28 <Neal Richardson> Further linux install revisions
79bd7e0 <Neal Richardson> One more PPurge
63f75bd <Neal Richardson> Revise install_arrow instructions for Linux

Authored-by: Neal Richardson <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
TheNeuralBit pushed a commit that referenced this pull request (May 12, 2025)
…n timezone (apache#45051)

### Rationale for this change

If the timezone database is present on the system but does not contain a timezone referenced in an ORC file, the ORC reader will crash with an uncaught C++ exception.

This can happen, for example, on Ubuntu 24.04, where some timezone aliases have been moved from the main `tzdata` package to a `tzdata-legacy` package. If `tzdata-legacy` is not installed, trying to read an ORC file that references e.g. the "US/Pacific" timezone would crash.

Here is a backtrace excerpt:

```
#12 0x00007f1a3ce23a55 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#13 0x00007f1a3ce39391 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#14 0x00007f1a3f4accc4 in orc::loadTZDB(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#15 0x00007f1a3f4ad392 in std::call_once<orc::LazyTimezone::getImpl() const::{lambda()#1}>(std::once_flag&, orc::LazyTimezone::getImpl() const::{lambda()#1}&&)::{lambda()#2}::_FUN() () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#16 0x00007f1a4298bec3 in __pthread_once_slow (once_control=0xa5ca7c8, init_routine=0x7f1a3ce69420 <__once_proxy>) at ./nptl/pthread_once.c:116
#17 0x00007f1a3f4a9ad0 in orc::LazyTimezone::getEpoch() const () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#18 0x00007f1a3f4e76b1 in orc::TimestampColumnReader::TimestampColumnReader(orc::Type const&, orc::StripeStreams&, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#19 0x00007f1a3f4e84ad in orc::buildReader(orc::Type const&, orc::StripeStreams&, bool, bool, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#20 0x00007f1a3f4e8dd7 in orc::StructColumnReader::StructColumnReader(orc::Type const&, orc::StripeStreams&, bool, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#21 0x00007f1a3f4e8532 in orc::buildReader(orc::Type const&, orc::StripeStreams&, bool, bool, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#22 0x00007f1a3f4925e9 in orc::RowReaderImpl::startNextStripe() () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#23 0x00007f1a3f492c9d in orc::RowReaderImpl::next(orc::ColumnVectorBatch&) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
#24 0x00007f1a3e6b251f in arrow::adapters::orc::ORCFileReader::Impl::ReadBatch(orc::RowReaderOptions const&, std::shared_ptr<arrow::Schema> const&, long) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900
```

### What changes are included in this PR?

Catch C++ exceptions when iterating ORC batches instead of letting them slip through.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#40633

Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Current version of proxy-bench:
This PR: