Skip to content

Conversation

@lwhite1
Copy link
Contributor

@lwhite1 lwhite1 commented Oct 12, 2022

Copy methods for the table and for individual vectors are provided with tests

@lwhite1
Copy link
Contributor Author

lwhite1 commented Oct 12, 2022

@davisusanibar If you wouldn't mind providing a review it would be appreciated.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@lwhite1
Copy link
Contributor Author

lwhite1 commented Oct 13, 2022

@lidavidm Are you familiar with the error shown in the Windows Flight tests:

Warning:  Tests run: 4, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 0 s - in org.apache.arrow.flight.TestBackPressure
[2069](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2070)
[INFO] Running org.apache.arrow.flight.TestBasicOperation
[2070](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2071)
[INFO] Running org.apache.arrow.algorithm.sort.TestVariableWidthSorting
[2071](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2072)
[INFO] Tests run: 49, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.97 s - in org.apache.arrow.algorithm.sort.TestVariableWidthSorting
[2072](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2073)
63 6F 6F 6C 20 74 68 69 6E 67 
[2073](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2074)
get
[2074](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2075)
put
[2075](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2076)
hello
[2076](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2077)
Schema<>
[2077](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2078)
Error:  Tests run: 19, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.808 s <<< FAILURE! - in org.apache.arrow.flight.TestBasicOperation
[2078](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2079)
Error:  org.apache.arrow.flight.TestBasicOperation.getStreamLargeBatch  Time elapsed: 2.814 s  <<< ERROR!
[2079](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2080)
java.lang.IllegalStateException: 
[2080](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2081)
Memory was leaked by query. Memory leaked: (134217728)
[2081](https://github.com/apache/arrow/actions/runs/3236942456/jobs/5303398561#step:5:2082)

@lidavidm
Copy link
Member

This has been known for a while, but I've never been able to reproduce it.

@lidavidm
Copy link
Member

The resolution would probably be to figure out how to enable allocator logging in CI so that we can properly debug this

@lwhite1
Copy link
Contributor Author

lwhite1 commented Oct 13, 2022

The resolution would probably be to figure out how to enable allocator logging in CI so that we can properly debug this

OK. thanks

@lidavidm
Copy link
Member

See ARROW-18034 / ARROW-18035

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one typo spotted in a doc comment

*
* @param columnName The name of the vector to copy
* @return A copy of the Vector with the given name
* @throws IllegalArgumentException if the name is not the name of a vector in the table.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the exception class is different from what's actually thrown

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix the exception class issue. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +193 to +197
assertEquals(2, copy.getValueCount());
assertEquals(0, copy.getNullCount());
for (int i = 0; i < t.getRowCount(); i++) {
assertEquals(original.getObject(i), copy.getObject(i));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not necessarily an issue for this PR, but there is a VectorValueComparator and perhaps it would be good to integrate it as FieldVector#equals if it isn't already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. Thank you

@lwhite1
Copy link
Contributor Author

lwhite1 commented Oct 24, 2022

I modified the documentation to reflect the correct Exception (It should be IllegalArgumentException.) This was mentioned incorrectly in many places in the Row class so the documentation was corrected there as well. Only documentation was touched in the latest commit

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@lidavidm
Copy link
Member

Seems tests need to be realigned too?

Error:    BaseTableTest.getVectorCopyByIndex:199 Unexpected exception type thrown, expected: <java.lang.IllegalStateException> but was: <java.lang.IllegalArgumentException>
Error:    BaseTableTest.getVectorCopyByName:219 Unexpected exception type thrown, expected: <java.lang.IllegalStateException> but was: <java.lang.IllegalArgumentException>
Error:    BaseTableTest.testGetVector:178 Unexpected exception type thrown, expected: <java.lang.IllegalStateException> but was: <java.lang.IllegalArgumentException>
Error:    RowTest.testNameNotFound:172 Unexpected exception type thrown, expected: <java.lang.IllegalStateException> but was: <java.lang.IllegalArgumentException>

* @return A copy of the Vector with the given name
* @throws IllegalArgumentException if the name is not the name of a vector in the table.
*/
public FieldVector getVectorCopy(String columnName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a doubt, Is there some base/abstract object that Table and VectorShemaRoot share?

For example, this method could be helpful in both sides Tables and VectorSchemaRoot

If by design this was created independently please let me know to consider that at the moment to read *.table package

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a doubt, Is there some base/abstract object that Table and VectorShemaRoot share?

For example, this method could be helpful in both sides Tables and VectorSchemaRoot

There's no abstract class as VSR doesn't inherit from anything. I have considered adding an interface at some point that they both could implement to make it easier to swap one for the other.

If by design this was created independently please let me know to consider that at the moment to read *.table package

I'm not sure I understand this part of the comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no abstract class as VSR doesn't inherit from anything. I have considered adding an interface at some point that they both could implement to make it easier to swap one for the other.

Ok, thank you.

I'm not sure I understand this part of the comment.

This is related with the 1st question that initially VSR and Table are growing independently and at some point they both could implement the same contract.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks.
Given the possibility of someday having a shared interface, I have tried to write similar methods using the same method signature as much as possible.

@lidavidm lidavidm merged commit 0e94da9 into apache:master Oct 25, 2022
@ursabot
Copy link

ursabot commented Oct 28, 2022

Benchmark runs are scheduled for baseline = ea32825 and contender = 0e94da9. 0e94da9 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Failed] ec2-t3-xlarge-us-east-2
[Failed] test-mac-arm
[Finished ⬇️1.36% ⬆️0.0%] ursa-i9-9960x
[Failed] ursa-thinkcentre-m75q
Buildkite builds:
[Failed] 0e94da90 ec2-t3-xlarge-us-east-2
[Failed] 0e94da90 test-mac-arm
[Finished] 0e94da90 ursa-i9-9960x
[Failed] 0e94da90 ursa-thinkcentre-m75q
[Failed] ea32825e ec2-t3-xlarge-us-east-2
[Failed] ea32825e test-mac-arm
[Finished] ea32825e ursa-i9-9960x
[Failed] ea32825e ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Yicong-Huang added a commit to apache/texera that referenced this pull request May 5, 2023
This PR bumps Apache Arrow version from 10.0.0 to 11.0.0.

Main changes related to PyAmber:

## Java/Scala side:

- Distribute Apple M1 compatible JNI libraries via mavencentral
([#14472](apache/arrow#14472)).
- Improve performance by short-circuiting null checks when comparing non
null field types ([#15106](apache/arrow#15106)).
- Extend Table copy functionality, and support returning copies of
individual vectors
([#14389](apache/arrow#14389)).
- Several enhancements to dictionary encoding
([#14891](apache/arrow#14891),
([#14902](apache/arrow#14902),
([#14874](apache/arrow#14874)).
- Extend Table to support additional vector types
([#14573](apache/arrow#14573)).
- Enhance and simplify handling of allocation management by integrating
C Data into allocator hierarchy
([#14506](apache/arrow#14506)).

## Python side:
- PyArrow now requires pandas >= 1.0
([ARROW-18173](https://issues.apache.org/jira/browse/ARROW-18173)).
- Added support for the [DataFrame Interchange
Protocol](https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html)
for pyarrow.Table
([GH-33346](apache/arrow#33346)).
- Support for custom metadata of record batches in the IPC read and
write APIs
([ARROW-16430](https://issues.apache.org/jira/browse/ARROW-16430)).
- The Time32Scalar, Time64Scalar, Date32Scalar and Date64Scalar classes
got a .value attribute to access the underlying integer value, similar
to the other date-time related scalars
([ARROW-18264](https://issues.apache.org/jira/browse/ARROW-18264)).
- Casting to string is now supported for duration
([ARROW-15822](https://issues.apache.org/jira/browse/ARROW-15822)) and
decimal
([ARROW-17458](https://issues.apache.org/jira/browse/ARROW-17458))
types, which also means those can now be written to CSV.

## Issues fixed:
- Now Do_action (from Python server back to Java Client) is returning a
stream of results properly, and it alerts when the results are not fully
consumed by the client. Such results will be used to send the flow
control credits back from the Python side. We limit the results to be
exact 1 for now, although it can be a stream.
- Fix a bug in the Python proxy server, when unregistered action is
invoked, it should not parse and return the results.
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…pache#14389)

Copy methods for the table and for individual vectors are provided with tests

Authored-by: Larry White <[email protected]>
Signed-off-by: David Li <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants