@assignUser
Member

No description provided.

@github-actions

@assignUser
Member Author

assignUser commented Oct 21, 2022

@github-actions crossbow submit java-jars

@github-actions

Failed to render template `java-jars/github.yml` with UndefinedError: 'matrix' is undefined
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/3296840704

@rok
Member

rok commented Oct 21, 2022

@github-actions crossbow submit java-jars

@github-actions

Revision: 304216192de7a2ea30daa6300bbb0cccc06f0384

Submitted crossbow builds: ursacomputing/crossbow @ actions-f4432d2334

Task Status
java-jars Github Actions

@github-actions

Revision: 304216192de7a2ea30daa6300bbb0cccc06f0384

Submitted crossbow builds: ursacomputing/crossbow @ actions-446751f161

Task Status
java-jars Github Actions

@assignUser
Member Author

@github-actions crossbow submit java-jars

@github-actions

Revision: d7b6766bf466a106c9a064fd7d06a453dde4ae32

Submitted crossbow builds: ursacomputing/crossbow @ actions-201db7e35e

Task Status
java-jars Github Actions

@rok
Member

rok commented Oct 21, 2022

@github-actions crossbow submit java-jars

@github-actions

Revision: 8f82acefaf266220be68c44e5e25d73d2b930f9b

Submitted crossbow builds: ursacomputing/crossbow @ actions-21fd4a18f1

Task Status
java-jars Github Actions

@rok
Member

rok commented Oct 21, 2022

@github-actions crossbow submit java-jars

@github-actions

Revision: d4e4169f6317ed472b1cde8e72cf1826fd4a1bdd

Submitted crossbow builds: ursacomputing/crossbow @ actions-3ca7d2e887

Task Status
java-jars Github Actions

@assignUser
Member Author

@kou I was able to improve the building of the libs, but both @rok and I are unclear on how best to get Maven to bundle these efficiently in (ideally) one build step for all architectures. Could you have a look?

Member

@kou kou left a comment

I think that we also need to prepend ${os.detected.arch} to the shared library path when we load it.
I'm not familiar with Java but the following may work:

diff --git a/java/c/src/main/java/org/apache/arrow/c/jni/JniLoader.java b/java/c/src/main/java/org/apache/arrow/c/jni/JniLoader.java
index ca061937cd..42d37a3bff 100644
--- a/java/c/src/main/java/org/apache/arrow/c/jni/JniLoader.java
+++ b/java/c/src/main/java/org/apache/arrow/c/jni/JniLoader.java
@@ -33,7 +33,7 @@ import java.util.Set;
  * The JniLoader for C Data Interface API's native implementation.
  */
 public class JniLoader {
-  private static final JniLoader INSTANCE = new JniLoader(Collections.singletonList("arrow_cdata_jni"));
+  private static final JniLoader INSTANCE = new JniLoader(Collections.singletonList(System.getProperty("os.arch").toLowerCase(Locale.US) + ".arrow_cdata_jni"));
 
   public static JniLoader get() {
     return INSTANCE;

The above path is used for getResourceAsStream():
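
As a rough sketch of the idea (not the actual JniLoader code): a loader could map the JVM's `os.arch` value to the directory name used inside the jar and build the resource path from that. The class and method names below are hypothetical. The mapping is an assumption based on the `aarch_64` and `x86_64` directory names that appear in the jar listing later in this thread; note that `os.arch` typically reports `aarch64` or `amd64`, so using it unmodified may not match those directories.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;

public class ArchResourcePath {

  // Hypothetical helper: map common os.arch values to the directory
  // names assumed to be used inside the jar (aarch_64, x86_64).
  public static String normalizeArch(String osArch) {
    String arch = osArch.toLowerCase(Locale.US);
    switch (arch) {
      case "amd64":
      case "x86_64":
        return "x86_64";
      case "aarch64":
      case "arm64":
        return "aarch_64";
      default:
        // Unknown architecture: fall back to the lower-cased raw value.
        return arch;
    }
  }

  // Build "<arch>/<platform file name>". System.mapLibraryName adds the
  // platform-specific prefix and suffix (for example lib...so on Linux).
  public static String resourcePath(String osArch, String libraryName) {
    return normalizeArch(osArch) + "/" + System.mapLibraryName(libraryName);
  }

  public static void main(String[] args) throws IOException {
    String path = resourcePath(System.getProperty("os.arch"), "arrow_cdata_jni");
    System.out.println("Looking up resource: " + path);
    // getResourceAsStream returns null when the entry is not on the classpath.
    try (InputStream in =
        ArchResourcePath.class.getClassLoader().getResourceAsStream(path)) {
      System.out.println(in == null ? "not found on classpath" : "found");
    }
  }
}
```

Keeping the mapping in one small function makes it easy to compare against whatever directory names the build actually writes into the jar.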

Comment on lines 81 to 85
Member

How about adding java/Brewfile that includes brew "openjdk@11" and brew "sccache"?

Member

@assignUser does that reduce CI's ability to cache? If not, I think this could be a good idea, as Java developers could also use it as a reference for the dependencies they need.

Member Author

No, that should not affect caching 👍

Member

Done.

@rok
Member

rok commented Oct 24, 2022

@github-actions crossbow submit java-jars

@github-actions

Revision: 1525d0550797df99184e91b65e77cb470257e8b7

Submitted crossbow builds: ursacomputing/crossbow @ actions-d68fe64cda

Task Status
java-jars Github Actions

@rok
Member

rok commented Nov 5, 2022

@github-actions crossbow submit java-jars

@github-actions

github-actions bot commented Nov 5, 2022

Revision: efcec48

Submitted crossbow builds: ursacomputing/crossbow @ actions-0c10a4c2b8

Task Status
java-jars Github Actions

@jonathanswenson
Contributor

With the jars from the previous build (https://github.com/ursacomputing/crossbow/releases/tag/actions-0c10a4c2b8-github-java-jars), I can successfully run the reproduction that I mentioned on my original Jira ticket (the comment was about arrow-c-data, but the original ticket was for Gandiva). I'll try some real-world examples later (as well as the Gandiva repro): https://issues.apache.org/jira/browse/ARROW-16608?focusedCommentId=17559330&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17559330

@rok
Member

rok commented Nov 6, 2022

@jonathanswenson the new jars now build, which resolves an issue I had been having. I tried your example:

import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.apache.arrow.vector.types.pojo.ArrowType;
import java.util.ArrayList;
import java.util.List;
import org.apache.arrow.c.ArrowSchema;
import org.apache.arrow.c.Data;
import org.apache.arrow.vector.types.pojo.Schema;

public class TestDataset {

  public static void main(String[] args) {
    RootAllocator allocator = new RootAllocator();
    Field field = new Field("int_field", FieldType.nullable(new ArrowType.Int(32, true)), null);

    List<Field> fieldList = new ArrayList<>();
    fieldList.add(field);

    Schema schema = new Schema(fieldList, null);
    ArrowSchema cSchema = ArrowSchema.allocateNew(allocator);
    Data.exportSchema(allocator, schema, null, cSchema);
  }
}
10:53:07.283 [main] DEBUG org.apache.arrow.memory.rounding.DefaultRoundingPolicy - -Dorg.apache.memory.allocator.maxOrder: 11
Exception in thread "main" java.lang.VerifyError: Bad type on operand stack
Exception Details:
  Location:
    org/apache/arrow/vector/types/pojo/ArrowType.getInt(Lorg/apache/arrow/flatbuf/Field;)Lorg/apache/arrow/vector/types/pojo/ArrowType$Int; @8: invokevirtual
  Reason:
    Type 'org/apache/arrow/flatbuf/Int' (current frame, stack[1]) is not assignable to 'com/google/flatbuffers/Table'
  Current Frame:
    bci: @8
    flags: { }
    locals: { 'org/apache/arrow/flatbuf/Field' }
    stack: { 'org/apache/arrow/flatbuf/Field', 'org/apache/arrow/flatbuf/Int' }
  Bytecode:
    0000000: 2abb 0026 59b7 0027 b600 05c0 0026 4cbb
    0000010: 002a 592b b600 282b b600 29b7 002b b0  

	at TestDataset.main(TestDataset.java:15)

Which I think is an unrelated issue?
I downloaded and imported:

arrow-algorithm-11.0.0-SNAPSHOT.jar
arrow-c-data-11.0.0-SNAPSHOT.jar
arrow-dataset-11.0.0-SNAPSHOT.jar
arrow-memory-core-11.0.0-SNAPSHOT.jar
arrow-memory-netty-11.0.0-SNAPSHOT.jar
arrow-tools-11.0.0-SNAPSHOT-jar-with-dependencies.jar
arrow-tools-11.0.0-SNAPSHOT.jar
arrow-vector-11.0.0-SNAPSHOT-shade-format-flatbuffers.jar
arrow-vector-11.0.0-SNAPSHOT.jar

But I don't think you need all of these.

@rok rok requested a review from davisusanibar November 6, 2022 12:23
@rok
Member

rok commented Nov 6, 2022

@jonathanswenson can you provide a repro of any persisting issues?

@davisusanibar
Contributor

Please confirm that local builds also work with -Pgenerate-libs-cdata-all-os, -Pgenerate-libs-jni-macos-linux, and -Pgenerate-libs-jni-windows, and also update the Building Java Modules documentation.

@rok
Member

rok commented Nov 7, 2022

Please confirm that local builds also work with -Pgenerate-libs-cdata-all-os, -Pgenerate-libs-jni-macos-linux, and -Pgenerate-libs-jni-windows, and also update the Building Java Modules documentation.

Good point! I've added a commit to enable it and can verify it on aarch_64 (M1). Would appreciate tests on other architectures.

@rok
Member

rok commented Nov 7, 2022

@github-actions crossbow submit java-jars

@github-actions

github-actions bot commented Nov 7, 2022

Revision: c2ba634

Submitted crossbow builds: ursacomputing/crossbow @ actions-3965a3e907

Task Status
java-jars Github Actions

@jonathanswenson
Contributor

@rok no issues that I can tell -- the built jars have worked for everything I've tried on my M1 mac.

@kou
Member

kou commented Nov 7, 2022

Can we merge this?

Contributor

@davisusanibar davisusanibar left a comment

LGTM, thanks

@davisusanibar
Contributor

LGTM, thanks

Artifacts: https://github.com/ursacomputing/crossbow/releases/tag/actions-0c10a4c2b8-github-java-jars

$ jar -tf ~/arrow-c-data-11.0.0-SNAPSHOT.jar 
....
aarch_64/libarrow_cdata_jni.dylib
x86_64/libarrow_cdata_jni.dylib
x86_64/libarrow_cdata_jni.so
x86_64/arrow_cdata_jni.dll
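
To check a downloaded jar for this per-architecture layout programmatically, something like the following sketch could be used. The class and method names are hypothetical, and the set of file extensions treated as native libraries (`.so`, `.dylib`, `.dll`) is an assumption drawn from the listing above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class NativeLibListing {

  // Assumption: native libraries are identified by these extensions only.
  public static boolean isNativeLib(String name) {
    return name.endsWith(".so") || name.endsWith(".dylib") || name.endsWith(".dll");
  }

  // Group entries such as "x86_64/libfoo.so" by their top-level directory.
  public static Map<String, List<String>> groupByArch(Collection<String> entryNames) {
    Map<String, List<String>> byArch = new TreeMap<>();
    for (String name : entryNames) {
      int slash = name.indexOf('/');
      if (slash > 0 && isNativeLib(name)) {
        byArch.computeIfAbsent(name.substring(0, slash), k -> new ArrayList<>())
            .add(name.substring(slash + 1));
      }
    }
    return byArch;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: java NativeLibListing <path-to-jar>");
      return;
    }
    try (JarFile jar = new JarFile(args[0])) {
      List<String> names =
          jar.stream().map(entry -> entry.getName()).collect(Collectors.toList());
      groupByArch(names)
          .forEach((arch, libs) -> System.out.println(arch + " -> " + libs));
    }
  }
}
```

Run against the arrow-c-data jar, this would print one line per architecture directory together with the native libraries found under it.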

@rok
Member

rok commented Nov 7, 2022

I think we can merge!

On M1 I was able to run:

cd arrow/java
mvn clean install
mvn generate-resources -Pgenerate-libs-cdata-all-os -N
mvn -Darrow.c.jni.dist.dir=/Users/rok/Documents/repos/arrow/java-dist/lib/ -Parrow-c-data clean install -rf :arrow-c-data

I'm still struggling with mvn generate-resources -Pgenerate-libs-jni-macos-linux -N but I think that's related to my cmake situation.

Feel free to merge @kou!

@kou
Member

kou commented Nov 8, 2022

I'm still struggling with mvn generate-resources -Pgenerate-libs-jni-macos-linux -N but I think that's related to my cmake situation.

Could you open a Jira issue for it with full log? If it's a problem of your environment, we can just close the issue as "Not A Problem".

Member

@kou kou left a comment

+1

@kou kou merged commit 8776295 into apache:master Nov 8, 2022
@ursabot

ursabot commented Nov 8, 2022

Benchmark runs are scheduled for baseline = 98943d9 and contender = 8776295. 8776295 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.47% ⬆️0.03%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.42% ⬆️0.25%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 8776295f ec2-t3-xlarge-us-east-2
[Finished] 8776295f test-mac-arm
[Finished] 8776295f ursa-i9-9960x
[Finished] 8776295f ursa-thinkcentre-m75q
[Finished] 98943d90 ec2-t3-xlarge-us-east-2
[Finished] 98943d90 test-mac-arm
[Finished] 98943d90 ursa-i9-9960x
[Finished] 98943d90 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@rok
Member

rok commented Nov 8, 2022

I'm still having issues with mvn generate-resources -Pgenerate-libs-jni-macos-linux -N, so I opened a follow-up Jira.

Yicong-Huang added a commit to apache/texera that referenced this pull request May 5, 2023
This PR bumps Apache Arrow version from 10.0.0 to 11.0.0.

Main changes related to PyAmber:

## Java/Scala side:

- Distribute Apple M1 compatible JNI libraries via mavencentral
([#14472](apache/arrow#14472)).
- Improve performance by short-circuiting null checks when comparing
non-null field types ([#15106](apache/arrow#15106)).
- Extend Table copy functionality, and support returning copies of
individual vectors
([#14389](apache/arrow#14389)).
- Several enhancements to dictionary encoding
([#14891](apache/arrow#14891),
[#14902](apache/arrow#14902),
[#14874](apache/arrow#14874)).
- Extend Table to support additional vector types
([#14573](apache/arrow#14573)).
- Enhance and simplify handling of allocation management by integrating
C Data into allocator hierarchy
([#14506](apache/arrow#14506)).

## Python side:
- PyArrow now requires pandas >= 1.0
([ARROW-18173](https://issues.apache.org/jira/browse/ARROW-18173)).
- Added support for the [DataFrame Interchange
Protocol](https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html)
for pyarrow.Table
([GH-33346](apache/arrow#33346)).
- Support for custom metadata of record batches in the IPC read and
write APIs
([ARROW-16430](https://issues.apache.org/jira/browse/ARROW-16430)).
- The Time32Scalar, Time64Scalar, Date32Scalar and Date64Scalar classes
got a .value attribute to access the underlying integer value, similar
to the other date-time related scalars
([ARROW-18264](https://issues.apache.org/jira/browse/ARROW-18264)).
- Casting to string is now supported for duration
([ARROW-15822](https://issues.apache.org/jira/browse/ARROW-15822)) and
decimal
([ARROW-17458](https://issues.apache.org/jira/browse/ARROW-17458))
types, which also means those can now be written to CSV.

## Issues fixed:
- Do_action (from the Python server back to the Java client) now returns
a stream of results properly, and it alerts when the results are not
fully consumed by the client. These results will be used to send flow
control credits back from the Python side. For now we limit the results
to exactly 1, although it can be a stream.
- Fix a bug in the Python proxy server: when an unregistered action is
invoked, it should not parse and return the results.
lriggs pushed a commit to lriggs-arrow-org/arrow that referenced this pull request May 10, 2023
…mavencentral (apache#14472)

Lead-authored-by: Rok Mihevc <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Jacob Wujciak-Jens <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…mavencentral (apache#14472)

Lead-authored-by: Rok Mihevc <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Jacob Wujciak-Jens <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>