Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -522,7 +522,7 @@ private void throwUnsupportedException(int requiredCapacity, Throwable cause) {
* After writing array elements to the child column vector, call this method to set the offset and
* length of the written array.
*/
public void arrayWriteEnd(int rowId, int offset, int length) {
public void putArrayOffsetAndLength(int rowId, int offset, int length) {
putInt(2 * rowId, offset);
putInt(2 * rowId + 1, length);
}
Expand Down Expand Up @@ -563,7 +563,7 @@ public final Array getArray(int rowId) {
*/
public int putByteArray(int rowId, byte[] value, int offset, int length) {
int result = arrayData().appendBytes(length, value, offset);
arrayWriteEnd(rowId, result, length);
putArrayOffsetAndLength(rowId, result, length);
return result;
}

Expand Down Expand Up @@ -829,13 +829,13 @@ public final int appendDoubles(int length, double[] src, int offset) {
public final int appendByteArray(byte[] value, int offset, int length) {
int copiedOffset = arrayData().appendBytes(length, value, offset);
reserve(elementsAppended + 1);
arrayWriteEnd(elementsAppended, copiedOffset, length);
putArrayOffsetAndLength(elementsAppended, copiedOffset, length);
return elementsAppended++;
}

public final int appendArray(int length) {
reserve(elementsAppended + 1);
arrayWriteEnd(elementsAppended, arrayData().elementsAppended, length);
putArrayOffsetAndLength(elementsAppended, arrayData().elementsAppended, length);
return elementsAppended++;
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,8 @@ public final class OffHeapColumnVector extends ColumnVector {
// The data stored in these two allocations need to maintain binary compatible. We can
// directly pass this buffer to external components.
private long nulls;
// The actually data of this column vector will be store here. If it's an array column vector,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: will be store -> will be stored ?

// we will store the offsets and lengths here, and store the element data in child column vector.
private long data;

protected OffHeapColumnVector(int capacity, DataType type) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,8 @@ public final class OnHeapColumnVector extends ColumnVector {
// Array for each type. Only 1 is populated for any type.
private byte[] byteData;
private short[] shortData;
// This is not used used to store data for int column vector, but also can store offsets and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is not used used to store -> This is not used to store ?

// lengths for array column vector.
private int[] intData;
Copy link
Member

@viirya viirya Jun 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question I just have is, the capacity of ColumnVector is bound by MAX_CAPACITY. Previously we store offset and length individually, so we can have MAX_CAPACITY arrays at most. Now we store offset and length together in data/intData which is bound to MAX_CAPACITY, doesn't it say we can just have MAX_CAPACITY / 2 arrays at most?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Is it possible to use longData, which has a pair of 32-bit offset and length, to keep MAX_CAPACITY array length?

Copy link
Member

@viirya viirya Jun 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I see. We only check the limit of MAX_CAPACITY before actually going into reserveInternal.

Even we pass this check, we still face problem when allocating intData, see the comment below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kiszk Do you meant we store a pair of offset/length together as an element in longData?

private long[] longData;
private float[] floatData;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -631,7 +631,7 @@ class ColumnarBatchSuite extends SparkFunSuite {
assert(column.arrayData().elementsAppended == 17)

// Put the same "ll" at offset. This should not allocate more memory in the column.
column.arrayWriteEnd(idx, offset, 2)
column.putArrayOffsetAndLength(idx, offset, 2)
reference += "ll"
idx += 1
assert(column.arrayData().elementsAppended == 17)
Expand Down Expand Up @@ -667,10 +667,10 @@ class ColumnarBatchSuite extends SparkFunSuite {
}

// Populate it with arrays [0], [1, 2], [], [3, 4, 5]
column.arrayWriteEnd(0, 0, 1)
column.arrayWriteEnd(1, 1, 2)
column.arrayWriteEnd(2, 2, 0)
column.arrayWriteEnd(3, 3, 3)
column.putArrayOffsetAndLength(0, 0, 1)
column.putArrayOffsetAndLength(1, 1, 2)
column.putArrayOffsetAndLength(2, 2, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be column.putArrayOffsetAndLength(2, 3, 0) ?

column.putArrayOffsetAndLength(3, 3, 3)

val a1 = ColumnVectorUtils.toPrimitiveJavaArray(column.getArray(0)).asInstanceOf[Array[Int]]
val a2 = ColumnVectorUtils.toPrimitiveJavaArray(column.getArray(1)).asInstanceOf[Array[Int]]
Expand Down Expand Up @@ -703,7 +703,7 @@ class ColumnarBatchSuite extends SparkFunSuite {
data.reserve(array.length)
assert(data.capacity == array.length * 2)
data.putInts(0, array.length, array, 0)
column.arrayWriteEnd(0, 0, array.length)
column.putArrayOffsetAndLength(0, 0, array.length)
assert(ColumnVectorUtils.toPrimitiveJavaArray(column.getArray(0)).asInstanceOf[Array[Int]]
=== array)
}}
Expand Down