| 1 | +--- |
| 2 | +layout: default |
| 3 | +title: Apache Arrow 0.5.0 Release |
| 4 | +permalink: /release/0.5.0.html |
| 5 | +--- |
| 6 | +<!-- |
| 7 | +{% comment %} |
| 8 | +Licensed to the Apache Software Foundation (ASF) under one or more |
| 9 | +contributor license agreements. See the NOTICE file distributed with |
| 10 | +this work for additional information regarding copyright ownership. |
| 11 | +The ASF licenses this file to you under the Apache License, Version 2.0 |
| 12 | +(the "License"); you may not use this file except in compliance with |
| 13 | +the License. You may obtain a copy of the License at |
| 14 | + |
| 15 | +http://www.apache.org/licenses/LICENSE-2.0 |
| 16 | + |
| 17 | +Unless required by applicable law or agreed to in writing, software |
| 18 | +distributed under the License is distributed on an "AS IS" BASIS, |
| 19 | +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| 20 | +See the License for the specific language governing permissions and |
| 21 | +limitations under the License. |
| 22 | +{% endcomment %} |
| 23 | +--> |
| 24 | + |
| 25 | +# Apache Arrow 0.5.0 (23 July 2017) |
| 26 | + |
| 27 | +This is a major release, with expanded features in the supported languages and |
| 28 | +additional integration test coverage between Java and C++. |
| 29 | + |
| 30 | +Read more in the [release blog post][8]. |
| 31 | + |
| 32 | +## Download |
| 33 | + |
| 34 | +* [**Source Artifacts**][6] |
| 35 | +* [Git tag][2] |
| 36 | + |
| 37 | +## Contributors |
| 38 | + |
| 39 | +```shell |
| 40 | +$ git shortlog -sn apache-arrow-0.4.1..apache-arrow-0.5.0 |
| 41 | + 42 Wes McKinney |
| 42 | + 22 Uwe L. Korn |
| 43 | + 12 Kouhei Sutou |
| 44 | + 9 Max Risuhin |
| 45 | + 9 Phillip Cloud |
| 46 | + 6 Philipp Moritz |
| 47 | + 5 Steven Phillips |
| 48 | + 3 Julien Le Dem |
| 49 | + 2 Bryan Cutler |
| 50 | + 2 Kengo Seki |
| 51 | + 2 Max Risukhin |
| 52 | + 2 fjetter |
| 53 | + 1 Antony Mayi |
| 54 | + 1 Deepak Majeti |
| 55 | + 1 Fang Zheng |
| 56 | + 1 Hideo Hattori |
| 57 | + 1 Holden Karau |
| 58 | + 1 Itai Incze |
| 59 | + 1 Jeff Knupp |
| 60 | + 1 LynnYuan |
| 61 | + 1 Mark Lavrynenko |
| 62 | + 1 Michael König |
| 63 | + 1 Robert Nishihara |
| 64 | + 1 Sudheesh Katkam |
| 65 | + 1 Zahari |
| 66 | + 1 vkorukanti |
| 67 | +``` |
| 68 | + |
| 69 | +# Changelog |
| 70 | + |
| 71 | +## New Features and Improvements |
| 72 | + |
| 73 | +* [ARROW-1041](https://issues.apache.org/jira/browse/ARROW-1041) - [Python] Support read_pandas on a directory of Parquet files |
| 74 | +* [ARROW-1048](https://issues.apache.org/jira/browse/ARROW-1048) - Allow user LD_LIBRARY_PATH to be used with source release script |
| 75 | +* [ARROW-1052](https://issues.apache.org/jira/browse/ARROW-1052) - Arrow 0.5.0 release |
| 76 | +* [ARROW-1073](https://issues.apache.org/jira/browse/ARROW-1073) - C++: Adaptive integer builder |
| 77 | +* [ARROW-1095](https://issues.apache.org/jira/browse/ARROW-1095) - [Website] Add Arrow icon asset |
| 78 | +* [ARROW-1100](https://issues.apache.org/jira/browse/ARROW-1100) - [Python] Add "mode" property to NativeFile instances |
| 79 | +* [ARROW-1102](https://issues.apache.org/jira/browse/ARROW-1102) - Make MessageSerializer.serializeMessage() public |
| 80 | +* [ARROW-111](https://issues.apache.org/jira/browse/ARROW-111) - [C++] Add static analyzer to tool chain to verify checking of Status returns |
| 81 | +* [ARROW-1120](https://issues.apache.org/jira/browse/ARROW-1120) - [Python] Write support for int96 |
| 82 | +* [ARROW-1122](https://issues.apache.org/jira/browse/ARROW-1122) - [Website] Guest blog post on Arrow + ODBC from turbodbc |
| 83 | +* [ARROW-1123](https://issues.apache.org/jira/browse/ARROW-1123) - C++: Make jemalloc the default allocator |
| 84 | +* [ARROW-1135](https://issues.apache.org/jira/browse/ARROW-1135) - Upgrade Travis CI clang builds to use LLVM 4.0 |
| 85 | +* [ARROW-1137](https://issues.apache.org/jira/browse/ARROW-1137) - Python: Ensure Pandas roundtrip of all-None column |
| 86 | +* [ARROW-1142](https://issues.apache.org/jira/browse/ARROW-1142) - [C++] Move over compression library toolchain from parquet-cpp |
| 87 | +* [ARROW-1145](https://issues.apache.org/jira/browse/ARROW-1145) - [GLib] Add get_values() |
| 88 | +* [ARROW-1146](https://issues.apache.org/jira/browse/ARROW-1146) - Add .gitignore for *_generated.h files in src/plasma/format |
| 89 | +* [ARROW-1148](https://issues.apache.org/jira/browse/ARROW-1148) - [C++] Raise minimum CMake version to 3.2 |
| 90 | +* [ARROW-1151](https://issues.apache.org/jira/browse/ARROW-1151) - [C++] Add gcc branch prediction to status check macro |
| 91 | +* [ARROW-1154](https://issues.apache.org/jira/browse/ARROW-1154) - [C++] Migrate more computational utility code from parquet-cpp |
| 92 | +* [ARROW-1160](https://issues.apache.org/jira/browse/ARROW-1160) - C++: Implement DictionaryBuilder |
| 93 | +* [ARROW-1165](https://issues.apache.org/jira/browse/ARROW-1165) - [C++] Refactor PythonDecimalToArrowDecimal to not use templates |
| 94 | +* [ARROW-1172](https://issues.apache.org/jira/browse/ARROW-1172) - [C++] Use unique_ptr with array builder classes |
| 95 | +* [ARROW-1183](https://issues.apache.org/jira/browse/ARROW-1183) - [Python] Implement time type conversions in to_pandas |
| 96 | +* [ARROW-1185](https://issues.apache.org/jira/browse/ARROW-1185) - [C++] Clean up arrow::Status implementation, add warn_unused_result attribute for clang |
| 97 | +* [ARROW-1187](https://issues.apache.org/jira/browse/ARROW-1187) - Serialize a DataFrame with None column |
| 98 | +* [ARROW-1193](https://issues.apache.org/jira/browse/ARROW-1193) - [C++] Support pkg-config for arrow_python.so |
| 99 | +* [ARROW-1196](https://issues.apache.org/jira/browse/ARROW-1196) - [C++] Appveyor separate jobs for Debug/Release builds from sources; Build with conda toolchain; Build with NMake Makefiles Generator |
| 100 | +* [ARROW-1198](https://issues.apache.org/jira/browse/ARROW-1198) - Python: Add public C++ API to unwrap PyArrow object |
| 101 | +* [ARROW-1199](https://issues.apache.org/jira/browse/ARROW-1199) - [C++] Introduce mutable POD struct for generic array data |
| 102 | +* [ARROW-1202](https://issues.apache.org/jira/browse/ARROW-1202) - Remove semicolons from status macros |
| 103 | +* [ARROW-1212](https://issues.apache.org/jira/browse/ARROW-1212) - [GLib] Add garrow_binary_array_get_offsets_buffer() |
| 104 | +* [ARROW-1214](https://issues.apache.org/jira/browse/ARROW-1214) - [Python] Add classes / functions to enable stream message components to be handled outside of the stream reader class |
| 105 | +* [ARROW-1217](https://issues.apache.org/jira/browse/ARROW-1217) - [GLib] Add GInputStream based arrow::io::RandomAccessFile |
| 106 | +* [ARROW-1220](https://issues.apache.org/jira/browse/ARROW-1220) - [C++] Standardize usage of *_HOME cmake script variables for 3rd party libs |
| 107 | +* [ARROW-1221](https://issues.apache.org/jira/browse/ARROW-1221) - [C++] Pin clang-format version |
| 108 | +* [ARROW-1227](https://issues.apache.org/jira/browse/ARROW-1227) - [GLib] Support GOutputStream |
| 109 | +* [ARROW-1228](https://issues.apache.org/jira/browse/ARROW-1228) - [GLib] Test file name should be the same name as target class |
| 110 | +* [ARROW-1229](https://issues.apache.org/jira/browse/ARROW-1229) - [GLib] Follow Reader API change (get -> read) |
| 111 | +* [ARROW-1233](https://issues.apache.org/jira/browse/ARROW-1233) - [C++] Validate cmake script resolving of 3rd party linked libs from correct location in toolchain build |
| 112 | +* [ARROW-460](https://issues.apache.org/jira/browse/ARROW-460) - [C++] Implement JSON round trip for DictionaryArray |
| 113 | +* [ARROW-462](https://issues.apache.org/jira/browse/ARROW-462) - [C++] Implement in-memory conversions between non-nested primitive types and DictionaryArray equivalent |
| 114 | +* [ARROW-575](https://issues.apache.org/jira/browse/ARROW-575) - Python: Auto-detect nested lists and nested numpy arrays in Pandas |
| 115 | +* [ARROW-597](https://issues.apache.org/jira/browse/ARROW-597) - [Python] Add convenience function to yield DataFrame from any object that a StreamReader or FileReader can read from |
| 116 | +* [ARROW-599](https://issues.apache.org/jira/browse/ARROW-599) - [C++] Add LZ4 codec to 3rd-party toolchain |
| 117 | +* [ARROW-600](https://issues.apache.org/jira/browse/ARROW-600) - [C++] Add ZSTD codec to 3rd-party toolchain |
| 118 | +* [ARROW-692](https://issues.apache.org/jira/browse/ARROW-692) - Java<->C++ Integration tests for dictionary-encoded vectors |
| 119 | +* [ARROW-693](https://issues.apache.org/jira/browse/ARROW-693) - [Java] Add JSON support for dictionary vectors |
| 120 | +* [ARROW-742](https://issues.apache.org/jira/browse/ARROW-742) - Handling exceptions during execution of std::wstring_convert |
| 121 | +* [ARROW-834](https://issues.apache.org/jira/browse/ARROW-834) - [Python] Support creating Arrow arrays from Python iterables |
| 122 | +* [ARROW-915](https://issues.apache.org/jira/browse/ARROW-915) - Struct Array reads limited support |
| 123 | +* [ARROW-935](https://issues.apache.org/jira/browse/ARROW-935) - [Java] Build Javadoc in Travis CI |
| 124 | +* [ARROW-960](https://issues.apache.org/jira/browse/ARROW-960) - [Python] Add source build guide for macOS + Homebrew |
| 125 | +* [ARROW-962](https://issues.apache.org/jira/browse/ARROW-962) - [Python] Add schema attribute to FileReader |
| 126 | +* [ARROW-966](https://issues.apache.org/jira/browse/ARROW-966) - [Python] pyarrow.list_ should also accept Field instance |
| 127 | +* [ARROW-978](https://issues.apache.org/jira/browse/ARROW-978) - [Python] Use sphinx-bootstrap-theme for Sphinx documentation |
| 128 | + |
| 129 | +## Bug Fixes |
| 130 | + |
| 131 | +* [ARROW-1074](https://issues.apache.org/jira/browse/ARROW-1074) - from_pandas doesn't convert ndarray to list |
| 132 | +* [ARROW-1079](https://issues.apache.org/jira/browse/ARROW-1079) - [Python] Empty "private" directories should be ignored by Parquet interface |
| 133 | +* [ARROW-1081](https://issues.apache.org/jira/browse/ARROW-1081) - C++: arrow::test::TestBase::MakePrimitive doesn't fill null_bitmap |
| 134 | +* [ARROW-1096](https://issues.apache.org/jira/browse/ARROW-1096) - [C++] Memory mapping file over 4GB fails on Windows |
| 135 | +* [ARROW-1097](https://issues.apache.org/jira/browse/ARROW-1097) - Reading tensor needs file to be opened in writeable mode |
| 136 | +* [ARROW-1098](https://issues.apache.org/jira/browse/ARROW-1098) - Document Error? |
| 137 | +* [ARROW-1101](https://issues.apache.org/jira/browse/ARROW-1101) - UnionListWriter is not implementing all methods on interface ScalarWriter |
| 138 | +* [ARROW-1103](https://issues.apache.org/jira/browse/ARROW-1103) - [Python] Utilize pandas metadata from common _metadata Parquet file if it exists |
| 139 | +* [ARROW-1107](https://issues.apache.org/jira/browse/ARROW-1107) - [JAVA] NullableMapVector getField() should return nullable type |
| 140 | +* [ARROW-1108](https://issues.apache.org/jira/browse/ARROW-1108) - Check if ArrowBuf is empty buffer in getActualConsumedMemory() and getPossibleConsumedMemory() |
| 141 | +* [ARROW-1109](https://issues.apache.org/jira/browse/ARROW-1109) - [JAVA] transferOwnership fails when readerIndex is not 0 |
| 142 | +* [ARROW-1110](https://issues.apache.org/jira/browse/ARROW-1110) - [JAVA] make union vector naming consistent |
| 143 | +* [ARROW-1111](https://issues.apache.org/jira/browse/ARROW-1111) - [JAVA] Make aligning buffers optional, and allow -1 for unknown null count |
| 144 | +* [ARROW-1112](https://issues.apache.org/jira/browse/ARROW-1112) - [JAVA] Set lastSet for VarLength and List vectors when loading |
| 145 | +* [ARROW-1113](https://issues.apache.org/jira/browse/ARROW-1113) - [C++] gflags EP build gets triggered (as a no-op) on subsequent calls to make or ninja build |
| 146 | +* [ARROW-1115](https://issues.apache.org/jira/browse/ARROW-1115) - [C++] Use absolute path for ccache |
| 147 | +* [ARROW-1117](https://issues.apache.org/jira/browse/ARROW-1117) - [Docs] Minor issues in GLib README |
| 148 | +* [ARROW-1124](https://issues.apache.org/jira/browse/ARROW-1124) - [Python] pyarrow needs to depend on numpy>=1.10 (not 1.9) |
| 149 | +* [ARROW-1125](https://issues.apache.org/jira/browse/ARROW-1125) - Python: Table.from_pandas doesn't work anymore on partial schemas |
| 150 | +* [ARROW-1128](https://issues.apache.org/jira/browse/ARROW-1128) - [Docs] command to build a wheel is not properly rendered |
| 151 | +* [ARROW-1129](https://issues.apache.org/jira/browse/ARROW-1129) - [C++] Fix Linux toolchain build regression from ARROW-742 |
| 152 | +* [ARROW-1131](https://issues.apache.org/jira/browse/ARROW-1131) - Python: Parquet unit tests are always skipped |
| 153 | +* [ARROW-1132](https://issues.apache.org/jira/browse/ARROW-1132) - [Python] Unable to write pandas DataFrame w/MultiIndex containing duplicate values to parquet |
| 154 | +* [ARROW-1136](https://issues.apache.org/jira/browse/ARROW-1136) - [C++/Python] Segfault on empty stream |
| 155 | +* [ARROW-1138](https://issues.apache.org/jira/browse/ARROW-1138) - Travis: Use OpenJDK7 instead of OracleJDK7 |
| 156 | +* [ARROW-1139](https://issues.apache.org/jira/browse/ARROW-1139) - [C++] dlmalloc doesn't allow arrow to be built with clang 4 or gcc 7.1.1 |
| 157 | +* [ARROW-1141](https://issues.apache.org/jira/browse/ARROW-1141) - on import get libjemalloc.so.2: cannot allocate memory in static TLS block |
| 158 | +* [ARROW-1143](https://issues.apache.org/jira/browse/ARROW-1143) - C++: Fix comparison of NullArray |
| 159 | +* [ARROW-1144](https://issues.apache.org/jira/browse/ARROW-1144) - [C++] Remove unused variable |
| 160 | +* [ARROW-1147](https://issues.apache.org/jira/browse/ARROW-1147) - [C++] Allow optional vendoring of flatbuffers in plasma |
| 161 | +* [ARROW-1150](https://issues.apache.org/jira/browse/ARROW-1150) - [C++] AdaptiveIntBuilder compiler warning on MSVC |
| 162 | +* [ARROW-1152](https://issues.apache.org/jira/browse/ARROW-1152) - [Cython] read_tensor should work with a readable file |
| 163 | +* [ARROW-1155](https://issues.apache.org/jira/browse/ARROW-1155) - segmentation fault when run pa.Int16Value() |
| 164 | +* [ARROW-1157](https://issues.apache.org/jira/browse/ARROW-1157) - C++/Python: Decimal templates are not correctly exported on OSX |
| 165 | +* [ARROW-1159](https://issues.apache.org/jira/browse/ARROW-1159) - [C++] Static data members cannot be accessed from inline functions in Arrow headers by thirdparty users |
| 166 | +* [ARROW-1162](https://issues.apache.org/jira/browse/ARROW-1162) - Transfer Between Empty Lists Should Not Invoke Callback |
| 167 | +* [ARROW-1166](https://issues.apache.org/jira/browse/ARROW-1166) - Errors in Struct type's example and missing reference in Layout.md |
| 168 | +* [ARROW-1167](https://issues.apache.org/jira/browse/ARROW-1167) - [Python] Create chunked BinaryArray in Table.from_pandas when a column's data exceeds 2GB |
| 169 | +* [ARROW-1168](https://issues.apache.org/jira/browse/ARROW-1168) - [Python] pandas metadata may contain "mixed" data types |
| 170 | +* [ARROW-1169](https://issues.apache.org/jira/browse/ARROW-1169) - C++: jemalloc externalproject doesn't build with CMake's ninja generator |
| 171 | +* [ARROW-1170](https://issues.apache.org/jira/browse/ARROW-1170) - C++: ARROW_JEMALLOC=OFF breaks linking on unittest |
| 172 | +* [ARROW-1174](https://issues.apache.org/jira/browse/ARROW-1174) - [GLib] Investigate root cause of ListArray glib test failure |
| 173 | +* [ARROW-1177](https://issues.apache.org/jira/browse/ARROW-1177) - [C++] Detect int32 overflow in ListBuilder::Append |
| 174 | +* [ARROW-1179](https://issues.apache.org/jira/browse/ARROW-1179) - C++: Add missing virtual destructors |
| 175 | +* [ARROW-1180](https://issues.apache.org/jira/browse/ARROW-1180) - [GLib] garrow_tensor_get_dimension_name() returns invalid address |
| 176 | +* [ARROW-1181](https://issues.apache.org/jira/browse/ARROW-1181) - [Python] Parquet test fail if not enabled |
| 177 | +* [ARROW-1182](https://issues.apache.org/jira/browse/ARROW-1182) - C++: Specify BUILD_BYPRODUCTS for zlib and zstd |
| 178 | +* [ARROW-1186](https://issues.apache.org/jira/browse/ARROW-1186) - [C++] Enable option to build arrow with minimal dependencies needed to build Parquet library |
| 179 | +* [ARROW-1188](https://issues.apache.org/jira/browse/ARROW-1188) - Segfault when trying to serialize a DataFrame with Null-only Categorical Column |
| 180 | +* [ARROW-1190](https://issues.apache.org/jira/browse/ARROW-1190) - VectorLoader corrupts vectors with duplicate names |
| 181 | +* [ARROW-1191](https://issues.apache.org/jira/browse/ARROW-1191) - [JAVA] Implement getField() method for the complex readers |
| 182 | +* [ARROW-1194](https://issues.apache.org/jira/browse/ARROW-1194) - Getting record batch size with pa.get_record_batch_size returns a size that is too small for pandas DataFrame. |
| 183 | +* [ARROW-1197](https://issues.apache.org/jira/browse/ARROW-1197) - [GLib] record_batch.hpp Inclusion is missing |
| 184 | +* [ARROW-1200](https://issues.apache.org/jira/browse/ARROW-1200) - [C++] DictionaryBuilder should use signed integers for indices |
| 185 | +* [ARROW-1201](https://issues.apache.org/jira/browse/ARROW-1201) - [Python] Incomplete Python types cause a core dump when repr-ing |
| 186 | +* [ARROW-1203](https://issues.apache.org/jira/browse/ARROW-1203) - [C++] Disallow BinaryBuilder to append byte strings larger than the maximum value of int32_t |
| 187 | +* [ARROW-1205](https://issues.apache.org/jira/browse/ARROW-1205) - C++: Reference to type objects in ArrayLoader may cause segmentation faults. |
| 188 | +* [ARROW-1206](https://issues.apache.org/jira/browse/ARROW-1206) - [C++] Enable MSVC builds to work with some compression library support disabled |
| 189 | +* [ARROW-1208](https://issues.apache.org/jira/browse/ARROW-1208) - [C++] Toolchain build with ZSTD library from conda-forge failure |
| 190 | +* [ARROW-1215](https://issues.apache.org/jira/browse/ARROW-1215) - [Python] Class methods in API reference |
| 191 | +* [ARROW-1216](https://issues.apache.org/jira/browse/ARROW-1216) - Numpy arrays cannot be created from Arrow Buffers on Python 2 |
| 192 | +* [ARROW-1218](https://issues.apache.org/jira/browse/ARROW-1218) - Arrow doesn't compile if all compression libraries are deactivated |
| 193 | +* [ARROW-1222](https://issues.apache.org/jira/browse/ARROW-1222) - [Python] pyarrow.array returns NullArray for array of unsupported Python objects |
| 194 | +* [ARROW-1223](https://issues.apache.org/jira/browse/ARROW-1223) - [GLib] Fix function name that returns wrapped object |
| 195 | +* [ARROW-1235](https://issues.apache.org/jira/browse/ARROW-1235) - [C++] macOS linker failure with operator<< and std::ostream |
| 196 | +* [ARROW-1236](https://issues.apache.org/jira/browse/ARROW-1236) - Library paths in exported pkg-config file are incorrect |
| 197 | +* [ARROW-601](https://issues.apache.org/jira/browse/ARROW-601) - Some logical types not supported when loading Parquet |
| 198 | +* [ARROW-784](https://issues.apache.org/jira/browse/ARROW-784) - Cleaning up thirdparty toolchain support in Arrow on Windows |
| 199 | +* [ARROW-992](https://issues.apache.org/jira/browse/ARROW-992) - [Python] In place development builds do not have a __version__ |
| 200 | + |
| 201 | +[2]: https://github.com/apache/arrow/releases/tag/apache-arrow-0.5.0 |
| 202 | +[6]: https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.5.0/ |
| 203 | +[8]: http://arrow.apache.org/blog/2017/07/23/0.5.0-release/ |