forked from pentaho/hive-0.7.0
-
Notifications
You must be signed in to change notification settings - Fork 0
/
RELEASE_NOTES.txt
524 lines (510 loc) · 36.9 KB
/
RELEASE_NOTES.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
Release Notes - Hive - Version 0.7.0
** New Feature
* [HIVE-78] - Authorization infrastructure for Hive
* [HIVE-417] - Implement Indexing in Hive
* [HIVE-471] - Add reflect() UDF for reflective invocation of Java methods
* [HIVE-537] - Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
* [HIVE-842] - Authentication Infrastructure for Hive
* [HIVE-1096] - Hive Variables
* [HIVE-1293] - Concurrency Model for Hive
* [HIVE-1304] - add row_sequence UDF
* [HIVE-1405] - hive command line option -i to run an init file before other SQL commands
* [HIVE-1408] - add option to let hive automatically run in local mode based on tunable heuristics
* [HIVE-1413] - bring a table/partition offline
* [HIVE-1438] - sentences() UDF for natural language tokenization
* [HIVE-1481] - ngrams() UDAF for estimating top-k n-gram frequencies
* [HIVE-1514] - Be able to modify a partition's fileformat and file location information.
* [HIVE-1518] - context_ngrams() UDAF for estimating top-k contextual n-grams
* [HIVE-1528] - Add json_tuple() UDTF function
* [HIVE-1529] - Add ANSI SQL covariance aggregate functions: covar_pop and covar_samp.
* [HIVE-1549] - Add ANSI SQL correlation aggregate function CORR(X,Y).
* [HIVE-1609] - Support partition filtering in metastore
* [HIVE-1624] - Patch to allows scripts in S3 location
* [HIVE-1636] - Implement "SHOW TABLES {FROM | IN} db_name"
* [HIVE-1659] - parse_url_tuple: a UDTF version of parse_url
* [HIVE-1661] - Default values for parameters
* [HIVE-1779] - Implement GenericUDF str_to_map
* [HIVE-1790] - Patch to support HAVING clause in Hive
* [HIVE-1792] - track the joins which are being converted to map-join automatically
* [HIVE-1818] - Call frequency and duration metrics for HiveMetaStore via jmx
* [HIVE-1819] - maintain lastAccessTime in the metastore
* [HIVE-1820] - Make Hive database data center aware
* [HIVE-1827] - Add a new local mode flag in Task.
* [HIVE-1835] - Better auto-complete for Hive
* [HIVE-1840] - Support ALTER DATABASE to change database properties
* [HIVE-1856] - Implement DROP TABLE/VIEW ... IF EXISTS
* [HIVE-1858] - Implement DROP {PARTITION, INDEX, TEMPORARY FUNCTION} IF EXISTS
* [HIVE-1881] - Make the MetaStore filesystem interface pluggable via the hive.metastore.fs.handler.class configuration property
* [HIVE-1889] - add an option (hive.index.compact.file.ignore.hdfs) to ignore HDFS location stored in index files.
* [HIVE-1971] - Verbose/echo mode for the Hive CLI
** Improvement
* [HIVE-138] - Provide option to export a HEADER
* [HIVE-474] - Support for distinct selection on two or more columns
* [HIVE-558] - describe extended table/partition output is cryptic
* [HIVE-1126] - Missing some Jdbc functionality like getTables getColumns and HiveResultSet.get* methods based on column name.
* [HIVE-1211] - Tapping logs from child processes
* [HIVE-1226] - support filter pushdown against non-native tables
* [HIVE-1229] - replace dependencies on HBase deprecated API
* [HIVE-1235] - use Ivy for fetching HBase dependencies
* [HIVE-1264] - Make Hive work with Hadoop security
* [HIVE-1378] - Return value for map, array, and struct needs to return a string
* [HIVE-1394] - do not update transient_lastDdlTime if the partition is modified by a housekeeping operation
* [HIVE-1414] - automatically invoke .hiverc init script
* [HIVE-1415] - add CLI command for executing a SQL script
* [HIVE-1430] - serializing/deserializing the query plan is useless and expensive
* [HIVE-1441] - Extend ivy offline mode to cover metastore downloads
* [HIVE-1443] - Add support to turn off bucketing with ALTER TABLE
* [HIVE-1447] - Speed up reflection method calls in GenericUDFBridge and GenericUDAFBridge
* [HIVE-1456] - potentail NullPointerException
* [HIVE-1463] - hive output file names are unnecessarily large
* [HIVE-1469] - replace isArray() calls and remove LOG.isInfoEnabled() in Operator.forward()
* [HIVE-1495] - supply correct information to hooks and lineage for index rebuild
* [HIVE-1497] - support COMMENT clause on CREATE INDEX, and add new command for SHOW INDEXES
* [HIVE-1498] - support IDXPROPERTIES on CREATE INDEX
* [HIVE-1512] - Need to get hive_hbase-handler to work with hbase versions 0.20.4 0.20.5 and cloudera CDH3 version
* [HIVE-1513] - hive starter scripts should load admin/user supplied script for configurability
* [HIVE-1517] - ability to select across a database
* [HIVE-1533] - Use ZooKeeper from maven
* [HIVE-1536] - Add support for JDBC PreparedStatements
* [HIVE-1546] - Ability to plug custom Semantic Analyzers for Hive Grammar
* [HIVE-1581] - CompactIndexInputFormat should create split only for files in the index output file.
* [HIVE-1605] - regression and improvements in handling NULLs in joins
* [HIVE-1611] - Add alternative search-provider to Hive site
* [HIVE-1616] - Add ProtocolBuffersStructObjectInspector
* [HIVE-1617] - ScriptOperator's AutoProgressor can lead to an infinite loop
* [HIVE-1622] - Use CombineHiveInputFormat for the merge job if hive.merge.mapredfiles=true
* [HIVE-1638] - convert commonly used udfs to generic udfs
* [HIVE-1641] - add map joined table to distributed cache
* [HIVE-1642] - Convert join queries to map-join based on size of table/row
* [HIVE-1645] - ability to specify parent directory for zookeeper lock manager
* [HIVE-1655] - Adding consistency check at jobClose() when committing dynamic partitions
* [HIVE-1660] - Change get_partitions_ps to pass partition filter to database
* [HIVE-1692] - FetchOperator.getInputFormatFromCache hides causal exception
* [HIVE-1701] - drop support for pre-0.20 Hadoop versions
* [HIVE-1704] - remove Hadoop 0.17 specific test reference logs
* [HIVE-1738] - Optimize Key Comparison in GroupByOperator
* [HIVE-1743] - Group-by to determine equals of Keys in reverse order
* [HIVE-1746] - Support for using ALTER to set IDXPROPERTIES
* [HIVE-1749] - ExecMapper and ExecReducer: reduce function calls to l4j.isInfoEnabled()
* [HIVE-1750] - Remove Partition Filtering Conditions when Possible
* [HIVE-1751] - Optimize ColumnarStructObjectInspector.getStructFieldData()
* [HIVE-1754] - Remove JDBM component from Map Join
* [HIVE-1757] - test cleanup for Hive-1641
* [HIVE-1758] - optimize group by hash map memory
* [HIVE-1761] - Support show locks for a particular table
* [HIVE-1765] - Add queryid while locking
* [HIVE-1768] - Update transident_lastDdlTime only if not specified
* [HIVE-1782] - add more debug information for hive locking
* [HIVE-1783] - CommonJoinOperator optimize the case of 1:1 join
* [HIVE-1785] - change Pre/Post Query Hooks to take in 1 parameter: HookContext
* [HIVE-1786] - Improve documentation for str_to_map() UDF
* [HIVE-1787] - optimize the code path when there are no outer joins
* [HIVE-1796] - dumps time at which lock was taken along with the queryid in show locks <T> extended
* [HIVE-1797] - Compressed the hashtable dump file before put into distributed cache
* [HIVE-1798] - Clear empty files in Hive
* [HIVE-1801] - HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
* [HIVE-1811] - Show the time the local task takes
* [HIVE-1824] - create a new ZooKeeper instance when retrying lock, and more info for debug
* [HIVE-1831] - Add a option to run task to check map-join possibility in non-local mode
* [HIVE-1834] - more debugging for locking
* [HIVE-1843] - add an option in dynamic partition inserts to throw an error if 0 partitions are created
* [HIVE-1852] - Reduce unnecessary DFSClient.rename() calls
* [HIVE-1855] - Include Process ID in the log4j log file name
* [HIVE-1865] - redo zookeeper hive lock manager
* [HIVE-1899] - add a factory method for creating a synchronized wrapper for IMetaStoreClient
* [HIVE-1900] - a mapper should be able to span multiple partitions
* [HIVE-1907] - Store jobid in ExecDriver
* [HIVE-1910] - Provide config parameters to control cache object pinning
* [HIVE-1923] - Allow any type of stats publisher and aggregator in addition to HBase and JDBC
* [HIVE-1929] - Find a way to disable owner grants
* [HIVE-1931] - Improve the implementation of the METASTORE_CACHE_PINOBJTYPES config
* [HIVE-1948] - Have audit logging in the Metastore
* [HIVE-1956] - "Provide DFS initialization script for Hive
* [HIVE-1961] - Make Stats gathering more flexible with timeout and atomicity
* [HIVE-1962] - make a libthrift.jar and libfb303.jar in dist package for backward compatibility
* [HIVE-1970] - Modify build to run all tests regardless of subproject failures
* [HIVE-1978] - Hive SymlinkTextInputFormat does not estimate input size correctly
** Bug
* [HIVE-307] - "LOAD DATA LOCAL INPATH" fails when the table already contains a file of the same name
* [HIVE-741] - NULL is not handled correctly in join
* [HIVE-1203] - HiveInputFormat.getInputFormatFromCache "swallows" cause exception when throwing IOExcpetion
* [HIVE-1305] - add progress in join and groupby
* [HIVE-1376] - Simple UDAFs with more than 1 parameter crash on empty row query
* [HIVE-1385] - UDF field() doesn't work
* [HIVE-1416] - Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode
* [HIVE-1422] - skip counter update when RunningJob.getCounters() returns null
* [HIVE-1440] - FetchOperator(mapjoin) does not work with RCFile
* [HIVE-1448] - bug in 'set fileformat'
* [HIVE-1453] - Make Eclipse launch templates auto-adjust to Hive version number changes
* [HIVE-1462] - Reporting progress in FileSinkOperator works in multiple directory case
* [HIVE-1465] - hive-site.xml ${user.name} not replaced for local-file derby metastore connection URL
* [HIVE-1470] - percentile_approx() fails with more than 1 reducer
* [HIVE-1471] - CTAS should unescape the column name in the select-clause.
* [HIVE-1473] - plan file should have a high replication factor
* [HIVE-1475] - .gitignore files being placed in test warehouse directories causing build failure
* [HIVE-1489] - TestCliDriver -Doverwrite=true does not put the file in the correct directory
* [HIVE-1491] - fix or disable loadpart_err.q
* [HIVE-1494] - Index followup: remove sort by clause and fix a bug in collect_set udaf
* [HIVE-1501] - when generating reentrant INSERT for index rebuild, quote identifiers using backticks
* [HIVE-1508] - Add cleanup method to HiveHistory class
* [HIVE-1509] - Monitor the working set of the number of files
* [HIVE-1510] - HiveCombineInputFormat should not use prefix matching to find the partitionDesc for a given path
* [HIVE-1520] - hive.mapred.local.mem should only be used in case of local mode job submissions
* [HIVE-1523] - ql tests no longer work in miniMR mode
* [HIVE-1532] - Replace globStatus with listStatus inside Hive.java's replaceFiles.
* [HIVE-1534] - Join filters do not work correctly with outer joins
* [HIVE-1535] - alter partition should throw exception if the specified partition does not exist.
* [HIVE-1547] - Unarchiving operation throws NPE
* [HIVE-1548] - populate inputs and outputs for all statements
* [HIVE-1556] - Fix TestContribCliDriver test
* [HIVE-1561] - smb_mapjoin_8.q returns different results in miniMr mode
* [HIVE-1563] - HBase tests broken
* [HIVE-1564] - bucketizedhiveinputformat.q fails in minimr mode
* [HIVE-1570] - referencing an added file by it's name in a transform script does not work in hive local mode
* [HIVE-1578] - Add conf. property hive.exec.show.job.failure.debug.info to enable/disable displaying link to the task with most failures
* [HIVE-1580] - cleanup ExecDriver.progress
* [HIVE-1583] - Hive should not override Hadoop specific system properties
* [HIVE-1584] - wrong log files in contrib client positive
* [HIVE-1589] - Add HBase/ZK JARs to Eclipse classpath
* [HIVE-1593] - udtf_explode.q is an empty file
* [HIVE-1598] - use SequenceFile rather than TextFile format for hive query results
* [HIVE-1600] - need to sort hook input/output lists for test result determinism
* [HIVE-1601] - Hadoop 0.17 ant test broken by HIVE-1523
* [HIVE-1606] - For a null value in a string column, JDBC driver returns the string "NULL"
* [HIVE-1607] - Reinstate and deprecate IMetaStoreClient methods removed in HIVE-675
* [HIVE-1614] - UDTF json_tuple should return null row when input is not a valid JSON string
* [HIVE-1628] - Fix Base64TextInputFormat to be compatible with commons codec 1.4
* [HIVE-1629] - Patch to fix hashCode method in DoubleWritable class
* [HIVE-1630] - bug in NO_DROP
* [HIVE-1633] - CombineHiveInputFormat fails with "cannot find dir for emptyFile"
* [HIVE-1639] - ExecDriver.addInputPaths() error if partition name contains a comma
* [HIVE-1647] - Incorrect initialization of thread local variable inside IOContext ( implementation is not threadsafe )
* [HIVE-1650] - TestContribNegativeCliDriver fails
* [HIVE-1656] - All TestJdbcDriver test cases fail in Eclipse unless a property is added in run config
* [HIVE-1657] - join results are displayed wrongly for some complex joins using select *
* [HIVE-1658] - Fix describe * [extended] column formatting
* [HIVE-1663] - ql/src/java/org/apache/hadoop/hive/ql/parse/SamplePruner.java is empty
* [HIVE-1664] - Eclipse build broken
* [HIVE-1670] - MapJoin throws EOFExeption when the mapjoined table has 0 column selected
* [HIVE-1671] - multithreading on Context.pathToCS
* [HIVE-1673] - Create table bug causes the row format property lost when serde is specified.
* [HIVE-1674] - count(*) returns wrong result when a mapper returns empty results
* [HIVE-1678] - NPE in MapJoin
* [HIVE-1688] - In the MapJoinOperator, the code uses tag as alias, which is not always true
* [HIVE-1691] - ANALYZE TABLE command should check columns in partition spec
* [HIVE-1699] - incorrect partition pruning ANALYZE TABLE
* [HIVE-1707] - bug when different partitions are present in different dfs
* [HIVE-1711] - CREATE TABLE LIKE should not set stats in the new table
* [HIVE-1712] - Migrating metadata from derby to mysql thrown NullPointerException
* [HIVE-1713] - duplicated MapRedTask in Multi-table inserts mixed with FileSinkOperator and ReduceSinkOperator
* [HIVE-1716] - make TestHBaseCliDriver use dynamic ports to avoid conflicts with already-running services
* [HIVE-1717] - ant clean should delete stats database
* [HIVE-1720] - hbase_stats.q is failing
* [HIVE-1737] - Two Bugs for Estimating Row Sizes in GroupByOperator
* [HIVE-1742] - Fix Eclipse templates (and use Ivy metadata to generate Eclipse library dependencies)
* [HIVE-1748] - Statistics broken for tables with size in excess of Integer.MAX_VALUE
* [HIVE-1753] - HIVE 1633 hit for Stage2 jobs with CombineHiveInputFormat
* [HIVE-1756] - failures in fatal.q in TestNegativeCliDriver
* [HIVE-1759] - Many important broken links on Hive web page
* [HIVE-1760] - Mismatched open/commit transaction calls in case of connection retry
* [HIVE-1767] - Merge files does not work with dynamic partition
* [HIVE-1769] - pcr.q output is non-deterministic
* [HIVE-1771] - ROUND(infinity) chokes
* [HIVE-1775] - Assertation on inputObjInspectors.length in Groupy operator
* [HIVE-1776] - parallel execution and auto-local mode combine to place plan file in wrong file system
* [HIVE-1777] - Outdated comments for GenericUDTF.close()
* [HIVE-1780] - Typo in hive-default.xml
* [HIVE-1781] - outputs not populated for dynamic partitions at compile time
* [HIVE-1794] - GenericUDFOr and GenericUDFAnd cannot receive boolean typed object
* [HIVE-1795] - outputs not correctly populated for alter table
* [HIVE-1804] - Mapjoin will fail if there are no files associating with the join tables
* [HIVE-1806] - The merge criteria on dynamic partitons should be per partiton
* [HIVE-1807] - No Element found exception in BucketMapJoinOptimizer
* [HIVE-1808] - bug in auto_join25.q
* [HIVE-1809] - Hive comparison operators are broken for NaN values
* [HIVE-1812] - spurious rmr failure messages when inserting with dynamic partitioning
* [HIVE-1828] - show locks should not use getTable()/getPartition
* [HIVE-1829] - Fix intermittent failures in TestRemoteMetaStore
* [HIVE-1830] - mappers in group followed by joins may die OOM
* [HIVE-1844] - Hanging hive client caused by TaskRunner's OutOfMemoryError
* [HIVE-1845] - Some attributes in the Eclipse template file is deprecated
* [HIVE-1846] - change hive assumption that local mode mappers/reducers always run in same jvm
* [HIVE-1848] - bug in MAPJOIN
* [HIVE-1849] - add more logging to partition pruning
* [HIVE-1853] - downgrade JDO version
* [HIVE-1854] - Temporarily disable metastore tests for listPartitionsByFilter()
* [HIVE-1857] - mixed case tablename on lefthand side of LATERAL VIEW results in query failing with confusing error message
* [HIVE-1860] - Hive's smallint datatype is not supported by the Hive JDBC driver
* [HIVE-1861] - Hive's float datatype is not supported by the Hive JDBC driver
* [HIVE-1862] - Revive partition filtering in the Hive MetaStore
* [HIVE-1863] - Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
* [HIVE-1864] - test load_overwrite.q fails
* [HIVE-1867] - Add mechanism for disabling tests with intermittent failures
* [HIVE-1870] - TestRemoteHiveMetaStore.java accidentally deleted during commit of HIVE-1845
* [HIVE-1871] - bug introduced by HIVE-1806
* [HIVE-1873] - Fix 'tar' build target broken in HIVE-1526
* [HIVE-1874] - fix HBase filter pushdown broken by HIVE-1638
* [HIVE-1878] - Set the version of Hive trunk to '0.7.0-SNAPSHOT' to avoid confusing it with a release
* [HIVE-1896] - HBase and Contrib JAR names are missing version numbers
* [HIVE-1897] - Alter command execution "when HDFS is down" results in holding stale data in MetaStore
* [HIVE-1902] - create script for the metastore upgrade due to HIVE-78
* [HIVE-1903] - Can't join HBase tables if one's name is the beginning of the other
* [HIVE-1908] - FileHandler leak on partial iteration of the resultset.
* [HIVE-1912] - Double escaping special chars when removing old partitions in rmr
* [HIVE-1913] - use partition level serde properties
* [HIVE-1914] - failures in testhbaseclidriver
* [HIVE-1915] - authorization on database level is broken.
* [HIVE-1917] - CTAS (create-table-as-select) throws exception when showing results
* [HIVE-1927] - Fix TestHadoop20SAuthBridge failure on Hudson
* [HIVE-1928] - GRANT/REVOKE should handle privileges as tokens, not identifiers
* [HIVE-1934] - alter table rename messes the location
* [HIVE-1936] - hive.semantic.analyzer.hook cannot have multiple values
* [HIVE-1939] - Fix test failure in TestContribCliDriver/url_hook.q
* [HIVE-1944] - dynamic partition insert creating different directories for the same partition during merge
* [HIVE-1951] - input16_cc.q is failing in testminimrclidriver
* [HIVE-1952] - fix some outputs and make some tests deterministic
* [HIVE-1964] - add fully deterministic ORDER BY in test union22.q and input40.q
* [HIVE-1969] - TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
* [HIVE-1979] - fix hbase_bulk.m by setting HiveInputFormat
* [HIVE-1981] - TestHadoop20SAuthBridge failed on current trunk
* [HIVE-1995] - Mismatched open/commit transaction calls when using get_partition()
* [HIVE-1998] - Update README.txt and add missing ASF headers
* [HIVE-2007] - Executing queries using Hive Server is not logging to the log file specified in hive-log4j.properties
* [HIVE-2010] - Improve naming and README files for MetaStore upgrade scripts
* [HIVE-2011] - upgrade-0.6.0.mysql.sql script attempts to increase size of PK COLUMNS.TYPE_NAME to 4000
* [HIVE-2059] - Add datanucleus.identifierFactory property to HiveConf to avoid unintentional MetaStore Schema corruption
* [HIVE-2064] - Make call to SecurityUtil.getServerPrincipal unambiguous
** Sub-task
* [HIVE-1361] - table/partition level statistics
* [HIVE-1696] - Add delegation token support to metastore
* [HIVE-1810] - a followup patch for changing the description of hive.exec.pre/post.hooks in conf/hive-default.xml
* [HIVE-1823] - upgrade the database thrift interface to allow parameters key-value pairs
* [HIVE-1836] - Extend the CREATE DATABASE command with DBPROPERTIES
* [HIVE-1842] - Add the local flag to all the map red tasks, if the query is running locally.
** Task
* [HIVE-1526] - Hive should depend on a release version of Thrift
* [HIVE-1817] - Remove Hive dependency on unreleased commons-cli 2.0 Snapshot
* [HIVE-1876] - Update Metastore upgrade scripts to handle schema changes introduced in HIVE-1413
* [HIVE-1882] - Remove CHANGES.txt
* [HIVE-1904] - Create MetaStore schema upgrade scripts for changes made in HIVE-417
* [HIVE-1905] - Provide MetaStore schema upgrade scripts for changes made in HIVE-1823
** Test
* [HIVE-1464] - improve test query performance
* [HIVE-1755] - JDBM diff in test caused by Hive-1641
* [HIVE-1774] - merge_dynamic_part's result is not deterministic
* [HIVE-1942] - change the value of hive.input.format to CombineHiveInputFormat for tests
Release Notes - Hive - Version 0.6.0
** New Feature
* [HIVE-259] - Add PERCENTILE aggregate function
* [HIVE-675] - add database/schema support Hive QL
* [HIVE-705] - Hive HBase Integration (umbrella)
* [HIVE-801] - row-wise IN would be useful
* [HIVE-862] - CommandProcessor should return DriverResponse
* [HIVE-894] - add udaf max_n, min_n to contrib
* [HIVE-917] - Bucketed Map Join
* [HIVE-972] - support views
* [HIVE-1002] - multi-partition inserts
* [HIVE-1027] - Create UDFs for XPath expression evaluation
* [HIVE-1032] - Better Error Messages for Execution Errors
* [HIVE-1087] - Let user script write out binary data into a table
* [HIVE-1121] - CombinedHiveInputFormat for hadoop 19
* [HIVE-1127] - Add UDF to create struct
* [HIVE-1131] - Add column lineage information to the pre execution hooks
* [HIVE-1132] - Add metastore API method to get partition by name
* [HIVE-1134] - bucketing mapjoin where the big table contains more than 1 big partition
* [HIVE-1178] - enforce bucketing for a table
* [HIVE-1179] - Add UDF array_contains
* [HIVE-1193] - ensure sorting properties for a table
* [HIVE-1194] - sorted merge join
* [HIVE-1197] - create a new input format where a mapper spans a file
* [HIVE-1219] - More robust handling of metastore connection failures
* [HIVE-1238] - Get partitions with a partial specification
* [HIVE-1255] - Add mathematical UDFs PI, E, degrees, radians, tan, sign, and atan
* [HIVE-1270] - Thread pool size in Thrift metastore server should be configurable
* [HIVE-1272] - Add SymlinkTextInputFormat to Hive
* [HIVE-1278] - Partition name to values conversion conversion method
* [HIVE-1307] - More generic and efficient merge method
* [HIVE-1332] - Archiving partitions
* [HIVE-1351] - Tool to cat rcfiles
* [HIVE-1397] - histogram() UDAF for a numerical column
* [HIVE-1401] - Web Interface can ony browse default
* [HIVE-1410] - Add TCP keepalive option for the metastore server
* [HIVE-1439] - Alter the number of buckets for a table
** Bug
* [HIVE-287] - support count(*) and count distinct on multiple columns
* [HIVE-763] - getSchema returns invalid column names, getThriftSchema does not return old style string schemas
* [HIVE-1011] - GenericUDTFExplode() throws NPE when given nulls
* [HIVE-1022] - desc Table should work
* [HIVE-1029] - typedbytes does not support nulls
* [HIVE-1042] - function in a transform with more than 1 argument fails
* [HIVE-1056] - Predicate push down does not work with UDTF's
* [HIVE-1064] - NPE when operating HiveCLI in distributed mode
* [HIVE-1066] - TestContribCliDriver failure in serde_typedbytes.q, serde_typedbytes2.q, and serde_typedbytes3.q
* [HIVE-1075] - Make it possible for users to recover data when moveTask fails
* [HIVE-1085] - ColumnarSerde should not be the default Serde when user specified a fileformat using 'stored as'.
* [HIVE-1086] - Add "-Doffline=true" option to ant
* [HIVE-1090] - Skew Join does not work in distributed env.
* [HIVE-1092] - Conditional task does not increase finished job counter when filter job out.
* [HIVE-1094] - Disable streaming last table if there is a skew key in previous tables.
* [HIVE-1116] - bug with alter table rename when table has property EXTERNAL=FALSE
* [HIVE-1124] - create view should expand the query text consistently
* [HIVE-1125] - Hive CLI shows 'Ended Job=' at the beginning of the job
* [HIVE-1129] - Assertion in ExecDriver.execute when assertions are enabled in HADOOP_OPTS
* [HIVE-1142] - "datanucleus" typos in conf/hive-default.xml
* [HIVE-1167] - Use TreeMap instead of Property to make explain extended deterministic
* [HIVE-1174] - Job counter error if "hive.merge.mapfiles" equals true
* [HIVE-1176] - 'create if not exists' fails for a table name with 'select' in it
* [HIVE-1184] - Expression Not In Group By Key error is sometimes masked
* [HIVE-1185] - Fix RCFile resource leak when opening a non-RCFile
* [HIVE-1195] - Increase ObjectInspector[] length on demand
* [HIVE-1200] - Fix CombineHiveInputFormat to work with multi-level of directories in a single table/partition
* [HIVE-1204] - typedbytes: writing to stderr kills the mapper
* [HIVE-1205] - RowContainer should flush out dummy rows when the table desc is null
* [HIVE-1207] - ScriptOperator AutoProgressor does not set the interval
* [HIVE-1242] - CombineHiveInputFormat does not work for compressed text files
* [HIVE-1247] - hints cannot be passed to transform statements
* [HIVE-1252] - Task breaking bug when breaking after a filter operator
* [HIVE-1253] - date_sub() function returns wrong date because of daylight saving time difference
* [HIVE-1257] - joins between HBase tables and other tables (whether HBase or not) are broken
* [HIVE-1258] - set merge files to files when bucketing/sorting is being enforced
* [HIVE-1261] - ql.metadata.Hive#close() should check for null metaStoreClient
* [HIVE-1268] - Cannot start metastore thrift server on a specific port
* [HIVE-1271] - Case sensitiveness of type information specified when using custom reducer causes type mismatch
* [HIVE-1273] - UDF_Percentile NullPointerException
* [HIVE-1274] - bug in sort merge join if the big table does not have any row
* [HIVE-1275] - TestHBaseCliDriver hangs
* [HIVE-1277] - Select query with specific projection(s) fails if the local file system directory for ${hive.user.scratchdir} does not exist.
* [HIVE-1280] - problem in combinehiveinputformat with nested directories
* [HIVE-1281] - Bucketing column names in create table should be case-insensitive
* [HIVE-1286] - error/info message being emitted on standard output
* [HIVE-1290] - sort merge join does not work with bucketizedhiveinputformat
* [HIVE-1291] - Fix UDAFPercentile ndexOutOfBoundsException
* [HIVE-1294] - HIVE_AUX_JARS_PATH interferes with startup of Hive Web Interface
* [HIVE-1298] - unit test symlink_text_input_format.q needs ORDER BY for determinism
* [HIVE-1308] - <boolean> = <boolean> throws NPE
* [HIVE-1311] - bug is use of hadoop supports splittable
* [HIVE-1312] - hive trunk does not compile with hadoop 0.17 any more
* [HIVE-1315] - bucketed sort merge join breaks after dynamic partition insert
* [HIVE-1317] - CombineHiveInputFormat throws exception when partition name contains special characters to URI
* [HIVE-1320] - NPE with lineage in a query of union alls on joins.
* [HIVE-1321] - bugs with temp directories, trailing blank fields in HBase bulk load
* [HIVE-1322] - Cached FileSystem can lead to persistant IOExceptions
* [HIVE-1323] - leading dash in partition name is not handled properly
* [HIVE-1325] - dynamic partition insert should throw an exception if the number of target table columns + dynamic partition columns does not equal to the number of select columns
* [HIVE-1326] - RowContainer uses hard-coded '/tmp/' path for temporary files
* [HIVE-1327] - Group by partition column returns wrong results
* [HIVE-1330] - fatal error check omitted for reducer-side operators
* [HIVE-1331] - select * does not work if different partitions contain different formats
* [HIVE-1338] - Fix bin/ext/jar.sh to work with hadoop 0.20 and above
* [HIVE-1341] - Filter Operator Column Pruning should preserve the column order
* [HIVE-1345] - TypedBytesSerDe fails to create table with multiple columns.
* [HIVE-1350] - hive.query.id is not unique
* [HIVE-1352] - rcfilecat should use '\t' to separate columns and print '\r\n' at the end of each row.
* [HIVE-1353] - load_dyn_part*.q tests need ORDER BY for determinism
* [HIVE-1354] - partition level properties honored if it exists
* [HIVE-1364] - Increase the maximum length of various metastore fields, and remove TYPE_NAME from COLUMNS primary key
* [HIVE-1365] - Bug in SMBJoinOperator which may causes a final part of the results in some cases.
* [HIVE-1366] - inputFileFormat error if the merge job takes a different input file format than the default output file format
* [HIVE-1371] - remove blank in rcfilecat
* [HIVE-1373] - Missing connection pool plugin in Eclipse classpath
* [HIVE-1377] - getPartitionDescFromPath() in CombineHiveInputFormat should handle matching by path
* [HIVE-1388] - combinehiveinputformat does not work if files are of different types
* [HIVE-1403] - Reporting progress to JT during closing files in FileSinkOperator
* [HIVE-1407] - Add hadoop-*-tools.jar to Eclipse classpath
* [HIVE-1409] - File format information is retrieved from first partition
* [HIVE-1411] - DataNucleus throws NucleusException if core-3.1.1 JAR appears more than once on CLASSPATH
* [HIVE-1412] - CombineHiveInputFormat bug on tablesample
* [HIVE-1417] - Archived partitions throw error with queries calling getContentSummary
* [HIVE-1418] - column pruning not working with lateral view
* [HIVE-1420] - problem with sequence and rcfiles are mixed for null partitions
* [HIVE-1421] - problem with sequence and rcfiles are mixed for null partitions
* [HIVE-1425] - hive.task.progress should be added to conf/hive-default.xml
* [HIVE-1428] - ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
* [HIVE-1435] - Upgraded naming scheme causes JDO exceptions
* [HIVE-1448] - bug in 'set fileformat'
* [HIVE-1454] - insert overwrite and CTAS fail in hive local mode
* [HIVE-1455] - lateral view does not work with column pruning
* [HIVE-1492] - FileSinkOperator should remove duplicated files from the same task based on file sizes
* [HIVE-1524] - parallel execution failed if mapred.job.name is set
* [HIVE-1594] - Typo of hive.merge.size.smallfiles.avgsize prevents change of value
* [HIVE-1613] - hive --service jar looks for hadoop version but was not defined
* [HIVE-1615] - Web Interface JSP needs Refactoring for removed meta store methods
* [HIVE-1681] - ObjectStore.commitTransaction() does not properly handle transactions that have already been rolled back
* [HIVE-1697] - Migration scripts should increase size of PARAM_VALUE in PARTITION_PARAMS
** Improvement
* [HIVE-543] - provide option to run hive in local mode
* [HIVE-964] - handle skewed keys for a join in a separate job
* [HIVE-990] - Incorporate CheckStyle into Hive's build.xml
* [HIVE-1047] - Merge tasks in GenMRUnion1
* [HIVE-1068] - CREATE VIEW followup: add a "table type" enum attribute in metastore's MTable, and also null out irrelevant attributes for MTable instances which describe views
* [HIVE-1069] - CREATE VIEW followup: find and document current expected version of thrift, and regenerate code to match
* [HIVE-1093] - Add a "skew join map join size" variable to control the input size of skew join's following map join job.
* [HIVE-1102] - make number of concurrent tasks configurable
* [HIVE-1108] - QueryPlan to be independent from BaseSemanticAnalyzer
* [HIVE-1109] - Structured temporary directories
* [HIVE-1110] - add counters to show that skew join triggered
* [HIVE-1117] - Make QueryPlan serializable
* [HIVE-1118] - Add hive.merge.size.per.task to HiveConf
* [HIVE-1119] - Make all Tasks and Works serializable
* [HIVE-1120] - In ivy offline mode, don't delete downloaded jars
* [HIVE-1122] - Make ql/metadata/Table and Partition serializable
* [HIVE-1128] - Let max/min handle complex types like struct
* [HIVE-1136] - add type-checking setters for HiveConf class to match existing getters
* [HIVE-1144] - CREATE VIEW followup: support ALTER TABLE SET TBLPROPERTIES on views
* [HIVE-1150] - Add comment to explain why we check for dir first in add_partitions().
* [HIVE-1152] - Add metastore API method to drop partition / append partition by name
* [HIVE-1164] - drop_partition_by_name() should use drop_partition_common()
* [HIVE-1190] - Configure build to download Hadoop tarballs from Facebook mirror instead of Apache
* [HIVE-1198] - When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
* [HIVE-1212] - Explicitly say "Hive Internal Error" to ease debugging
* [HIVE-1216] - Show the row with error in mapper/reducer
* [HIVE-1220] - accept TBLPROPERTIES on CREATE TABLE/VIEW
* [HIVE-1228] - allow HBase key column to be anywhere in Hive table
* [HIVE-1241] - add pre-drops in bucketmapjoin*.q
* [HIVE-1244] - add backward-compatibility constructor to HiveMetaStoreClient
* [HIVE-1246] - mapjoin followed by another mapjoin should be performed in a single query
* [HIVE-1260] - from_unixtime should implment a overloading function to accept only bigint type
* [HIVE-1276] - optimize bucketing
* [HIVE-1295] - facilitate HBase bulk loads from Hive
* [HIVE-1296] - CLI set and set -v commands should dump properties in alphabetical order
* [HIVE-1297] - error message in Hive.checkPaths dumps Java array address instead of path string
* [HIVE-1300] - support: alter table touch partition
* [HIVE-1306] - cleanup the jobscratchdir
* [HIVE-1316] - Increase the memory limit for CLI client
* [HIVE-1328] - make mapred.input.dir.recursive work for select *
* [HIVE-1329] - for ALTER TABLE t SET TBLPROPERTIES ('EXTERNAL'='TRUE'), change TBL_TYPE attribute from MANAGED_TABLE to EXTERNAL_TABLE
* [HIVE-1335] - DataNucleus should use connection pooling
* [HIVE-1348] - Moving inputFileChanged() from ExecMapper to where it is needed
* [HIVE-1349] - Do not pull counters of non initialized jobs
* [HIVE-1355] - Hive should use NullOutputFormat for hadoop jobs
* [HIVE-1357] - CombineHiveInputSplit should initialize the inputFileFormat once for a single split
* [HIVE-1372] - New algorithm for variance() UDAF
* [HIVE-1383] - allow HBase WAL to be disabled
* [HIVE-1387] - Add PERCENTILE_APPROX which works with double data type
* [HIVE-1531] - Make Hive build work with Ivy versions < 2.1.0
* [HIVE-1543] - set abort in ExecMapper when Hive's record reader got an IOException
* [HIVE-1693] - Make the compile target depend on thrift.home
** Task
* [HIVE-1081] - Automated source code cleanup
* [HIVE-1084] - Cleanup Class names
* [HIVE-1103] - Add .gitignore file
* [HIVE-1104] - Suppress Checkstyle warnings for generated files
* [HIVE-1112] - Replace instances of StringBuffer/Vector with StringBuilder/ArrayList
* [HIVE-1123] - Checkstyle fixes
* [HIVE-1135] - Use Anakia for version controlled documentation
* [HIVE-1137] - build references IVY_HOME incorrectly
* [HIVE-1147] - Update Eclipse project configuration to match Checkstyle
* [HIVE-1163] - Eclipse launchtemplate changes to enable debugging
* [HIVE-1256] - fix Hive logo img tag to avoid stretching
* [HIVE-1427] - Provide metastore schema migration scripts (0.5 -> 0.6)
* [HIVE-1709] - Provide Postgres metastore schema migration scripts (0.5 -> 0.6)
* [HIVE-1725] - Include metastore upgrade scripts in release tarball
* [HIVE-1726] - Update README file for 0.6.0 release
* [HIVE-1729] - Satisfy ASF release management requirements
** Sub-task
* [HIVE-1340] - checking VOID type for NULL in LazyBinarySerde
** Test
* [HIVE-1188] - NPE when running TestJdbcDriver/TestHiveServer
* [HIVE-1236] - test HBase input format plus CombinedHiveInputFormat
* [HIVE-1279] - temporarily disable HBase test execution
* [HIVE-1359] - Unit test should be shim-aware