Release Notes - Apache Hivemall - Version 0.6.0
New Feature
Bug
- [HIVEMALL-165] - array_remove UDF throws exception when the first argument is null
- [HIVEMALL-232] - [DOC] Fix typo in the Top-K document
- [HIVEMALL-235] - Fix [Double|Int|Float|Long]ArrayList bug in expansion where size=0
- [HIVEMALL-236] - to_json causes KryoException/NullPointerException with ArrayList due to a Kryo bug
- [HIVEMALL-238] - from_json UDF does not work for top-level map object
- [HIVEMALL-259] - [BUG] feature_binning does not work properly under certain conditions
- [HIVEMALL-260] - Remove dependency on the Scala library in the xgboost classifier
- [HIVEMALL-268] - train_ffm(features,label) causes NPE while train_ffm(features,label,'') works fine
- [HIVEMALL-274] - Wrong target variable name in the step-by-step tutorial
Improvement
- [HIVEMALL-43] - [MIXSERV][Umbrella] ProtocolBuffer-based MixMessage serialization
- [HIVEMALL-107] - Move `spark-shell` into Docker
- [HIVEMALL-121] - Add `-libsvm` formatting option for `feature_hashing`
- [HIVEMALL-178] - NaN/missing/null value handling in RandomForest
- [HIVEMALL-200] - Add an option to return PartOfSpeech in tokenize_ja and tokenize_cn
- [HIVEMALL-226] - Move hivemall.fm and hivemall.mf packages under hivemall.factorization
- [HIVEMALL-230] - Revise Optimizer Implementation
- [HIVEMALL-233] - RandomForest regressor accepts sparse vector input
- [HIVEMALL-234] - Fix mismatched default value between EtaEstimator and UDF option description
- [HIVEMALL-245] - Refactor RandomForest for Sparse Data handling
- [HIVEMALL-246] - Add validation for feature UDF
- [HIVEMALL-249] - Fix fmeasure UDAF to support any integers
- [HIVEMALL-251] - Add option to return PartOfSpeech information for tokenize_ja
- [HIVEMALL-258] - Add UDF to convert feature/label into LIBSVM format
- [HIVEMALL-271] - Make xgboost hyperparameters configurable
- [HIVEMALL-272] - Refine xgboost_predict implementation
- [HIVEMALL-273] - Support xgboost v0.90
- [HIVEMALL-275] - Fix xgboost module to create DMatrix from CSRMatrix
- [HIVEMALL-278] - Bump up matrix4j dependencies to v0.9.1
- [HIVEMALL-279] - Support xgboost v0.90 hyperparameters
Task
Sub-task
- [HIVEMALL-26] - Add documentation about Hivemall on Apache Spark
- [HIVEMALL-27] - Add documentation about XGBoost support in v0.6.0
- [HIVEMALL-56] - Add documentation about Similarity/Distance functions
- [HIVEMALL-158] - Refine deprecated user guide contents
- [HIVEMALL-159] - Add documentation about One-hot encoding
- [HIVEMALL-250] - Add documentation about binarize_label UDTF