Public API

fit_evotree

MLJModelInterface.fit — Function

fit(
    params::EvoTypes, 
    dtrain;
    target_name,
    feature_names=nothing,
    weight_name=nothing,
    offset_name=nothing,
    deval=nothing,
    print_every_n=9999,
    verbosity=1
)

Main training function. Performs model fitting given configuration params, dtrain, target_name and other optional kwargs.

Arguments

params::EvoTypes: configuration providing the hyper-parameters. It must be one of the EvoTypes model configurations.
dtrain: Tables-compatible training data (named tuples, DataFrame...) containing the features and target variables.

Keyword arguments

target_name: name of the target variable.
feature_names = nothing: the names of the dtrain variables to use as features. If not provided, it defaults to all variables that aren't one of target, weight or offset.
weight_name = nothing: name of the variable containing weights. If nothing, a weight of one is used for every observation.
offset_name = nothing: name of the offset variable.
deval = nothing: Tables-compatible evaluation data containing the features and target variables.
print_every_n: sets how often (in iterations) logging info is printed.
verbosity: set to 1 to print logging info during training.
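
As a rough illustration of the table-based method, the sketch below fits a small model on synthetic data held in a named tuple of columns. The column names x1, x2 and y, the EvoTreeRegressor configuration, and its nrounds and max_depth settings are assumptions made for this example and are not taken from the docstring above. fit and predict are called with the EvoTrees. prefix in case they are not exported.

```julia
using EvoTrees
using Random

Random.seed!(123)

# Synthetic column table (hypothetical column names x1, x2, y)
nobs = 200
x1 = rand(nobs)
x2 = rand(nobs)
dtrain = (x1 = x1, x2 = x2, y = 2 .* x1 .+ 0.1 .* randn(nobs))

# Assumption: EvoTreeRegressor is one of the EvoTypes configurations
config = EvoTreeRegressor(nrounds = 20, max_depth = 4)

# target_name is required; feature_names is spelled out here for clarity
model = EvoTrees.fit(
    config,
    dtrain;
    target_name = "y",
    feature_names = ["x1", "x2"],
    verbosity = 0,
)

# Predict on the same table the model was trained on
pred = EvoTrees.predict(model, dtrain)
```

Leaving out feature_names should select the same two columns here, since every variable other than the target would be treated as a feature.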

fit(
    params::EvoTypes{L};
    x_train::AbstractMatrix, 
    y_train::AbstractVector, 
    w_train=nothing, 
    offset_train=nothing,
    x_eval=nothing, 
    y_eval=nothing, 
    w_eval=nothing, 
    offset_eval=nothing,
    feature_names=nothing,
    early_stopping_rounds=9999,
    print_every_n=9999,
    verbosity=1)

Main training function. Performs model fitting given configuration params, x_train, y_train and other optional kwargs.

Arguments

params::EvoTypes: configuration providing the hyper-parameters. It must be one of the EvoTypes model configurations.

Keyword arguments

x_train::Matrix: training data of size [#observations, #features].
y_train::Vector: vector of train targets of length #observations.
w_train::Vector: vector of train weights of length #observations. If nothing, a vector of ones is assumed.
offset_train::VecOrMat: offset for the training data. Should match the size of the predictions.
x_eval::Matrix: evaluation data of size [#observations, #features].
y_eval::Vector: vector of evaluation targets of length #observations.
w_eval::Vector: vector of evaluation weights of length #observations. Defaults to nothing (assumes a vector of 1s).
offset_eval::VecOrMat: evaluation data offset. Should match the size of the predictions.
feature_names = nothing: the names of the x_train features. If provided, should be a vector of strings with length(feature_names) == size(x_train, 2).
print_every_n: sets how often (in iterations) logging info is printed.
verbosity: set to 1 to print logging info during training.
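
A minimal sketch of the matrix-based method follows, assuming an EvoTreeRegressor configuration and synthetic data. The feature names f1, f2, f3 and the hyper-parameter values are made up for illustration. An evaluation set is passed through x_eval and y_eval so that it can be tracked during training.

```julia
using EvoTrees
using Random

Random.seed!(123)

# Synthetic train and evaluation sets of size [#observations, #features]
x_train = rand(200, 3)
y_train = x_train[:, 1] .+ 0.1 .* randn(200)
x_eval = rand(50, 3)
y_eval = x_eval[:, 1] .+ 0.1 .* randn(50)

# Assumption: EvoTreeRegressor is one of the EvoTypes configurations
config = EvoTreeRegressor(nrounds = 20, max_depth = 4)

model = EvoTrees.fit(
    config;
    x_train,
    y_train,
    x_eval,
    y_eval,
    feature_names = ["f1", "f2", "f3"],  # hypothetical names, one per column of x_train
    verbosity = 0,
)

# Predict on the held-out evaluation matrix
pred_eval = EvoTrees.predict(model, x_eval)
```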

predict

MLJModelInterface.predict — Function

predict(m::EvoTree, data; ntree_limit=length(m.trees), device=:cpu)

Predictions from an EvoTree model - sums the predictions from all trees composing the model. Use ntree_limit=N to only predict with the first N trees.
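
The effect of ntree_limit can be sketched as below, assuming a model fitted with an EvoTreeRegressor configuration on synthetic data (both are assumptions of this example). The first call uses every tree; the second restricts the sum to the first 5 trees.

```julia
using EvoTrees
using Random

Random.seed!(123)

x = rand(200, 3)
y = x[:, 1] .+ 0.1 .* randn(200)

# Assumption: EvoTreeRegressor is one of the EvoTypes configurations
config = EvoTreeRegressor(nrounds = 20, max_depth = 4)
model = EvoTrees.fit(config; x_train = x, y_train = y, verbosity = 0)

# Default: ntree_limit = length(model.trees), so all trees contribute
pred_all = EvoTrees.predict(model, x)

# Only the first 5 trees contribute to the prediction
pred_first5 = EvoTrees.predict(model, x; ntree_limit = 5)
```

Comparing the two results against y is one way to see how much the later trees refine the fit.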

shap

EvoTrees.Shap.shap — Function

shap(m::EvoTree, data; ntree_limit=length(m.trees))

Returns the SHAP effects as a Matrix of size [nobs, features].

It is based on an implementation of Linear TreeShap by Yu et al. (2022), which computes exact Shapley values for decision trees in O(LD) time, where L is the number of leaves and D the maximum depth of a tree. It was originally ported from this repo.
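
For reference, the quantity being computed is the standard Shapley value from cooperative game theory. The notation below is introduced here and is not taken from the docstring: F is the set of features and v(S) is the expected model output when only the features in S are known. The attribution of feature j is then:

```latex
\phi_j = \sum_{S \subseteq F \setminus \{j\}}
    \frac{|S|!\,\bigl(|F| - |S| - 1\bigr)!}{|F|!}
    \Bigl[ v\bigl(S \cup \{j\}\bigr) - v(S) \Bigr]
```

Evaluating this sum directly takes time exponential in the number of features, which is why tree-specific algorithms such as Linear TreeShap are used instead.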

References

Peng Yu, Chao Xu, Albert Bifet, Jesse Read. Linear Tree Shap (2022). In Proceedings of the 36th Conference on Neural Information Processing Systems.

importance

EvoTrees.importance — Function

importance(model::EvoTree; feature_names=model.info[:feature_names])

Sorted, normalized feature importance based on loss function gain. Feature names associated with the model are stored in model.info[:feature_names] as a string Vector and can be updated at any time, e.g. model.info[:feature_names] = new_feature_names_vec.
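
A short usage sketch, assuming a model fitted with an EvoTreeRegressor configuration on synthetic data. The labels age, income and noise are hypothetical and are passed through the feature_names keyword shown in the signature above. The exact return type is not stated in the docstring, so the result is only iterated and printed here.

```julia
using EvoTrees
using Random

Random.seed!(123)

x = rand(200, 3)
y = 2 .* x[:, 1] .+ x[:, 2] .+ 0.1 .* randn(200)

# Assumption: EvoTreeRegressor is one of the EvoTypes configurations
config = EvoTreeRegressor(nrounds = 20, max_depth = 4)
model = EvoTrees.fit(config; x_train = x, y_train = y, verbosity = 0)

# Importance using the names stored in model.info[:feature_names]
imp = EvoTrees.importance(model)

# Same computation, reported under hypothetical display names
imp_named = EvoTrees.importance(model; feature_names = ["age", "income", "noise"])

for item in imp_named
    println(item)
end
```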
