bootstrap | Whether bootstrap samples are used when building trees. If False, the
whole dataset is used to build each tree | default: "False" |
ccp_alpha | Complexity parameter used for Minimal Cost-Complexity Pruning. The
subtree with the largest cost complexity that is smaller than
``ccp_alpha`` will be chosen. By default, no pruning is performed. See
:ref:`minimal_cost_complexity_pruning` for details
.. versionadded:: 0.22 | default: 0.0 |
criterion | | default: "friedman_mse" |
max_depth | The maximum depth of the tree. If None, then nodes are expanded until
all leaves are pure or until all leaves contain less than
min_samples_split samples | default: null |
max_features | | default: 0.4555058146520773 |
max_leaf_nodes | Grow trees with ``max_leaf_nodes`` in best-first fashion
Best nodes are defined as relative reduction in impurity
If None then unlimited number of leaf nodes | default: null |
max_samples | If bootstrap is True, the number of samples to draw from X
to train each base estimator
- If None (default), then draw `X.shape[0]` samples
- If int, then draw `max_samples` samples
- If float, then draw `max_samples * X.shape[0]` samples. Thus,
`max_samples` should be in the interval `(0.0, 1.0]`
.. versionadded:: 0.22 | default: null |
min_impurity_decrease | A node will be split if this split induces a decrease of the impurity
greater than or equal to this value
The weighted impurity decrease equation is the following::
N_t / N * (impurity - N_t_R / N_t * right_impurity
- N_t_L / N_t * left_impurity)
where ``N`` is the total number of samples, ``N_t`` is the number of
samples at the current node, ``N_t_L`` is the number of samples in the
left child, and ``N_t_R`` is the number of samples in the right child
``N``, ``N_t``, ``N_t_R`` and ``N_t_L`` all refer to the weighted sum,
if ``sample_weight`` is passed
.. versionadded:: 0.19 | default: 0.0 |
min_samples_leaf | The minimum number of samples required to be at a leaf node
A split point at any depth will only be considered if it leaves at
least ``min_samples_leaf`` training samples in each of the left and
right branches. This may have the effect of smoothing the model,
especially in regression
- If int, then consider `min_samples_leaf` as the minimum number
- If float, then `min_samples_leaf` is a fraction and
`ceil(min_samples_leaf * n_samples)` are the minimum
number of samples for each node
.. versionchanged:: 0.18
Added float values for fractions | default: 14 |
min_samples_split | The minimum number of samples required to split an internal node:
- If int, then consider `min_samples_split` as the minimum number
- If float, then `min_samples_split` is a fraction and
`ceil(min_samples_split * n_samples)` are the minimum
number of samples for each split
.. versionchanged:: 0.18
Added float values for fractions | default: 17 |
min_weight_fraction_leaf | The minimum weighted fraction of the sum total of weights (of all
the input samples) required to be at a leaf node. Samples have
equal weight when sample_weight is not provided
max_features : {"auto", "sqrt", "log2"}, int or float, default="auto"
The number of features to consider when looking for the best split:
- If int, then consider `max_features` features at each split
- If float, then `max_features` is a fraction and
`round(max_features * n_features)` features are considered at each
split
- If "auto", then `max_features=n_features`
- If "sqrt", then `max_features=sqrt(n_features)`
- If "log2", then `max_features=log2(n_features)`
- If None, then `max_features=n_features`
Note: the search for a split does not stop until at least one
valid partition of the node samples is found, even if it requires to
effectively inspect more than ``max_features`` features | default: 0.0 |
n_estimators | The number of trees in the forest
.. versionchanged:: 0.22
The default value of ``n_estimators`` changed from 10 to 100
in 0.22
criterion : {"squared_error", "absolute_error", "poisson"}, default="squared_error"
The function to measure the quality of a split. Supported criteria
are "squared_error" for the mean squared error, which is equal to
variance reduction as feature selection criterion, "absolute_error"
for the mean absolute error, and "poisson" which uses reduction in
Poisson deviance to find splits
Training using "absolute_error" is significantly slower
than when using "squared_error"
.. versionadded:: 0.18
Mean Absolute Error (MAE) criterion
.. versionadded:: 1.0
Poisson criterion
.. deprecated:: 1.0
Criterion "mse" was deprecated in v1.0 and will be removed in
version 1.2. Use `criterion="squared_error"` which is equivalent
.. deprecated:: 1.0
Criterion "mae" was deprecated in v1.... | default: 100 |
n_jobs | The number of jobs to run in parallel. :meth:`fit`, :meth:`predict`,
:meth:`decision_path` and :meth:`apply` are all parallelized over the
trees. ``None`` means 1 unless in a :obj:`joblib.parallel_backend`
context. ``-1`` means using all processors. See :term:`Glossary
` for more details | default: null |
oob_score | Whether to use out-of-bag samples to estimate the generalization score
Only available if bootstrap=True | default: false |
random_state | Controls both the randomness of the bootstrapping of the samples used
when building trees (if ``bootstrap=True``) and the sampling of the
features to consider when looking for the best split at each node
(if ``max_features < n_features``)
See :term:`Glossary ` for details | default: null |
verbose | Controls the verbosity when fitting and predicting | default: 0 |
warm_start | When set to ``True``, reuse the solution of the previous call to fit
and add more estimators to the ensemble, otherwise, just fit a whole
new forest. See :term:`the Glossary ` | default: false |