Command-line output:
====================

Decision tree

  Train and evaluate using a decision tree.  Given a dataset containing numeric
  features and associated labels for each point in the dataset, this program can
  train a decision tree on that data.

  The training file and associated labels are specified with the
  '--training_file' and '--labels_file' parameters, respectively.  The labels
  should be in the range [0, num_classes - 1]. Optionally, if '--labels_file' is
  not specified, the labels are assumed to be the last dimension of the training
  dataset.
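
A minimal sketch of the label convention described above, in plain Python rather than mlpack itself. It assumes each row holds one point with its label as the last value; the helper names `split_last_column` and `remap_labels` are hypothetical and not part of mlpack.

```python
def split_last_column(rows):
    """Separate each row into features and a label taken from the last value."""
    features = [row[:-1] for row in rows]
    labels = [row[-1] for row in rows]
    return features, labels


def remap_labels(labels):
    """Map arbitrary label values onto 0 .. num_classes - 1, in sorted order."""
    classes = sorted(set(labels))
    index = {value: i for i, value in enumerate(classes)}
    return [index[value] for value in labels], classes


# Illustrative data only: two features followed by a raw class label.
rows = [
    [5.1, 3.5, 2],
    [4.9, 3.0, 2],
    [6.2, 2.9, 7],
    [5.9, 3.2, 4],
]
features, raw_labels = split_last_column(rows)
labels, classes = remap_labels(raw_labels)
```

Here the raw labels 2, 4, and 7 become 0, 1, and 2, so every label lies in [0, num_classes - 1] as the program expects.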

  When a model is trained, the '--output_model_file' output parameter may be
  used to save the trained model.  A model may be loaded for predictions with
  the '--input_model_file' parameter.  The '--input_model_file' parameter may
  not be specified when the '--training_file' parameter is specified.  The
  '--minimum_leaf_size' parameter specifies the minimum number of training
  points that must fall into each leaf for it to be split.  If
  '--print_training_error' is specified, the training error will be printed.

  Test data may be specified with the '--test_file' parameter, and if
  performance numbers are desired for that test set, labels may be specified
  with the '--test_labels_file' parameter.  Predictions for each test point may
  be saved via the '--predictions_file' output parameter.  Class probabilities
  for each prediction may be saved with the '--probabilities_file' output
  parameter.

  For example, to train a decision tree with a minimum leaf size of 20 on the
  dataset contained in 'data.csv' with labels 'labels.csv', saving the output
  model to 'tree.bin' and printing the training error, one could call

  $ decision_tree --training_file data.csv --labels_file labels.csv \
    --output_model_file tree.bin --minimum_leaf_size 20 --print_training_error

  Then, to use that model to classify points in 'test_set.csv' and print the
  test error given the labels 'test_labels.csv', while saving the predictions
  for each point to 'predictions.csv', one could call

  $ decision_tree --input_model_file tree.bin --test_file test_set.csv \
    --test_labels_file test_labels.csv --predictions_file predictions.csv
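
The help text does not say exactly how the test error is computed or printed. A reasonable reading is the fraction of misclassified test points, which the sketch below computes by hand. It assumes the predictions and test labels have already been read into two equal-length lists; the function name `classification_error` is hypothetical and not part of mlpack.

```python
def classification_error(predictions, labels):
    """Return the fraction of points whose predicted class differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one test point is required")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


# Illustrative values only, not output from a real model.
predictions = [0, 1, 1, 2, 0]
test_labels = [0, 1, 2, 2, 1]
error = classification_error(predictions, test_labels)
print(f"test error: {error:.2f}")
```

Accuracy under the same assumption is simply one minus this value.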

Optional input options:

  --help (-h) [bool]            Default help info.
  --info [string]               Get help on a specific module or option.
                                Default value ''.
  --input_model_file (-m) [string]
                                Pre-trained decision tree, to be used with test
                                points.  Default value ''.
  --labels_file (-l) [string]   Training labels.  Default value ''.
  --minimum_leaf_size (-n) [int]
                                Minimum number of points in a leaf.  Default
                                value 20.
  --print_training_error (-e) [bool]
                                Print the training error.
  --test_file (-T) [string]     Matrix of test points.  Default value ''.
  --test_labels_file (-L) [string]
                                Test point labels, if accuracy calculation is
                                desired.  Default value ''.
  --training_file (-t) [string]
                                Matrix of training points.  Default value ''.
  --verbose (-v) [bool]         Display informational messages and the full list
                                of parameters and timers at the end of
                                execution.
  --version (-V) [bool]         Display the version of mlpack.

Optional output options:

  --output_model_file (-M) [string]
                                Output for trained decision tree.  Default value
                                ''.
  --predictions_file (-p) [string]
                                Class predictions for each test point.  Default
                                value ''.
  --probabilities_file (-P) [string]
                                Class probabilities for each test point.
                                Default value ''.

For further information, including relevant papers, citations, and theory,
consult the documentation found at http://www.mlpack.org or included with your
distribution of mlpack.

==========================

Python binding output:
======================

>>> help(decision_tree)
Help on built-in function decision_tree in module mlpack.decision_tree:

decision_tree(...)
    Decision tree

    Train and evaluate using a decision tree.  Given a dataset containing numeric
    features and associated labels for each point in the dataset, this program can
    train a decision tree on that data.

    The training data and associated labels are specified with the 'training' and
    'labels' parameters, respectively.  The labels should be in the range [0,
    num_classes - 1]. Optionally, if 'labels' is not specified, the labels are
    assumed to be the last dimension of the training dataset.

    When a model is trained, the 'output_model' output parameter may be used to
    save the trained model.  A model may be loaded for predictions with the
    'input_model' parameter.  The 'input_model' parameter may not be specified
    when the 'training' parameter is specified.  The 'minimum_leaf_size' parameter
    specifies the minimum number of training points that must fall into each leaf
    for it to be split.  If 'print_training_error' is specified, the training
    error will be printed.

    Test data may be specified with the 'test' parameter, and if performance
    numbers are desired for that test set, labels may be specified with the
    'test_labels' parameter.  Predictions for each test point may be saved via the
    'predictions' output parameter.  Class probabilities for each prediction may
    be saved with the 'probabilities' output parameter.
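
To illustrate how these two outputs typically relate, the predicted class for a point is usually the class with the highest probability. The sketch below assumes one list of class probabilities per test point; the actual layout of the 'probabilities' matrix is not specified in this help text, and `predict_from_probabilities` is a hypothetical helper, not an mlpack function.

```python
def predict_from_probabilities(probabilities):
    """For each point, return the index of the class with the largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


# Illustrative values only: three points, three classes.
probabilities = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.25, 0.5, 0.25],
]
predicted = predict_from_probabilities(probabilities)
```

Ties are resolved in favor of the lowest class index in this sketch; whether mlpack does the same is not stated here.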

    For example, to train a decision tree with a minimum leaf size of 20 on the
    dataset contained in 'data' with labels 'labels', saving the output model to
    'tree' and printing the training error, one could call

    >>> output = decision_tree(training=data, labels=labels,
    ...     minimum_leaf_size=20, print_training_error=True)
    >>> tree = output['output_model']

    Then, to use that model to classify points in 'test_set' and print the test
    error given the labels 'test_labels', while saving the predictions for each
    point to 'predictions', one could call

    >>> output = decision_tree(input_model=tree, test=test_set,
    ...     test_labels=test_labels)
    >>> predictions = output['predictions']


    Parameters:

     - input_model (DecisionTreeModelType): Pre-trained decision tree, to be
          used with test points.
     - labels (row vector): Training labels.
     - minimum_leaf_size (int): Minimum number of points in a leaf.
     - print_training_error (bool): Print the training error.
     - test (matrix): Matrix of test points.
     - test_labels (row vector): Test point labels, if accuracy calculation
          is desired.
     - training (matrix): Matrix of training points.