The dataset class provides methods, configured through project_policy.yaml, that manipulate the ingested data so that it is suitable for processing.
The following operations can be performed on the dataset columns:
one_hot_encode creates new column(s) with one-hot-encoded values. This is useful when a column contains more than two result types.
Under column, specify the column that is used as the source for creating the one-hot-encoded columns. Note that this source column is not updated.
Under values, list the values in that column for which new one-hot-encoded columns should be created. Note that each value is used as the name of its new column.
Be careful to avoid column name collisions.
Example:

    - one_hot_encode:
        - column: class
        - values:
            - Iris-setosa
            - Iris-versicolor
            - Iris-virginica
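To make the effect of this operation concrete, here is a minimal sketch in plain Python. It is not the dataset class's actual implementation: the function name, the list-of-dicts row format, the sample rows, and the choice to raise an error on a name collision are all assumptions made for illustration.

```python
def one_hot_encode(rows, column, values):
    """Add one 0/1 column per listed value; the source column is left as-is."""
    encoded = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for value in values:
            if value in row:
                # Assumption: this sketch treats a name clash as an error.
                raise ValueError(f"column name collision: {value!r}")
            # The value itself becomes the name of the new column.
            new_row[value] = 1 if row[column] == value else 0
        encoded.append(new_row)
    return encoded


# Made-up sample rows, for illustration only.
rows = [
    {"sepal_length": 5.1, "class": "Iris-setosa"},
    {"sepal_length": 7.0, "class": "Iris-versicolor"},
]
encoded = one_hot_encode(
    rows, "class", ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
)
```

Each row in encoded keeps its original class value and gains three new columns, exactly one of which is set to 1.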
set_output_columns sets which columns are used as output data from the dataset (i.e. which columns contain the expected answer(s)). Pass it a list of output column names.
Example:

    - set_output_columns:
        - Iris-setosa
        - Iris-versicolor
        - Iris-virginica
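As a rough illustration of what marking columns as output means, the sketch below separates the expected answers from the rest of each row. The function name, the row format, the sample rows, and the idea that every remaining column counts as input are assumptions for this example, not documented behavior of the dataset class.

```python
def split_outputs(rows, output_columns):
    """Separate the expected-answer columns from the remaining columns."""
    inputs, outputs = [], []
    for row in rows:
        # Output values are collected in the order the names were listed.
        outputs.append([row[name] for name in output_columns])
        # Assumption: everything not listed as output is kept as input.
        inputs.append({k: v for k, v in row.items() if k not in output_columns})
    return inputs, outputs


# Made-up sample rows, for illustration only.
rows = [
    {"sepal_length": 5.1, "Iris-setosa": 1, "Iris-versicolor": 0, "Iris-virginica": 0},
    {"sepal_length": 7.0, "Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 0},
]
inputs, outputs = split_outputs(
    rows, ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
)
```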