DGP Classification using Stochastic Imputation
This vignette demonstrates how to use the package for classification, applied to the popular iris data set (Anderson 1935).
We start by loading the required packages:
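Assuming this vignette belongs to the dgpsi package (as the functions used below suggest), the loading step would simply be:

```r
library(dgpsi)
```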
We then load the iris data set and apply min-max normalization to its four input variables:
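A sketch of this preprocessing step in base R (the exact variable names, and the convention of keeping the class labels as a one-column matrix, are assumptions):

```r
# Four numeric input variables, min-max scaled to [0, 1] column by column
X <- as.matrix(iris[, 1:4])
X <- apply(X, 2, function(x) (x - min(x)) / (max(x) - min(x)))
# Species labels kept as a one-column matrix for the classifier
Y <- as.matrix(iris[, 5])
```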
Before building the classifier, we set a seed with set_seed() from the package for reproducibility, and split the data into a training set and a testing set:
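The seeding and splitting could look like the following sketch; the seed value and the 120/30 split are assumptions, with the 30 held-out points matching the out-of-sample validation used later:

```r
set_seed(9999)  # seed value assumed; set_seed() is the package's reproducibility helper

n <- nrow(X)
test_idx <- sample(n, 30)  # hold out 30 points for out-of-sample testing
X_train <- X[-test_idx, , drop = FALSE]
Y_train <- Y[-test_idx, , drop = FALSE]
X_test  <- X[test_idx, , drop = FALSE]
Y_test  <- Y[test_idx, , drop = FALSE]
```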
We consider a three-layer DGP classifier, using a Matérn-2.5 kernel in the first layer and a squared exponential kernel in the second layer:
m_dgp <- dgp(X_train, Y_train, depth = 3, name = c('matern2.5', 'sexp'), likelihood = "Categorical")
## Auto-generating a 3-layered DGP structure ... done
## Initializing the DGP emulator ... done
## Training the DGP emulator:
## Iteration 500: Layer 3: 100%|██████████| 500/500 [00:31<00:00, 15.63it/s]
## Imputing ... done
We set likelihood = "Categorical" since the DGP classifier is essentially a DGP emulator with a categorical likelihood.
We are now ready to validate the classifier via validate() at 30 out-of-sample testing positions:
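A call along these lines would produce the output shown below (the x_test/y_test argument names follow the package's documented interface, but treat this as a sketch):

```r
# Out-of-sample (OOS) validation at the 30 held-out testing positions
m_dgp <- validate(m_dgp, x_test = X_test, y_test = Y_test)
```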
## Initializing the OOS ... done
## Calculating the OOS ... done
## Saving results to the slot 'oos' in the dgp object ... done
Finally, we visualize the OOS validation for the classifier:
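Assuming the standard plot() method for the emulator object, the default visualization would be generated by:

```r
# Default OOS validation plot: true labels vs. predicted label proportions
plot(m_dgp, x_test = X_test, y_test = Y_test)
```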
## Validating and computing ... done
## Post-processing OOS results ... done
## Plotting ... done
By default, plot() displays the true labels against the predicted label proportions at each input position. Alternatively, setting style = 2 in plot() generates a confusion matrix:
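The confusion-matrix view would then be produced by a call like the following (again a sketch, reusing the assumed testing-set names from above):

```r
# Confusion matrix of true vs. predicted classes at the OOS positions
plot(m_dgp, x_test = X_test, y_test = Y_test, style = 2)
```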
## Validating and computing ... done
## Post-processing OOS results ... done
## Plotting ... done