1. Bootstrapped dataset: randomly select rows with replacement
2. Create a tree, considering only a randomly selected subset of columns at each node
   1. Number of variables to select: try the square root of the number of variables
3. Track the out-of-bag error (the error on rows left out of each tree's bootstrap sample)
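
The steps above can be sketched in plain Python. This is a minimal illustration, not a full random forest: it skips tree fitting and assumes the per-tree predictions are already available. The function names (`bootstrap_indices`, `candidate_features`, `oob_error`) and the toy inputs are made up for this example.

```python
import math
import random
from collections import Counter


def bootstrap_indices(n_rows, rng):
    """Step 1: draw n_rows row indices with replacement.

    Returns the in-bag indices and the out-of-bag (never drawn) indices.
    """
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_rows) if i not in chosen]
    return in_bag, out_of_bag


def candidate_features(n_features, rng):
    """Step 2: pick about sqrt(n_features) distinct columns for one node."""
    k = max(1, round(math.sqrt(n_features)))
    return sorted(rng.sample(range(n_features), k))


def oob_error(y_true, tree_predictions, oob_sets):
    """Step 3: misclassification rate using only out-of-bag votes.

    tree_predictions[t][i] is tree t's predicted label for sample i (assumed given).
    oob_sets[t] is the set of samples that tree t did not train on.
    """
    wrong = 0
    scored = 0
    for i, label in enumerate(y_true):
        votes = [preds[i] for preds, oob in zip(tree_predictions, oob_sets) if i in oob]
        if not votes:
            continue  # sample was in every bag, so it cannot be scored
        scored += 1
        majority = Counter(votes).most_common(1)[0][0]
        wrong += majority != label
    return wrong / scored if scored else float("nan")


rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, out_of_bag = bootstrap_indices(10, rng)
features = candidate_features(16, rng)  # 4 of 16 columns

# Toy, hand-written predictions for two hypothetical trees.
err = oob_error(
    y_true=[0, 1, 1],
    tree_predictions=[[0, 1, 1], [0, 0, 1]],
    oob_sets=[{0, 2}, {1}],
)
```

Because rows are drawn with replacement, some rows appear more than once in `in_bag` and others not at all; those left-out rows are what each tree is scored on.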

Proximity matrix

In the proximity matrix, higher values indicate greater similarity between pairs of samples. Entry (i, j) is typically the fraction of trees in which samples i and j end up in the same leaf node, so samples that are frequently grouped together in the same leaves across the forest have higher proximity values.
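
A small sketch of how such a matrix could be computed, assuming the leaf reached by each sample in each tree is already known. The function name `proximity_matrix` and the `leaf_ids` values are hypothetical; with scikit-learn, the leaf indices could come from `RandomForestClassifier.apply`, which returns them as samples by trees (the transpose of the layout used here).

```python
def proximity_matrix(leaf_ids):
    """Build a proximity matrix from leaf assignments.

    leaf_ids[t][i] is the leaf that sample i reaches in tree t.
    Entry [i][j] of the result is the fraction of trees in which
    samples i and j share a leaf.
    """
    n_trees = len(leaf_ids)
    n_samples = len(leaf_ids[0])
    counts = [[0] * n_samples for _ in range(n_samples)]
    for leaves in leaf_ids:
        for i in range(n_samples):
            for j in range(n_samples):
                if leaves[i] == leaves[j]:
                    counts[i][j] += 1
    return [[c / n_trees for c in row] for row in counts]


# Toy leaf assignments: 4 trees, 3 samples.
leaf_ids = [
    [0, 0, 1],
    [2, 2, 2],
    [5, 4, 4],
    [1, 1, 3],
]
prox = proximity_matrix(leaf_ids)
# Samples 0 and 1 share a leaf in 3 of 4 trees, so prox[0][1] is 0.75.
```

The matrix is symmetric with ones on the diagonal, since every sample always shares a leaf with itself.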