How do classifiers handle missing data in training?

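  For a decision tree, when a training example at node n is missing its value for attribute A, common strategies are:

  • Assign the most common value of A among the other examples sorted to node n.
  • Assign the most common value of A among the other examples at node n that have the same target value.
  • Assign a probability p_i to each possible value v_i of A, estimated from the observed frequencies at node n, and pass a fraction p_i of the example down the branch for v_i to each descendant in the tree. [1]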
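
The three strategies can be sketched in a few lines of Python. This is only an illustrative sketch, not any particular library's implementation: the toy `examples` list, the attribute name "A", the "label" key, and all three helper functions are hypothetical names made up for the illustration, and `None` is assumed to mark a missing value.

```python
from collections import Counter

# Hypothetical toy examples that were sorted to node n; None marks a missing value of A.
examples = [
    {"A": "sunny", "label": "no"},
    {"A": "sunny", "label": "no"},
    {"A": "rain", "label": "yes"},
    {"A": "rain", "label": "yes"},
    {"A": "rain", "label": "no"},
    {"A": None, "label": "no"},  # the example with the missing value
]


def most_common_value(examples, attr):
    """Strategy 1: most common observed value of attr among the examples at the node."""
    counts = Counter(e[attr] for e in examples if e[attr] is not None)
    return counts.most_common(1)[0][0]


def most_common_value_same_target(examples, attr, target, target_value):
    """Strategy 2: same as strategy 1, restricted to examples sharing the target value."""
    same_target = [e for e in examples if e[target] == target_value]
    return most_common_value(same_target, attr)


def fractional_weights(examples, attr):
    """Strategy 3: probability p_i for each value v_i, from observed frequencies.

    The example with the missing value is then sent down every branch,
    carrying weight p_i on the branch for v_i.
    """
    counts = Counter(e[attr] for e in examples if e[attr] is not None)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


fill_1 = most_common_value(examples, "A")
fill_2 = most_common_value_same_target(examples, "A", "label", "no")
weights = fractional_weights(examples, "A")
```

On this toy data the first strategy fills in "rain", the second fills in "sunny" (the most common value among the "no" examples), and the third splits the example into a 0.4 share for "sunny" and a 0.6 share for "rain".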


Reference: