To do this, scientists first feed a subset of patient data to their algorithm and adjust it until it reliably distinguishes patients from healthy control subjects or, in the case of treatment outcomes, responders from non-responders. They can then figure out which features in the data best help the computer “learn,” make sure that their algorithm only incorporates those features, and validate their method by testing how accurately it predicts outcomes for the rest of the patients, whose data it has not yet seen.
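
For readers who want to see the train, select, and validate steps concretely, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the subjects are synthetic, the function names (`make_synthetic_subjects`, `select_features`, `run_pipeline`) are hypothetical, and the nearest-centroid classifier is a simple stand-in rather than the method any particular study used. The point it illustrates is that features are chosen and the model is fitted using the training subset only, and accuracy is measured on subjects the algorithm never saw.

```python
import random
from statistics import mean


def make_synthetic_subjects(n, n_features, informative, shift, rng):
    """Hypothetical data: label 1 = patient, label 0 = healthy control."""
    subjects = []
    for i in range(n):
        label = i % 2
        features = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            # Only the informative features differ between the two groups.
            features[j] += shift * label
        subjects.append((features, label))
    rng.shuffle(subjects)
    return subjects


def class_means(subjects, feature_ids):
    """Mean of each chosen feature, computed separately for each label."""
    means = {}
    for label in (0, 1):
        rows = [f for f, y in subjects if y == label]
        means[label] = [mean(r[j] for r in rows) for j in feature_ids]
    return means


def select_features(train, n_features, k):
    """Keep the k features whose group means differ most in the training set."""
    all_ids = list(range(n_features))
    m = class_means(train, all_ids)
    gaps = [abs(m[1][j] - m[0][j]) for j in all_ids]
    return sorted(sorted(all_ids, key=lambda j: gaps[j], reverse=True)[:k])


def predict(features, centroids, feature_ids):
    """Assign the label whose training centroid is closest."""
    def sq_dist(label):
        return sum((features[j] - c) ** 2
                   for j, c in zip(feature_ids, centroids[label]))
    return min((0, 1), key=sq_dist)


def run_pipeline(seed=0):
    rng = random.Random(seed)
    subjects = make_synthetic_subjects(200, 5, informative=(0, 1),
                                       shift=2.0, rng=rng)
    # Training subset vs. the "rest of the patients" held back for validation.
    train, held_out = subjects[:140], subjects[140:]
    selected = select_features(train, n_features=5, k=2)
    centroids = class_means(train, selected)
    correct = sum(predict(f, centroids, selected) == y for f, y in held_out)
    return selected, correct / len(held_out)


selected, accuracy = run_pipeline()
print("selected features:", selected)
print("held-out accuracy:", round(accuracy, 2))
```

Selecting features from the training subset alone matters: if the held-out subjects influenced which features were kept, the reported accuracy would overstate how well the method predicts genuinely new patients.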