We use 70% of this reduced data set to train a logistic regression model that predicts each borrower's probability of default from the following features: loan amount (loan_amnt), monthly installment (installment), annual income (annual_inc), debt-to-income ratio (dti), revolving balance (revol_bal), incidences of delinquency (delinq_2yrs), number of open credit lines (open_acc), number of derogatory public records (pub_rec), upper bound of the FICO score range (fico_range_high), lower bound of the FICO score range (fico_range_low), revolving line utilization rate (revol_util), and months of credit history (cr_hist). The model is used by a system that denies credit to loan applicants whose predicted probability of default exceeds 20%. We then use the system to decide which of the held-out 30% of loans should be approved.
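The pipeline described above (70/30 split, logistic regression, denial above a 20% predicted default probability) can be sketched as follows. This is a minimal, dependency-free illustration and not the implementation used here: the data are synthetic stand-ins generated by the hypothetical helper `make_synthetic_loans`, the features are assumed to be standardized, and the model is fit by plain batch gradient descent with arbitrarily chosen `lr` and `epochs`, since the text does not specify an optimizer or library.

```python
import math
import random

FEATURES = [
    "loan_amnt", "installment", "annual_inc", "dti", "revol_bal",
    "delinq_2yrs", "open_acc", "pub_rec", "fico_range_high",
    "fico_range_low", "revol_util", "cr_hist",
]
DENY_THRESHOLD = 0.20  # deny credit when predicted P(default) exceeds 20%


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, row):
    """Predicted probability of default for one feature vector."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_logistic(X, y, lr=0.1, epochs=150):
    """Fit weights by batch gradient descent on the mean log-loss.

    Assumption: lr and epochs are illustrative choices, not tuned values.
    """
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = predict_proba(weights, bias, row) - label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def make_synthetic_loans(n, rng):
    """Hypothetical stand-in data: standardized features and default
    labels (1 = default) drawn from a made-up logistic model."""
    true_w = [rng.uniform(-1, 1) for _ in FEATURES]
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in FEATURES]
        p = sigmoid(-1.5 + sum(w * x for w, x in zip(true_w, row)))
        X.append(row)
        y.append(1 if rng.random() < p else 0)
    return X, y


rng = random.Random(0)
X, y = make_synthetic_loans(1000, rng)

# Shuffle, then hold out 30% of the loans for the approval decisions.
indices = list(range(len(X)))
rng.shuffle(indices)
cut = len(indices) * 7 // 10
train_idx, test_idx = indices[:cut], indices[cut:]

weights, bias = fit_logistic(
    [X[i] for i in train_idx], [y[i] for i in train_idx]
)

# Apply the decision rule to each held-out loan.
decisions = [
    "deny" if predict_proba(weights, bias, X[i]) > DENY_THRESHOLD else "approve"
    for i in test_idx
]
print(decisions.count("approve"), "approved of", len(decisions))
```

With real loan data, the rows of `X` would hold the twelve listed features in the order given by `FEATURES`, and an established library implementation of logistic regression would normally replace `fit_logistic`.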