| | Number of probes | Number of genes | R2 | RMSE |
|---|---|---|---|---|
| Random Forest procedure | | | | |
| FCR | 604 | 411 | 0.42 | 0.366 |
| | 100 | 58 | 0.62 | 0.301 |
| | 50 | 30 | 0.65 | 0.293 |
| | 25 | 17 | 0.67 | 0.281 |
| | 10 | 8 | 0.68 | 0.278 |
| Gradient Tree Boosting | | | | |
| FCR | 728 | 477 | 0.78 | 0.241 |
| | 100 | 56 | 0.79 | 0.235 |
| | 50 | 27 | 0.80 | 0.234 |
| | 25 | 12 | 0.81 | 0.229 |
| | 10 | 5 | 0.80 | 0.223 |
- Random forest (RF) or gradient tree boosting (GTB) algorithms were applied to a transcriptomic dataset containing 26,687 molecular probes measured in whole blood sampled from 148 pigs. The dataset was split into training (n = 74) and validation (n = 74) subsets to evaluate model performance in predicting food conversion ratio (FCR). The first rounds led to model stabilization with 604 molecular probes identified as very important variables (VIP) for FCR prediction using RF, and 728 probes using GTB, out of the 26,687 expressed annotated probes. The second entry was an iteration of the same procedure, using the VIP identified in the first step as the new inputs. This increased prediction accuracy, as evaluated by the root mean square error (RMSE) and the coefficient of determination (R2). Further iterative steps were then performed. The numbers of annotated probes and their corresponding unique genes identified as VIP are indicated at each step. The iterative models were almost equivalent in performance, so the ones including 27–30 unique genes were retained. Models obtained with GTB performed better than those obtained with RF.
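The two evaluation metrics and the iterative reduction of the probe set can be sketched as follows. This is only an illustrative outline in plain Python, not the authors' implementation: the function names (`rmse`, `r2`, `iterative_selection`) and the `importance` callable are hypothetical, the step sizes mirror the 100, 50, 25 and 10 probe rows of the table, and R2 is assumed here to be computed as 1 minus the ratio of residual to total sum of squares, which the legend does not specify.

```python
import math
from typing import Callable, Dict, List, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def iterative_selection(
    features: Sequence[str],
    importance: Callable[[List[str]], Dict[str, float]],
    sizes: Sequence[int] = (100, 50, 25, 10),
) -> List[List[str]]:
    """Repeatedly re-rank the current feature subset and keep the top k.

    `importance` is a hypothetical callable: given the current feature
    names, it refits a model (for example RF or GTB) on the training
    subset and returns one importance score per feature.
    """
    steps: List[List[str]] = []
    current = list(features)
    for k in sizes:
        scores = importance(current)  # refit on the reduced input set
        ranked = sorted(current, key=lambda f: scores[f], reverse=True)
        current = ranked[:k]  # the next round uses only these features
        steps.append(current)
    return steps
```

In such a setup, each subset returned by `iterative_selection` would be used to refit the model on the training animals, with `rmse` and `r2` then computed on the held-out validation animals.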