Dennis Bahler and B. Stone, Neural Models and Extracted Rules for Knowledge Discovery in Predictive Toxicology.

We are using neural networks as a tool for predicting chemically induced carcinogenesis in rodents by training on data derived from a long series of expensive and time-consuming animal tests. Neural networks have been shown to be a capable model for this task, providing results as good as or better than those of other approaches to the same problem. A new approach to relevant feature subset selection is presented which uses the connection weights of a trained network to assign relevance weights to the attributes; a threshold is then determined by hill climbing. Our Single Hidden Unit Method is shown to provide good results in reasonable time compared with other feature selection methods. Once a network is trained, its weight matrix is pruned in anticipation of rule extraction. Our iterative method is shown to be capable of pruning roughly three-fourths of the connections while improving accuracy. Finally, rule extraction is investigated as a means for networks to explain themselves. A brute force approach to rule extraction, in which all possible inputs are listed as rules and the rules are then collapsed to M-of-N rules, is shown to build a reasonably small rule set that suffers only a small drop in accuracy relative to the neural network. An algorithm is presented for the brute force approach which allows it to finish in reasonable time. The set of 22 M-of-N rules so derived is readable and useful for describing the knowledge learned by the network in terms that humans can understand. By applying these new tools to the field of predictive toxicology, a network is trained that is estimated to have good predictive accuracy relative to other efforts in this field. In addition, the results from feature selection and the extracted rules provide new information to predictive toxicologists that is interesting because of the new approach, the provocative results, and the potential for pointing the way toward new insights in the field.
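
As a rough illustration of the brute force idea described in the abstract (list every possible input as a rule, then collapse to an M-of-N rule), the following toy sketch may help. It is not the authors' algorithm: the threshold unit, its weights and bias, and the names `m_of_n`, `unit_fires`, `table`, and `matching_m` are all hypothetical and chosen only for this example. It assumes the usual reading of an M-of-N rule, namely that the rule holds when at least M of its N conditions are true.

```python
from itertools import product


def m_of_n(m, conditions):
    """Return True when at least m of the given boolean conditions hold."""
    return sum(bool(c) for c in conditions) >= m


def unit_fires(inputs, weights, bias):
    """Hypothetical threshold unit: fires when the weighted sum plus bias is positive."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias > 0


# Hypothetical toy unit with three binary inputs (values assumed for illustration).
weights = [1.0, 1.0, 1.0]
bias = -1.5

# Brute force step: list every possible binary input with the unit's output.
table = {
    bits: unit_fires(bits, weights, bias)
    for bits in product((0, 1), repeat=len(weights))
}

# Collapse step: keep each M for which "at least M of the 3 inputs are on"
# reproduces the unit's output on every enumerated input.
matching_m = [
    m
    for m in range(len(weights) + 2)
    if all(m_of_n(m, bits) == fired for bits, fired in table.items())
]
```

For this toy unit the eight enumerated rows collapse to a single rule, "at least 2 of the 3 inputs are on". Exhaustive enumeration grows exponentially with the number of inputs, which is presumably why the abstract mentions a dedicated algorithm to make the approach finish in reasonable time.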