A1 Refereed original research article in a scientific journal
Running WILD: the case for exploring mixed parameter sets in sensitivity analysis
Authors: Sharma PP, Vahtera V, Kawauchi GY, Giribet G
Publisher: WILEY-BLACKWELL
Publication year: 2011
Journal: Cladistics
Journal name in source: CLADISTICS
Journal acronym: CLADISTICS
Volume: 27
Issue: 5
First page: 538
Last page: 549
Number of pages: 12
ISSN: 0748-3007
DOI: https://doi.org/10.1111/j.1096-0031.2010.00345.x
Abstract
The robustness of clades to parameter variation may be a desirable quality or even a goal in phylogenetic analyses. Sensitivity analyses used to assess clade stability have invoked the incongruence length difference (ILD or WILD) metric, a measure of congruence among datasets, to compare a series of most-parsimonious results from re-running analyses under different analytical conditions. It is also common practice to select a single "optimal" parameter set that minimizes WILD across all parameter sets. However, the divergent molecular evolution of ribosomal genes and protein-encoding genes, specifically the bias against transversion events in coding genes of conserved function, suggests that deployment of multiple parameter sets could outperform the use of a single parameter set applied to all molecules. We explored congruence in five published datasets by including mixed parameter sets in our sensitivity analysis. In four cases, mixed parameter sets outperformed the previously reported, single optimal parameter set. Conversely, multiple parameter sets did not outperform a single optimal parameter set in a case in which actual strong topological conflict exists between data partitions. Exploration of mixed parameter sets may prove useful when combining ribosomal and protein-encoding genes, due to the relatively higher frequency of single- and double-base-pair indel events in the former, and the relatively lower frequency of transversions in the latter. © The Willi Hennig Society 2010.
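The selection step described in the abstract, choosing the parameter set that minimizes the incongruence metric, can be illustrated with a minimal sketch. It assumes the commonly cited form of the ILD, (combined tree length minus the sum of the individual partition tree lengths) divided by the combined tree length, applied to weighted tree lengths for WILD. The parameter-set labels and all tree lengths below are made-up placeholders, not values from the article, and the function name `ild` is hypothetical.

```python
def ild(combined_length, partition_lengths):
    """Incongruence length difference, assuming the form
    (L_combined - sum(L_i)) / L_combined."""
    if combined_length <= 0:
        raise ValueError("combined tree length must be positive")
    return (combined_length - sum(partition_lengths)) / combined_length


# Hypothetical results: label -> (combined tree length, per-partition tree lengths).
# These numbers are illustrative only and do not come from any dataset.
results = {
    "111": (1000, [400, 550]),
    "211": (1500, [620, 850]),
    "121": (1300, [500, 760]),
}

# Score every parameter set and pick the one with the lowest incongruence.
scores = {label: ild(combined, parts) for label, (combined, parts) in results.items()}
best = min(scores, key=scores.get)

for label, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{label}: {score:.4f}")
print("lowest incongruence:", best)
```

A mixed parameter set would enter such a comparison as one more entry in `results`, with each partition's tree length obtained under its own costs; how the article computes those lengths is not specified in this record.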