Analysis of Variance for Binary Data in Unbalanced Designs

Roberta de Souza
Hildete P. Pinheiro
Cibele Q. da Silva
Sérgio F. dos Reis

In the study of genetic divergence among organisms, the analysis is generally done directly from the DNA molecule. Therefore, a possible outcome is binary (dominant or recessive phenotype). Comparison of groups of molecular data is of great interest in molecular genetics and evolutionary biology. Some work has been done on analysis of variance for genetic data (Weir, 1990; Pinheiro et al., 2000; Pinheiro et al., 2001; Pinheiro et al., 2002 and others). Weir (1990) proposed a measure of genetic diversity, the heterozygosity, and developed an analysis of variance for binary data in a balanced design. Here, we extend the work of Weir by developing an analysis of variance for binary data with the purpose of comparing groups in unbalanced designs. To test the null hypothesis of homogeneity among groups, we derive the asymptotic distribution of the test statistic. An application of the test to real data is illustrated, using resampling methods such as the bootstrap to generate the empirical distribution of the test statistic.

Analysis of variance
Binary data
Asymptotic distribution
Molecular data
Statistical genetics
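
The resampling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the test statistic developed in the paper: as a stand-in, it uses a Pearson-type between-group statistic for 0/1 data and approximates its null distribution by resampling from the pooled sample while keeping the unequal group sizes. The function names, the data, and the choice of statistic are all assumptions made for illustration only.

```python
import random


def homogeneity_stat(groups):
    """Pearson-type between-group statistic for 0/1 data.

    Assumption: this is a stand-in for illustration, not the statistic
    derived in the paper.
    """
    pooled = [x for g in groups for x in g]
    p = sum(pooled) / len(pooled)
    if p == 0.0 or p == 1.0:
        return 0.0  # no variation at all, so no evidence of heterogeneity
    between = sum(len(g) * (sum(g) / len(g) - p) ** 2 for g in groups)
    return between / (p * (1 - p))


def bootstrap_pvalue(groups, n_boot=2000, seed=0):
    """Approximate the null distribution by resampling the pooled data."""
    rng = random.Random(seed)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]  # unbalanced group sizes are preserved
    observed = homogeneity_stat(groups)
    exceed = 0
    for _ in range(n_boot):
        # Under homogeneity every group is drawn from the same pooled sample.
        resampled = [rng.choices(pooled, k=n) for n in sizes]
        if homogeneity_stat(resampled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)


# Hypothetical unbalanced data: three groups of sizes 15, 25 and 8.
groups = [[1] * 12 + [0] * 3, [1] * 5 + [0] * 20, [1] * 4 + [0] * 4]
observed, p_value = bootstrap_pvalue(groups)
```

Resampling from the pooled data imposes the null hypothesis of homogeneity, so the proportion of resampled statistics at least as large as the observed one serves as an empirical p-value.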