Analysis of Variance for Genomic Sequences in Unbalanced Designs

Number:
20
Year:
2004
Authors:
Roberta de Souza
Hildete P. Pinheiro
Cibele Q. da Silva
Sérgio F. dos Reis
Abstract: 

In studies of genetic divergence among organisms, the analysis is generally performed directly on DNA sequences. At the nucleotide level, each observation is therefore categorical, taking one of four values. Light & Margolin (1971) developed an analysis of variance for categorical data (CATANOVA), and Pinheiro et al. (2000) employed a similar measure of variation to extend the CATANOVA procedure to several positions in the sequence for balanced designs. Here we consider a variable number of sequences in each group, that is, unbalanced samples. To test the null hypothesis of homogeneity among groups, we derive the asymptotic distribution of the test statistic and evaluate its power. The test is applied to real data, using resampling methods such as the bootstrap to generate the empirical distribution of the test statistic.
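
As a rough illustration of the kind of computation involved, the following Python sketch computes the classical single-site CATANOVA statistic of Light & Margolin (1971) for groups of unequal size, together with a simple bootstrap p-value. It is not the multi-position statistic studied in this report. The function names `catanova` and `bootstrap_pvalue`, the fixed four-category alphabet, and the pooled resampling scheme are assumptions made for the example.

```python
import random
from collections import Counter


def catanova(groups, n_categories=4):
    """Single-site CATANOVA in the style of Light & Margolin (1971).

    groups: list of lists of category labels (e.g. nucleotides at one site);
            group sizes may differ (unbalanced design).
    n_categories: assumed number of possible categories (4 for A, C, G, T).
    Returns (tss, wss, bss, c_stat, df).
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)

    # Total variation: n/2 - (1/(2n)) * sum_i n_i^2
    total_counts = Counter(pooled)
    tss = n / 2 - sum(c * c for c in total_counts.values()) / (2 * n)

    # Within-group variation: the same quantity computed per group, then summed
    wss = 0.0
    for g in groups:
        counts = Counter(g)
        wss += len(g) / 2 - sum(c * c for c in counts.values()) / (2 * len(g))

    bss = tss - wss  # between-group variation

    # C = (n - 1)(I - 1) * BSS / TSS, compared with a chi-square
    # distribution on (I - 1)(G - 1) degrees of freedom under homogeneity
    df = (n_categories - 1) * (len(groups) - 1)
    c_stat = (n - 1) * (n_categories - 1) * bss / tss if tss > 0 else 0.0
    return tss, wss, bss, c_stat, df


def bootstrap_pvalue(groups, n_boot=2000, seed=0):
    """Bootstrap p-value under homogeneity (assumed scheme: resample every
    group, at its original size, with replacement from the pooled sample)."""
    rng = random.Random(seed)
    pooled = [x for g in groups for x in g]
    observed = catanova(groups)[3]
    hits = 0
    for _ in range(n_boot):
        resampled = [rng.choices(pooled, k=len(g)) for g in groups]
        if catanova(resampled)[3] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive
    return (hits + 1) / (n_boot + 1)
```

For example, `catanova([list("AAAA"), list("CCCCCC")])` describes two completely separated groups of sizes 4 and 6, so all of the variation is between groups; `bootstrap_pvalue` can then be used in place of the chi-square approximation when the groups are small.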

Keywords: 
Analysis of variance
Bootstrap
Categorical data
Asymptotic distribution
Molecular data
Statistical genetics
Unbalanced designs