The genetic basis of many human diseases, especially those with
substantial genetic determinants,
has been identified. Notable examples include cystic fibrosis,
Huntington's disease and some forms
of cancer. However, the detection of genetic factors with more
modest effects, such as those in bipolar
disorder and the majority of cancers, has been more complicated. Standard
linkage analysis
procedures may not only have little power to detect such genes but also,
at best, narrow the
location of the disease susceptibility gene only to a rather large region.
Association studies are therefore
necessary to further unveil the aetiological relevance of these factors
to disease. However, the number
of tests required if such procedures were used in extended genome-wide
screens is prohibitive, and
as such, association studies have seen limited application, except in
the investigation of candidate
genes. In this paper, we discuss a logistic regression approach as a
generalization of this procedure so
that it can accommodate clusters of linked markers or candidate genes.
Furthermore, we introduce an expectation-maximization (E–M) algorithm
with which to estimate haplotype frequencies for
multiple locus systems with incomplete information on phase.
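
As a rough illustration of the kind of E–M iteration referred to above, the following minimal Python sketch estimates haplotype frequencies for the simplest case of two biallelic loci, where only double heterozygotes have unknown phase. It is not the algorithm developed in this paper: the function name `em_haplotype_freqs`, the genotype coding (0, 1 or 2 copies of allele "1" at each locus), the uniform starting values, and the assumptions of random mating and complete genotype data are all choices made for this example.

```python
def em_haplotype_freqs(genotype_counts, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    genotype_counts maps (gA, gB) -> number of individuals, where gA and gB
    are the counts (0, 1 or 2) of allele "1" at loci A and B (assumed coding).
    Returns a dict mapping haplotype (a, b) -> estimated frequency.
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    # Alleles carried on the two chromosomes for each genotype code.
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}

    fixed = {h: 0.0 for h in haps}  # haplotype counts with known phase
    n_double_het = 0                # individuals with ambiguous phase
    total = 0
    for (ga, gb), n in genotype_counts.items():
        total += n
        if ga == 1 and gb == 1:
            n_double_het += n
            continue
        # At least one locus is homozygous, so the phase is determined.
        a, b = alleles[ga], alleles[gb]
        fixed[(a[0], b[0])] += n
        fixed[(a[1], b[1])] += n
    if total == 0:
        raise ValueError("no genotypes supplied")

    freqs = {h: 0.25 for h in haps}  # uniform start (an assumption)
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11
        # rather than 01/10, given the current frequency estimates.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: update frequencies from the expected haplotype counts.
        new = {}
        for h in haps:
            share = w if h in ((0, 0), (1, 1)) else 1.0 - w
            new[h] = (fixed[h] + n_double_het * share) / (2 * total)

        delta = max(abs(new[h] - freqs[h]) for h in haps)
        freqs = new
        if delta < tol:
            break
    return freqs


if __name__ == "__main__":
    # Hypothetical counts, invented purely to exercise the function.
    counts = {(0, 0): 40, (1, 1): 30, (2, 2): 20, (0, 1): 5, (1, 0): 5}
    for hap, freq in em_haplotype_freqs(counts).items():
        print(hap, round(freq, 4))
```

With more than two loci or multiallelic markers the set of haplotype pairs compatible with each genotype grows quickly, so a general implementation would enumerate those pairs per individual rather than hard-code the double-heterozygote case as done here.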