Might be useful in relation to ongoing work...
==
Nat Genet. 2002 Jun;31(2):180-3. Epub 2002 May 6.
Clustering of housekeeping genes provides a unified model of gene order in the
human genome.
Lercher MJ, Urrutia AO, Hurst LD.
==
It is often supposed that, except for tandem duplicates, genes are randomly
distributed throughout the human genome. However, recent analyses suggest that
when all the genes expressed in a given tissue (notably placenta and skeletal
muscle) are examined, these genes do not map to random locations but instead
resolve to clusters. We have asked three questions: (i) is this clustering true
for most tissues, or are these the exceptions; (ii) is any clustering simply the
result of the expression of tandem duplicates; and (iii) how, if at all, does
this relate to the observed clustering of genes with high expression rates? We
provide a unified model of gene clustering that explains the previous
observations. We examined Serial Analysis of Gene Expression (SAGE) data for 14
tissues and found significant clustering, in each tissue, that persists even
after the removal of tandem duplicates. We confirmed clustering by analysis of
independent expressed-sequence tag (EST) data. We then tested the possibility
that the human genome is organized into subregions, each specializing in genes
needed in a given tissue. By comparing genes expressed in different tissues, we
show that this is not the case: those genes that seem to be tissue-specific in
their expression do not, as a rule, cluster. We report that genes that are
expressed in most tissues (housekeeping genes) show strong clustering. In
addition, we show that the apparent clustering of genes with high expression
rates is a consequence of the clustering of housekeeping genes.
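==
For anyone who wants to try this kind of test on our own data, here is a rough
sketch of one way to ask whether a set of genes is clustered along a
chromosome. The abstract does not describe the authors' actual statistics, so
everything here is an assumption: the statistic (number of neighbouring gene
pairs in which both genes are flagged), the null model (shuffling the flags
over the gene order) and the function names adjacent_pairs and
clustering_p_value are all made up for illustration. It also does not remove
tandem duplicates, which the paper says they did.

```python
import random


def adjacent_pairs(flags):
    """Count neighbouring gene pairs in which both genes are flagged.

    `flags` is a list of 0/1 values in chromosomal gene order
    (1 = gene belongs to the set of interest, e.g. housekeeping).
    """
    return sum(1 for a, b in zip(flags, flags[1:]) if a and b)


def clustering_p_value(flags, n_shuffles=2000, seed=0):
    """Empirical one-sided p-value for clustering of flagged genes.

    Assumed null model: flags are randomly permuted over the gene order.
    This is an illustrative choice, not the method used in the paper.
    """
    rng = random.Random(seed)
    observed = adjacent_pairs(flags)
    shuffled = list(flags)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if adjacent_pairs(shuffled) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least + 1) / (n_shuffles + 1)


# Toy chromosomes (invented data): 10 flagged genes among 40.
clustered = [1] * 10 + [0] * 30   # all flagged genes side by side
alternating = [1, 0] * 20         # flagged genes never adjacent

obs_clustered, p_clustered = clustering_p_value(clustered)
obs_alternating, p_alternating = clustering_p_value(alternating)
print(obs_clustered, p_clustered)
print(obs_alternating, p_alternating)
```

A small p-value means the flagged genes sit next to each other more often than
expected under shuffling. With real data, the flags would come from the
expression calls (SAGE or EST) and the order from the genome map.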