Published online by Cambridge University Press: 08 September 2016
We propose a simple model for the interaction between gene candidates in the two strands of bacterial DNA (deoxyribonucleic acid). Our model assumes that ‘final’ genes appear in one of the two strands, that they do not overlap (in bacteria only a small percentage of genes overlap), and that the final genes maximize the occupancy rate, defined as the proportion of the genome occupied by coding zones. We are more concerned with describing the organization and distribution of genes in bacterial DNA than with the very hard problem of identifying genes. To this end, we propose an algorithm that selects the final genes according to this maximization criterion. We study the graphical and probabilistic properties of the model obtained by applying the maximization procedure to a Markovian representation of the genic and intergenic zones within the DNA strands, develop theoretical bounds on the occupancy rate (which, in our view, is a rather intractable quantity), and use the model to compute quantities of relevance to the Escherichia coli genome, which we compare with annotation data. Although this work focuses on genomic modelling, the proposed model is not restricted to this setting: it can also serve to model other resource allocation problems.
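The maximization criterion can be illustrated with a small sketch. The abstract does not spell out the selection algorithm, so the code below is not the authors' procedure: it swaps in a standard weighted interval scheduling dynamic program, under the assumption that gene candidates from both strands are given as half-open intervals `[start, end)` on a common coordinate axis and that each candidate is weighted by its length. The names `Candidate` and `select_final_genes` and the toy coordinates are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical representation: (start, end, strand), half-open [start, end).
Candidate = Tuple[int, int, str]


def select_final_genes(
    candidates: List[Candidate], genome_length: int
) -> Tuple[float, List[Candidate]]:
    """Pick non-overlapping candidates (from either strand) that maximize
    the covered length; return (occupancy rate, chosen candidates)."""
    cands = sorted(candidates, key=lambda c: c[1])  # sort by end position
    ends = [c[1] for c in cands]
    n = len(cands)
    best = [0] * (n + 1)      # best[i]: max coverage using the first i candidates
    take = [False] * (n + 1)  # take[i]: whether candidate i is used in best[i]
    prev = [0] * (n + 1)      # prev[i]: number of earlier compatible candidates
    for i, (start, end, _strand) in enumerate(cands, start=1):
        # Count candidates among the first i-1 that end at or before `start`.
        p = bisect_right(ends, start, 0, i - 1)
        with_i = best[p] + (end - start)
        if with_i > best[i - 1]:
            best[i], take[i], prev[i] = with_i, True, p
        else:
            best[i] = best[i - 1]
    # Backtrack to recover one optimal selection.
    chosen: List[Candidate] = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(cands[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return best[n] / genome_length, chosen


# Toy data (invented for illustration): three candidates on a genome of length 100.
toy = [(0, 40, "+"), (30, 70, "-"), (60, 100, "+")]
rate, genes = select_final_genes(toy, genome_length=100)
print(rate)   # 0.8
print(genes)  # [(0, 40, '+'), (60, 100, '+')]
```

In this toy case the candidate on the reverse strand overlaps both forward-strand candidates, so keeping the two forward-strand genes gives the larger occupancy rate. When several selections tie, the sketch returns only one of them.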