The primary goal of bioinformatics is to examine the biologically important information that living things store, use, and transfer, and to understand how this information controls the chemical environment within living organisms. This work has led to technological successes such as the rapid development of potent HIV-1 proteinase inhibitors, the development of new hybrid seeds and genetic variations for improved agriculture, and even a new understanding of prehistoric patterns of human migration. Before discussing useful algorithms for bioinformatic analysis, we must first cover some preliminary concepts that will allow us to speak a common language.
Deoxyribonucleic acid (DNA) is the genetic material that is passed down from parent to offspring. In eukaryotic cells, most DNA is stored in the nucleus, where it forms a complete set of instructions for the growth, development, and functioning of a living organism. As a macromolecule, DNA can encode a vast amount of information in a specific sequence of four bases: guanine (G), adenine (A), thymine (T), and cytosine (C). Each base is attached to a deoxyribose sugar, which in turn is linked to a phosphate group, and together the three form a nucleotide unit. The four different nucleotides are strung into long polynucleotide chains, which make up genes that are often thousands of bases long, and ultimately chromosomes.
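Because a DNA strand is fully described by the order of its bases, it is commonly represented in software as a string over the alphabet G, A, T, and C. The following is a minimal Python sketch of that idea, not a standard library interface: the names `BASES`, `validate_dna`, and `base_counts` are hypothetical and chosen only for illustration, and the sketch assumes a sequence containing only the four unambiguous bases.

```python
from collections import Counter

# The four DNA bases named in the text (illustrative constant).
BASES = "GATC"


def validate_dna(sequence: str) -> str:
    """Return the sequence in upper case, or raise if it contains
    any character other than G, A, T, or C."""
    seq = sequence.upper()
    invalid = set(seq) - set(BASES)
    if invalid:
        raise ValueError(f"invalid base(s): {sorted(invalid)}")
    return seq


def base_counts(sequence: str) -> dict:
    """Count how many times each of the four bases occurs."""
    counts = Counter(validate_dna(sequence))
    # Report all four bases, including those that never occur.
    return {base: counts.get(base, 0) for base in BASES}


if __name__ == "__main__":
    print(base_counts("GATTACA"))
```

With four possible bases at each position, a chain of n bases can take 4^n distinct forms, which is why even a short sequence can carry a large amount of information.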