The structure of a protein can be considered at four levels. The primary structure, comprising the sequence of amino acids in the chain(s), is the subject of this chapter. Secondary, tertiary and quaternary structures are described in Chapter 2.
The determination of the primary structure of insulin by Sanger in the early 1950s evoked great excitement and earned him the first of two Nobel prizes. Some of this chapter is nevertheless largely of historical interest, since Sanger earned his second Nobel prize by developing a rapid method for sequencing the DNA that codes for proteins. Only twenty amino acids are coded for by DNA (see Chapter 8); related amino acids may arise in peptides and proteins by post-translational modification. A primary structure deduced from the DNA sequence therefore provides no information about post-translational modification, and these details must be determined by the classical methods of amino-acid sequencing described in this chapter.
Emil Fischer's suggestion at the beginning of the twentieth century that proteins are composed of amino acids linked through peptide bonds (–CONH–), in which the –CO– and –NH– moieties originate from the carboxy and amino groups of consecutive amino acids, has been fully vindicated by synthetic, degradative and X-ray crystallographic techniques. Other covalent bonds also link amino-acid residues in peptides and proteins. The commonest is the disulphide bond of cystine, which is formed by oxidation of the thiol groups of two cysteine residues.
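The two bond-forming processes described above can be summarised as formal net equations. This is a simplified sketch only: the side-chain labels R^1 and R^2, the generic oxidant [O] and the neglect of ionisation states are assumptions made here for illustration, and the equations show overall stoichiometry, not the route by which the bonds are formed in the cell or in the laboratory.

```latex
\begin{align*}
% Peptide bond: the carboxy group of one amino acid and the amino group
% of the next combine with formal loss of water.
\mathrm{H_2N{-}CHR^1{-}COOH} + \mathrm{H_2N{-}CHR^2{-}COOH}
  &\longrightarrow
  \mathrm{H_2N{-}CHR^1{-}CO{-}NH{-}CHR^2{-}COOH} + \mathrm{H_2O} \\
% Disulphide bond: two cysteine thiol groups (written R-SH) are oxidised
% to the disulphide of cystine.
2\,\mathrm{R{-}SH} + [\mathrm{O}]
  &\longrightarrow
  \mathrm{R{-}S{-}S{-}R} + \mathrm{H_2O}
\end{align*}
```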