Artificial intelligence and knowledge based systems in molecular biology*

Published online by Cambridge University Press:  07 July 2009

John Fox
Affiliation:
Advanced Computation Laboratory
Christopher J. Rawlings
Affiliation:
Biomedical Informatics Unit, Imperial Cancer Research Fund, PO Box 123, Lincoln's Inn Fields, London WC2A 3PX, UK

Abstract

Over the last ten years, molecular biologists and computer scientists have experimented with various artificial intelligence techniques, notably knowledge based and expert systems, qualitative simulation, natural language processing and machine learning. These techniques have been applied to problems in molecular data analysis, construction of advanced databases and modelling of biological systems. Practical results are now being obtained, notably in the representation and recognition of genetically significant structures, the assembly of genetic maps and the prediction of the structure of complex molecules such as proteins. The paper outlines the principal methods used, surveys the findings to date, and identifies promising trends and current limitations.

Type
Research Article
Copyright
Copyright © Cambridge University Press 1994
