In Silico Knowledge and Content Tracking
This chapter gives a brief overview of text-mining techniques for extracting knowledge from large text collections. It describes the basic pipeline that leads from text to relationships between biological concepts, and the problems encountered at each step of that pipeline. First, we explain how words in text are recognized as concepts. Second, concepts are associated with each other using 2×2 contingency tables and test statistics. Third, we explain how indirect links between concepts can be extracted from the direct links found in the 2×2 table analyses; we call this implicit information extraction. Fourth, we discuss techniques for validating a text-mining system, such as ROC curves and retrospective studies. We conclude by examining how text information can be combined with non-textual data sources, such as microarray expression data, and what the future directions are for text mining on the Internet.
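The second step, associating two concepts through a 2×2 contingency table and a test statistic, can be illustrated with a minimal sketch. It assumes each document has already been reduced to a set of recognized concepts, and it uses the Pearson chi-square statistic as one possible test statistic; the abstract does not say which statistic the chapter uses. The function names and the toy concept labels are hypothetical.

```python
def contingency_table(docs, a, b):
    """Count documents by presence/absence of concepts a and b.

    docs is an iterable of sets of concept identifiers (an assumed format).
    Returns (n11, n10, n01, n00): both present, only a, only b, neither.
    """
    n11 = n10 = n01 = n00 = 0
    for concepts in docs:
        has_a, has_b = a in concepts, b in concepts
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def chi_square(n11, n10, n01, n00):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    n = n11 + n10 + n01 + n00
    # Product of the four marginal totals; zero if any row or column is empty.
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


# Toy corpus with hypothetical concept labels.
docs = [{"geneA", "diseaseB"}] * 3 + [{"geneA"}] + [{"diseaseB"}] + [set()] * 3
table = contingency_table(docs, "geneA", "diseaseB")
score = chi_square(*table)
```

A larger statistic indicates that the two concepts co-occur more (or less) often than expected under independence. For small counts an exact test may be more appropriate than chi-square.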