Text and Language Technology Group

Definition: Document or text analysis is the computer-assisted analysis of large numbers of documents in order to answer questions about the content of a document set. The goal is to determine content, locate specific document types, or extract language features without the expense of reading each document in the set. Reliable document analysis is not a wholly automated process. Initially, computers perform large numbers of comparisons that exploit linguistic differences between a norm and an experimental document or document set. These comparisons are then subjected to statistical analysis, which reduces the data to a manageable level for interpretation by an experienced linguist. That is, the primary analysis is done by the linguist, who uses the computer to reduce the data to an interpretable level. Once the primary analysis is complete, it is often possible to develop more automated algorithms.
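
As an illustration of the comparison step described above, the following is a minimal sketch that scores how strongly each word's frequency in an experimental text departs from its frequency in a norm text. It assumes a log-likelihood (G2) statistic as the measure; the source does not say which statistics TLTG uses, and the function names (tokenize, log_likelihood, keyness) are hypothetical. The ranked output is the kind of reduced data a linguist would then interpret.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood (G2) for one word.

    a, b are the word's counts in the two texts; total_a, total_b are
    the total token counts of those texts. Assumed statistic, not
    necessarily the one used by TLTG.
    """
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_a)
    if b > 0:
        g2 += b * math.log(b / expected_b)
    return 2.0 * g2


def keyness(norm_text, experimental_text, top_n=10):
    """Rank words by how far the experimental text departs from the norm.

    Returns (word, score, overused) tuples, where overused is True when
    the word is relatively more frequent in the experimental text.
    """
    norm = Counter(tokenize(norm_text))
    exp = Counter(tokenize(experimental_text))
    total_norm = sum(norm.values())
    total_exp = sum(exp.values())
    if total_norm == 0 or total_exp == 0:
        raise ValueError("both texts must contain at least one word")

    scores = []
    for word in set(norm) | set(exp):
        score = log_likelihood(exp[word], norm[word], total_exp, total_norm)
        overused = exp[word] / total_exp > norm[word] / total_norm
        scores.append((word, score, overused))

    # Highest scores first: these are the words most worth a linguist's attention.
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]
```

In practice the norm would be a large reference corpus rather than a single short text, and the ranked list would be a starting point for interpretation, not a conclusion.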

Examples: A company may want to classify incoming email according to the writer's intent (suggestion, complaint, praise, inquiry). Or a company may want to scan internal email routinely for liability reasons, extracting potentially damaging or incriminating language for further examination. Professional text analysis makes these tasks both manageable and practical.
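
The email classification example above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the cue-word lists in INTENT_CUES are hypothetical placeholders, not TLTG's criteria, and a real classifier would be built from the linguist's primary analysis of actual messages. The four intent labels are the ones named in the example.

```python
import re

# Hypothetical cue words for each intent. In a real project these would
# come out of the linguist's analysis of a sample of actual email.
INTENT_CUES = {
    "suggestion": {"suggest", "recommend", "could", "consider", "idea"},
    "complaint": {"broken", "disappointed", "refund", "unacceptable", "complaint"},
    "praise": {"thank", "thanks", "great", "excellent", "love"},
    "inquiry": {"how", "when", "where", "what", "question"},
}


def classify_intent(message):
    """Return the intent whose cue words occur most often in the message.

    Returns "unclassified" when no cue word is found. Ties go to the
    intent listed first in INTENT_CUES.
    """
    tokens = re.findall(r"[a-z']+", message.lower())
    scores = {
        intent: sum(1 for token in tokens if token in cues)
        for intent, cues in INTENT_CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"
```

Messages that come back as "unclassified", or that score on several intents at once, are the ones a human reader would need to examine.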

Services Offered: TLTG is prepared to provide text analysis services to industry, legal, and security teams seeking rapid evaluation of large document sets. Our services include the following:



700 Oglethorpe Ave. • Athens, Georgia 30606 • Phone: 706-549-5519 • Fax: 706-549-1228