|
A Definition of Document Analysis
Document and/or Text Analysis refers to computer-assisted analysis of large
numbers of documents in order to answer questions about the content of a
document set. The goal of document analysis is to determine content, locate
specific document types, or extract language features without the expense of
reading each document in a given set. Reliable document analysis is not a
wholly automated process. Initially, computers are used to perform large numbers
of comparisons that exploit linguistic differences between a norm and an
experimental document or document set. These comparisons are then subjected to
statistical analysis, which helps reduce the data to a manageable level
for interpretation by an experienced linguist. That is, the primary analysis is
done by the linguist, who uses a computer to reduce the data to an interpretable
level. Once the primary analysis is done, it is often possible to develop more
automated algorithms.
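
The comparison and data-reduction steps described above can be illustrated with a minimal sketch. It assumes that the "linguistic difference" being exploited is simple word frequency and that the statistic is a log-likelihood (G2) score; the text names neither, so both are illustrative choices. The function names (tokenize, log_likelihood, distinctive_words) and the top_n cutoff are hypothetical. The output is a short ranked list for a linguist to interpret, not a finished analysis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(count_a, count_b, total_a, total_b):
    """Log-likelihood (G2) score for one word's counts in two document sets.

    Assumption: G2 stands in for whatever statistic the analyst prefers.
    A score of 0 means the word is used at the same rate in both sets.
    """
    combined = count_a + count_b
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    g2 = 0.0
    if count_a > 0:
        g2 += count_a * math.log(count_a / expected_a)
    if count_b > 0:
        g2 += count_b * math.log(count_b / expected_b)
    return 2.0 * g2


def distinctive_words(norm_text, experimental_text, top_n=5):
    """Compare an experimental text against a norm and keep the top_n words
    whose frequencies differ most, reducing the data for manual review.

    Returns (word, score, overused) tuples, where overused is True when the
    word is relatively more frequent in the experimental text than the norm.
    """
    norm = Counter(tokenize(norm_text))
    experimental = Counter(tokenize(experimental_text))
    total_norm = sum(norm.values())
    total_exp = sum(experimental.values())
    if total_norm == 0 or total_exp == 0:
        raise ValueError("both texts must contain at least one word")

    scored = []
    for word in set(norm) | set(experimental):
        score = log_likelihood(
            experimental[word], norm[word], total_exp, total_norm
        )
        overused = experimental[word] / total_exp > norm[word] / total_norm
        scored.append((word, score, overused))

    # Highest scores first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


if __name__ == "__main__":
    norm_sample = "the cat sat on the mat " * 10
    experimental_sample = "the contract shall terminate on the date " * 10
    for word, score, overused in distinctive_words(norm_sample, experimental_sample):
        direction = "overused" if overused else "underused"
        print(f"{word:12s} {score:8.2f} {direction}")
```

Words shared at similar rates by both texts score near zero and drop out of the list, which is the reduction step; deciding what the surviving differences mean remains the linguist's job.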
|