STATISTICS FOR LINGUISTICS

Mark Dras

The use of statistics has become increasingly common in a range of fields of study, either as a component of new methods or as a way of evaluating those methods. This is certainly true in computational linguistics, but knowledge of statistics is also useful in pure linguistics. This tutorial will have three broad parts. The first part will review some foundational notions in statistics, such as conditional probability, statistical distributions and hypothesis testing. The second part will look at how these can be applied to problems of interest to linguists, such as identifying collocations (say, the more natural "strong tea" versus "powerful tea") or measuring distributional similarity (say, finding the words most similar to "tea", according to various measures). The third part will give an overview of tools that might be of interest to linguists, such as morphological analysers, and the statistics behind them, and also of other applications in language technology that use statistics to approximate linguistic intuitions.

Materials