The Evolution of English: A Statistical Analysis of Language Change
Tracking Linguistic Change
Abstract
This paper examines how statistical techniques such as Markov chain models, regression analysis, and corpus linguistics methods can be applied to the evolution of English vocabulary and grammar. Using large historical text databases (e.g., Google Ngram, COHA, EEBO), we analyze changes in lexical frequency, grammatical simplification, and the interaction between sociolinguistic factors and language change. The key findings are that vocabulary growth follows exponential patterns, that grammar tends to become more analytic, and that external influences such as technology and migration accelerate language change. The work shows that quantitative methods can be, and need to be, combined with traditional philological methods in historical linguistics.
Keywords: Language change, statistical linguistics, Markov models, regression analysis, corpus linguistics, grammaticalization