The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications. This article proposes computation of the Strahler number’s upper and lower limits…
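As a rough illustration of the quantity named above, the Strahler number of a single binary tree can be computed recursively. This is a minimal sketch, not the article's method for the upper and lower limits; the tree encoding (a node is `None` or a `(left, right)` tuple) and the function name `strahler` are assumptions made for this example.

```python
def strahler(node):
    """Strahler number of a binary tree.

    Assumed encoding (hypothetical, for illustration only):
    a node is None (absent) or a tuple (left, right); a leaf is (None, None).
    """
    if node is None:
        return 0  # an absent child contributes nothing
    left, right = node
    a, b = strahler(left), strahler(right)
    # Two branches of equal order merge into a branch one order higher;
    # otherwise the larger order carries through unchanged.
    return a + 1 if a == b else max(a, b)


leaf = (None, None)
balanced = ((leaf, leaf), (leaf, leaf))  # perfectly balanced, four leaves
skewed = (((leaf, leaf), leaf), leaf)    # a chain with single leaves attached

print(strahler(balanced), strahler(skewed))
```

A balanced tree gains one order per level, while a chain-like tree stays at a low order no matter how long it grows, which is why the number serves as a measure of branching complexity.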
computational linguistics
The correlation dimension of natural language is measured by applying the Grassberger-Procaccia algorithm to high-dimensional sequences produced by a large-scale language model. This method, previously studied only in a Euclidean…
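For orientation, the classic Grassberger-Procaccia estimate can be sketched for points in ordinary Euclidean space. This is a simplified illustration under stated assumptions: it uses Euclidean distance and toy data, not the language-model sequences or the distance used in the work summarized above, and the function names are hypothetical.

```python
import math
from itertools import combinations


def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    pairs = list(combinations(points, 2))
    close = sum(1 for p, q in pairs if math.dist(p, q) < r)
    return close / len(pairs)


def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) against log r over the given radii."""
    xs, ys = [], []
    for r in radii:
        c = correlation_integral(points, r)
        if c > 0:  # log is undefined when no pair falls within r
            xs.append(math.log(r))
            ys.append(math.log(c))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy data (assumed): 200 evenly spaced points on a line embedded in 3D.
line = [(i / 200, 0.0, 0.0) for i in range(200)]
radii = [0.0125, 0.0225, 0.0425, 0.0825]
print(correlation_dimension(line, radii))  # close to 1 for a line
```

The estimate depends on the range of radii chosen: the slope is only meaningful where log C(r) is roughly linear in log r, so in practice the scaling region has to be inspected rather than assumed.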
We explore the complexity underlying human symbolic sequences via entropy rate estimation. Consider the number of possibilities for a time series of length n, with a parameter h, as 2^{hn}. For a…
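To make the growth of roughly 2^{hn} possible sequences concrete, the sketch below counts sequences under a toy constraint and recovers h as log2(count)/n. The constraint (binary sequences with no two adjacent 1s) is an assumption chosen purely for illustration and is not taken from the study; the function names are hypothetical.

```python
import math


def count_sequences(n):
    """Count binary sequences of length n (n >= 1) with no two adjacent 1s."""
    end0, end1 = 1, 1  # length-1 sequences ending in 0 and ending in 1
    for _ in range(n - 1):
        # A 0 may follow anything; a 1 may only follow a 0.
        end0, end1 = end0 + end1, end0
    return end0 + end1


def growth_exponent(n):
    """The h for which the number of allowed sequences equals 2**(h * n)."""
    return math.log2(count_sequences(n)) / n


golden = (1 + math.sqrt(5)) / 2
for n in (10, 50, 200):
    print(n, growth_exponent(n))
print("limit:", math.log2(golden))
```

Without any constraint all 2^n binary sequences are possible and h = 1 bit per symbol. The constraint lowers h toward log2 of the golden ratio, which shows how structure in a sequence reduces the exponent.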
Real instances of social systems have a bursty character, meaning that events occur in a clustered manner. For example, the figure below shows how rare events occur over time in…
Various metrics are considered in terms of whether they characterize different kinds of data. For example, in the case of natural language, metrics that specify the author, language, or genre…
A generative model is a mathematical formulation that generates a sample similar to real data. Many such models have been proposed using machine learning methods, including deep learning. Study of…
We investigate the potential, limitations, and possible improvements of mathematical models of language in terms of whether they reproduce the complex properties of language. The nature of linguistic structure…