A New Formulation of Zipf’s Meaning-Frequency Law through Contextual Diversity
This paper proposes formulating Zipf’s meaning-frequency law, the power law between word frequency and the number of meanings, as a…
Featured
Clustering is a fundamental technique in machine learning and data mining, offering a powerful lens to understand self-organizing patterns in the real world. At its core, clustering is inherently information-theoretic:…
The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications. This article proposes computation of the Strahler number’s upper and lower limits…
The correlation dimension of natural language is measured by applying the Grassberger-Procaccia algorithm to high-dimensional sequences produced by a large-scale language model. This method, previously studied only in a Euclidean…
The theory of econophysics reveals the scaling properties of price, which explain why markets crash much more frequently than expected. A challenge of financial market modeling is to characterize the…
How grammatically complex are adults’ utterances as compared with those of children? How is a literary text more structurally complex than a Wikipedia source? How can such complexity be compared…
State-of-the-art word embedding methods represent a word with a single vector and presume a linear vector space, which does not easily incorporate nonlinearity that is necessary to, for example, represent…