A New Formulation of Zipf’s Meaning-Frequency Law through Contextual Diversity
This paper proposes formulating Zipf’s meaning-frequency law, the power law between word frequency and the number of meanings, as a…
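As a hedged illustration of the law being reformulated: the classic meaning-frequency law states that the number of meanings m of a word grows as a power of its frequency f, m ∝ f^γ, with Zipf's own estimate γ ≈ 0.5. The exponent can be estimated from counts by log-log regression, sketched below on synthetic data (not a real corpus, and not the paper's proposed contextual-diversity formulation).

```python
import numpy as np

# Synthetic illustration of Zipf's meaning-frequency law: the number of
# meanings m of a word grows as a power of its frequency f, m ∝ f^gamma.
# Zipf's classic estimate is gamma ≈ 0.5; the data below are generated,
# not drawn from a real corpus.
rng = np.random.default_rng(0)
freqs = rng.integers(1, 100_000, size=1_000).astype(float)
true_gamma = 0.5
meanings = np.maximum(1, np.round(freqs ** true_gamma))

# Estimate the exponent by least squares in log-log space:
# log m = gamma * log f + c.
gamma_hat, c = np.polyfit(np.log(freqs), np.log(meanings), 1)
print(f"estimated exponent: {gamma_hat:.2f}")
```

On this synthetic data the fitted exponent recovers the generating value of 0.5 up to rounding noise.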
Clustering is a fundamental technique in machine learning and data mining, offering a powerful lens to understand self-organizing patterns in the real world. At its core, clustering is inherently information-theoretic:…
This paper presents a novel machine learning model for realized volatility (RV) prediction using a normalizing flow, an invertible neural network. Since RV is known to be skewed and have a…
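A minimal sketch of the underlying idea, not the paper's model: a normalizing flow transforms data through an invertible map toward a simple base density, with the change-of-variables log-determinant correcting the density. The one-layer affine flow below uses hand-picked (hypothetical) parameters and only illustrates invertibility and the density correction; a real RV model would stack learned layers.

```python
import numpy as np

# Minimal one-layer affine normalizing flow (illustrative only).
# Forward:  z = (x - mu) / sigma   maps data toward a base Gaussian.
# Inverse:  x = mu + sigma * z     exactly recovers the input.
# log|det dz/dx| = -log(sigma)     corrects the density under the
# change of variables. Parameters here are fixed, not learned.
mu, sigma = 0.1, 2.0  # hypothetical parameters

def forward(x):
    z = (x - mu) / sigma
    log_det = -np.log(sigma) * np.ones_like(x)
    return z, log_det

def inverse(z):
    return mu + sigma * z

def log_prob(x):
    # Density of x under the flow: base Gaussian log-density of z
    # plus the log-determinant of the forward transform.
    z, log_det = forward(x)
    base = -0.5 * (z**2 + np.log(2 * np.pi))
    return base + log_det

x = np.array([0.5, 1.0, 3.0])  # e.g. daily realized volatilities
z, _ = forward(x)
assert np.allclose(inverse(z), x)  # invertibility holds exactly
```

Because the map is invertible, sampling (draw z from the base Gaussian, apply `inverse`) and exact likelihood evaluation (`log_prob`) are both available, which is what makes flows attractive for skewed targets like RV.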
We apply an information-theoretic perspective to reconsider generative document retrieval (GDR), in which a document x∈X is indexed by t∈T, and a neural autoregressive model is trained to map queries Q to T. GDR…
Templates are multi-word expressions with slots, such as “Starting at _ on _” or “regard _ as _”, that appear frequently in text and also in data from sources…
A generative model is a mathematical formulation that generates samples resembling real data. Many such models have been proposed using machine learning methods, including deep learning. Study of…
The potential, limitations, and possible improvements of mathematical models of language are investigated in terms of whether they reproduce the complex properties of language. The nature of linguistic structure…
State-of-the-art word embedding methods represent a word with a single vector and presume a linear vector space, which does not easily incorporate the nonlinearity necessary to, for example, represent…