State-of-the-art NLP benchmarks require interpretation of natural language that specifies conditions, procedures, and exceptions, often relying on implicit assumptions and external knowledge. Constructing complete semantic…
Language
Evaluating whether large language models (LLMs) capture the structure of natural language beyond local fluency remains an open challenge. Existing evaluation methods, largely based on…
Large language models (LLMs) such as ChatGPT are increasingly used in the cultural heritage domain for tasks like metadata creation, semantic enrichment, and artwork captioning….
Mode collapse is a persistent challenge in generative modeling and appears in autoregressive text generation as behaviors ranging from explicit looping to gradual loss of…
Heading Large language models (LLMs) have achieved remarkable progress in naturallanguage generation, yet they continue to display puzzling behaviors—such asrepetition and incoherence—even when exhibiting low…
This paper proposes formulating Zipf’s meaning-frequency law, the power law between word frequency and the number of meanings, as a relationship between word frequency and…
Clustering is a fundamental technique in machine learning and data mining, offering a powerful lens to understand self-organizing patterns in the real world. At its…
Documents have complexity from various perspectives, such as compression rate and the degree of fluctuation. The complexity varies depending on the extent to which the…
We proposed a stock vector representation called “stock embedding,” obtained using a deep learning framework that utilizes news articles and stock price history. This embedding…
The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications. This article proposes computation of the Strahler…