State-of-the-art NLP benchmarks require interpretation of natural language that specifies conditions, procedures, and exceptions, often relying on implicit assumptions and external knowledge. Constructing complete semantic representations with proof-theoretic guarantees is…
Inference, Reasoning
3 Articles
3
Evaluating whether large language models (LLMs) capture the structure of natural language beyond local fluency remains an open challenge. Existing evaluation methods, largely based on task performance or short-context behavior,…
Documents have complexity from various perspectives, such as compression rate and the degree of fluctuation. The complexity varies depending on the extent to which the document is based on “inference.”…