Assessing Moral Judgment by Large Language Models – A Survey of Available Datasets

DOI: https://doi.org/10.3384/nejlt.2000-1533.2026.6366

Abstract

Recent advances in language modeling have contributed to a growing emphasis on machine ethics, including research on and assessment of the moral judgments made by large language models (LLMs). This paper provides a critical survey of the existing datasets for this assessment, with a special focus on their data sources. We address the current lack of theoretical grounding by providing an introduction to ethics and to different frameworks from moral philosophy and moral psychology. Moreover, we identify four main data sources: web-crawled corpora, scholars, laypeople, and synthetic data generation. By discussing the strengths and weaknesses of these sources, we analyze their implications for the assessment of moral judgment. Importantly, systematizing the available datasets reveals an over-reliance on previous work, which reinforces existing shortcomings. To address the current limitations, we recommend adopting consistent terminology and creating independently curated datasets based on interdisciplinary work. To ensure a clear delineation of normative approaches, we propose focusing on the assessment of moral consistency and certainty of LLMs as effective and well-defined indicators of their performance on moral judgment.

Published: 2026-06-10 (updated 2026-06-11)
