Abstract

The N400 component of event-related potentials (ERPs) is modulated by how predictable a word is, but predictability is usually quantified with lexical cloze: the probability that readers supply that exact word in an offline sentence-completion task. This form-based metric is at odds with decades of evidence that the N400 is primarily sensitive to meaning. Here, we asked whether a measure of semantic feature predictability, semantic cloze, can better account for N400 amplitude modulation. We reanalysed two independent EEG datasets (N = 26 and N = 334), computing lexical and semantic cloze for each critical word. In both datasets, semantic cloze was a better predictor of N400 amplitude than lexical cloze. Using the same materials, we then compared semantic and lexical cloze with word probabilities from four large language models (GPT-2, GPT-2.7b, RoBERTa, ALBERT). None of the LLM-derived predictors outperformed semantic cloze. Our findings support the view that the N400 primarily reflects semantic processing rather than exact-word processing. Methodologically, we argue that replacing lexical cloze with semantic cloze can substantially increase the explanatory power of N400 studies, and we caution against substituting raw LLM probabilities for human norms.