We show that a GPT-3 model can learn to express uncertainty about its own answers in natural language, without use of model logits. When given a question, the model generates both an answer and a level of confidence (e.g. “90% confidence” or “high confidence”). These levels map to probabilities that are well calibrated. The model also remains moderately calibrated under distribution shift, and is sensitive to uncertainty in its own answers, rather than imitating human examples. To our knowledge, this is the first time a model has been shown to express calibrated uncertainty about its own answers in natural language. For testing calibration, we introduce the CalibratedMath suite of tasks. We compare the calibration of uncertainty expressed in words (“verbalized probability”) to uncertainty extracted from model logits. Both kinds of uncertainty are capable of generalizing calibration under distribution shift. We also provide evidence that GPT-3’s ability to generalize calibration depends on pre-trained latent representations that correlate with epistemic uncertainty over its answers.
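To make the evaluation concrete, the sketch below illustrates one way verbalized confidence levels (either numeric strings like “90%” or words like “high confidence”) could be mapped to probabilities and scored for calibration. This is a minimal illustration, not the paper's code: the word-to-probability mapping and the use of expected calibration error are assumptions chosen for clarity.

```python
# Minimal sketch (assumptions, not the paper's implementation) of mapping
# verbalized confidence to probabilities and measuring calibration.
from typing import List

# Hypothetical mapping from confidence words to probabilities.
WORD_TO_PROB = {"lowest": 0.05, "low": 0.25, "medium": 0.50,
                "high": 0.75, "highest": 0.95}

def parse_confidence(text: str) -> float:
    """Convert a verbalized confidence (e.g. '90%' or 'high') to a probability."""
    text = text.strip().lower().rstrip(".")
    if text.endswith("%"):
        return float(text.rstrip("%")) / 100.0
    return WORD_TO_PROB[text]

def expected_calibration_error(probs: List[float], correct: List[bool],
                               n_bins: int = 10) -> float:
    """Bin answers by stated confidence and compare confidence to accuracy."""
    bins = [[] for _ in range(n_bins)]
    for p, c in zip(probs, correct):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, c))
    total = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(p for p, _ in b) / len(b)
        accuracy = sum(c for _, c in b) / len(b)
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece

# Toy usage: stated confidences and whether the answers were correct.
stated = ["90%", "high", "low", "75%"]
was_correct = [True, True, False, True]
probs = [parse_confidence(s) for s in stated]
print(expected_calibration_error(probs, was_correct))
```

A model is well calibrated under this metric when, within each bin, its stated confidence matches its empirical accuracy; the same check can be applied to probabilities derived from model logits for comparison.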