Speaking in Self-Assessing Tongues: On the Verbalized Confidence of LLMs in Machine Translation

Short summary

Researchers compared LLMs' internal confidence signals with five verbalized methods for extracting per-token certainty in machine translation. Both approaches showed similar reliability for error detection and calibration, though performance varied across models. Internal and verbalized methods showed little correlation, suggesting they measure different aspects of model confidence.

• Five verbalized confidence extraction methods developed as alternatives to internal model signals
• Both internal and verbalized approaches showed comparable reliability in error detection and calibration
• Little correlation found between internal and verbalized methods, indicating they measure different aspects

Generated with AI, which can make mistakes.

#research-breakthrough #ai-tools

Read full article at arXiv cs.CL

