Engineering: Confidence scoring
Requirements for a confidence score
- Value between 0.0 and 1.0
- Interpretable as probability of correct decision
- Validate using histogram of probability estimates
- Probability estimate ratio should increase monotonically
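The validation step can be sketched as a simple reliability check. This is a minimal sketch, not the method used here: it assumes that "probability estimate ratio" means the fraction of correct decisions within each histogram bin, and the function names `reliability_table` and `is_monotonic` are hypothetical.

```python
def reliability_table(scores, correct, n_bins=10):
    """Bin scores in [0, 1]; return (count, fraction_correct) per bin.

    Assumption: the "ratio" is correct decisions / all decisions in a bin.
    Empty bins report None instead of a fraction.
    """
    counts = [0] * n_bins
    hits = [0] * n_bins
    for score, ok in zip(scores, correct):
        # A score of exactly 1.0 is placed in the last bin.
        b = min(int(score * n_bins), n_bins - 1)
        counts[b] += 1
        if ok:
            hits[b] += 1
    return [(c, h / c if c else None) for c, h in zip(counts, hits)]


def is_monotonic(table):
    """True if the fraction correct never decreases across non-empty bins."""
    fractions = [f for _, f in table if f is not None]
    return all(a <= b for a, b in zip(fractions, fractions[1:]))
```

A score that behaves like a probability should give a table whose fraction of correct decisions rises from the lowest bin to the highest.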
First try: Output of the MLP alone
- Almost all values fall between 0.0 and 1.0
- The few negative values can be clipped to zero
- Distribution of values is not even
- Probability estimate ratio does not increase monotonically
Second try: Hybrid combination of MLP and hard limits
- When hard limits make decision, set output to 1.0
- When MLP makes decision, use output as above
- About 60% of decisions are made by hard limits
- Distribution of values is not even
- Probability estimate ratio does not increase monotonically
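The two attempts above can be sketched as follows. This is an illustrative sketch only: `clip01` and `hybrid_confidence` are hypothetical names, and clipping values above 1.0 is an assumption (the slides mention only negative values).

```python
def clip01(x):
    """Clip a raw MLP output into [0.0, 1.0].

    Assumption: values above 1.0 are clipped as well as negative ones.
    """
    return max(0.0, min(1.0, x))


def hybrid_confidence(mlp_output, decided_by_hard_limit):
    """Second try: 1.0 when a hard limit decided, else the clipped MLP output."""
    if decided_by_hard_limit:
        return 1.0
    return clip01(mlp_output)
```

With about 60% of decisions made by hard limits, this scheme puts a large share of the scores at exactly 1.0, which fits the observation that the distribution of values is not even.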
Reasons for the inadequacy of MLP output
- Convex hull recursion judges dips, not peaks, to make decision
- Landmark data is not available for deletions (majority of errors)
- Optimized for error rate, not sum-of-squares error
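The last point can be illustrated with a toy example: two sets of outputs can have the same error rate yet very different sum-of-squares error, so optimizing for error rate says little about how well the raw outputs track probabilities. The data and the 0.5 decision threshold below are made up for illustration.

```python
def error_rate(outputs, targets, threshold=0.5):
    """Fraction of thresholded decisions that disagree with the 0/1 targets."""
    wrong = sum((o >= threshold) != bool(t) for o, t in zip(outputs, targets))
    return wrong / len(targets)


def sum_of_squares_error(outputs, targets):
    """Sum of squared differences between outputs and 0/1 targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))


# Hypothetical data: both output sets classify every example correctly.
targets = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.1, 0.2]
marginal = [0.55, 0.51, 0.49, 0.45]
```

Here both `confident` and `marginal` yield an error rate of zero, but `marginal` has a much larger sum-of-squares error because its outputs sit close to the threshold.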
Other possible measures
- Train separate MLP as confidence estimator
- Rework the convex hull algorithm
  - Avoid use of hard limits
  - Judge peaks instead of dips
- Alternative to convex hull algorithm