Judges with the same conviction rate might choose to convict different defendants, which violates an important standard of court quality. I show how judge disagreement can be nonparametrically bounded using information on defendant characteristics or on other court decisions in the same cases. I implement the procedure for a Canadian refugee appeal court and bound disagreement for the average pair of similarly severe judges at 10% of all cases. This is more than the disagreement attributable to cross-judge variation in leniency, and it is large relative to the overall approval rate of only 14%. I aggregate judge-pair disagreement into a judge-specific measure of decision quality that I call consistency, and build a structural model to study the judge and institutional characteristics associated with it. Finally, I show how inconsistency implies failure of the monotonicity assumption in examiner-assignment IV designs, and adapt my bounding method into a test that is more powerful than current approaches.
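
The bounding logic can be illustrated with a minimal sketch, which is not necessarily the exact estimator used here. The notation is assumed for illustration: $D_j \in \{0,1\}$ is the decision judge $j$ would make on a given case, $X$ is a vector of observed case characteristics, and $p_j(x) = \Pr(D_j = 1 \mid X = x)$ is judge $j$'s approval rate within cell $x$, which is identified when cases are assigned to judges as good as randomly. Two binary decisions must differ at least as often as their marginal rates differ, so for any pair of judges $j$ and $k$:

\[
\Pr(D_j \neq D_k) \;=\; \sum_x \Pr(X = x)\,\Pr(D_j \neq D_k \mid X = x) \;\geq\; \sum_x \Pr(X = x)\,\bigl|\,p_j(x) - p_k(x)\,\bigr| .
\]

The right-hand side can be strictly positive even when the two judges have identical overall rates, because differences of opposite sign across cells do not cancel inside the absolute value.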