ICML 2025 Rebuttal

Fig 1: Lowest CSL scores on CIFAR-100 capture easy-to-learn, typical (likely generalized) examples. Visualized for the first 10 classes.
Fig 2: Highest CSL scores on CIFAR-100 capture hard-to-learn, atypical (likely memorized) examples. Visualized for the first 10 classes.
Fig 3: Lowest CSL scores on ImageNet capture easy-to-learn, typical (likely generalized) examples. Visualized for the first 10 classes.
Fig 4: Highest CSL scores on ImageNet capture hard-to-learn, atypical (likely memorized) examples. Visualized for the first 10 classes.
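For concreteness, a minimal sketch of how the per-class examples in Figs. 1-4 can be selected, assuming per-sample CSL scores are available as an array aligned with dataset indices (the function name and the value of `k` are illustrative, not part of the method):

```python
import numpy as np

def extreme_csl_examples(csl_scores, labels, num_classes=10, k=8):
    """Return per-class indices of the k lowest- and highest-CSL examples.

    csl_scores : (N,) array of per-sample CSL scores.
    labels     : (N,) array of integer class labels.
    """
    lowest, highest = {}, {}
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        order = idx[np.argsort(csl_scores[idx])]
        lowest[c] = order[:k]    # easy-to-learn / typical candidates
        highest[c] = order[-k:]  # hard-to-learn / atypical candidates
    return lowest, highest
```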
Fig 5: Histogram of CSL scores on CIFAR-100; the distribution exhibits a long tail.
Fig 6: Histogram of CSL scores on ImageNet; the distribution exhibits a long tail.
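The histograms in Figs. 5-6 can be reproduced from the same per-sample scores; the sketch below is illustrative only, and the file name `csl_scores_cifar100.npy` is hypothetical:

```python
import matplotlib.pyplot as plt
import numpy as np

# Per-sample CSL scores; in practice these are accumulated over the
# training trajectory. The file name here is a placeholder.
csl_scores = np.load("csl_scores_cifar100.npy")

plt.hist(csl_scores, bins=200)
plt.yscale("log")  # a log-scaled count axis makes the long tail visible
plt.xlabel("CSL score")
plt.ylabel("Number of examples")
plt.title("Distribution of CSL scores")
plt.savefig("csl_histogram.png", dpi=200)
```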
Table 1: Correlation and similarity of CSL with memorization scores, compared to other methods, on CIFAR-100 and ImageNet. Curv. on ImageNet RN50 was too expensive to compute within the rebuttal deadline and will be included in the revised version.
| Dataset | Arch. | Subset | Method | Cosine Sim. | Pearson Corr. |
|---|---|---|---|---|---|
| CIFAR-100 | Inception (Feldman & Zhang) | Top 5k | Final Sample Loss | 0.33 | 0.06 |
| | | | Curv. | 0.87 | 0.16 |
| | | | Loss Sensitivity | 0.97 | 0.39 |
| | | | Forget Freq. | 0.96 | 0.29 |
| | | | CSL | 0.93 | 0.40 |
| | | All | Final Sample Loss | 0.24 | 0.17 |
| | | | Curv. | 0.69 | 0.49 |
| | | | Loss Sensitivity | 0.81 | 0.76 |
| | | | Forget Freq. | 0.76 | 0.59 |
| | | | CSL | 0.87 | 0.79 |
| ImageNet | RN18 | Top 50k | Final Sample Loss | 0.70 | 0.09 |
| | | | Curv. | 0.87 | 0.07 |
| | | | Loss Sensitivity | 0.96 | 0.14 |
| | | | CSL | 0.92 | 0.15 |
| | | All | Final Sample Loss | 0.64 | 0.46 |
| | | | Curv. | 0.72 | 0.49 |
| | | | Loss Sensitivity | 0.71 | 0.46 |
| | | | CSL | 0.71 | 0.56 |
| | RN50 (Feldman & Zhang) | Top 50k | Final Sample Loss | 0.78 | 0.12 |
| | | | Curv. | - | - |
| | | | Loss Sensitivity | 0.79 | 0.04 |
| | | | Forget Freq. | 0.68 | 0.15 |
| | | | CSL | 0.94 | 0.21 |
| | | All | Final Sample Loss | 0.64 | 0.50 |
| | | | Curv. | - | - |
| | | | Loss Sensitivity | 0.49 | 0.17 |
| | | | Forget Freq. | 0.49 | 0.04 |
| | | | CSL | 0.79 | 0.64 |
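For clarity on how the two columns of Table 1 are computed, the sketch below returns the cosine similarity and Pearson correlation between a per-sample score (e.g., CSL) and precomputed memorization scores. We assume here that the "Top 5k"/"Top 50k" subsets are the examples with the highest memorization score; the helper name is illustrative:

```python
import numpy as np
from scipy.stats import pearsonr

def agreement_with_memorization(scores, mem_scores, top_k=None):
    """Cosine similarity and Pearson correlation between a per-sample
    score and memorization scores (Feldman & Zhang).

    If top_k is given, the comparison is restricted to the top_k examples
    by memorization score (our reading of the 'Top 5k'/'Top 50k' rows).
    """
    scores = np.asarray(scores, dtype=np.float64)
    mem_scores = np.asarray(mem_scores, dtype=np.float64)
    if top_k is not None:
        idx = np.argsort(mem_scores)[-top_k:]
        scores, mem_scores = scores[idx], mem_scores[idx]
    cos = scores @ mem_scores / (np.linalg.norm(scores) * np.linalg.norm(mem_scores))
    corr, _ = pearsonr(scores, mem_scores)
    return cos, corr
```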
Table 2: Mislabeled-example detection performance of the proposed framework compared with existing methods on CIFAR-10 and CIFAR-100 under varying levels of label noise.
| Dataset | Method | 15% Noise | 20% Noise | 25% Noise | 30% Noise |
|---|---|---|---|---|---|
| CIFAR-10 | Thr. Learning Time (LT) | 0.4969 ± 0.0041 | 0.4981 ± 0.0004 | 0.4977 ± 0.0020 | 0.4988 ± 0.0028 |
| | In Conf. [Carlini et al.] | 0.6224 ± 0.0130 | 0.5978 ± 0.0131 | 0.5800 ± 0.0051 | 0.5669 ± 0.0106 |
| | CL [Northcutt et al.] | 0.7345 ± 0.1672 | 0.7169 ± 0.1539 | 0.6960 ± 0.1387 | 0.6794 ± 0.1264 |
| | SSFT [Maini et al.] | 0.9233 ± 0.0029 | 0.9077 ± 0.0023 | 0.8910 ± 0.0050 | 0.8710 ± 0.0071 |
| | Loss Curvature [Garg et al.] | 0.9827 ± 0.0019 | 0.9834 ± 0.0019 | 0.9849 ± 0.0014 | 0.9834 ± 0.0019 |
| | CSL (Ours) | 0.9867 ± 0.0007 | 0.9870 ± 0.0003 | 0.9869 ± 0.0007 | 0.9866 ± 0.0009 |
| CIFAR-100 | Thr. Learning Time (LT) | 0.5144 ± 0.0017 | 0.5119 ± 0.0059 | 0.5069 ± 0.0050 | 0.5041 ± 0.0002 |
| | In Conf. [Carlini et al.] | 0.6706 ± 0.0052 | 0.6493 ± 0.0075 | 0.6324 ± 0.0051 | 0.6257 ± 0.0044 |
| | CL [Northcutt et al.] | 0.7233 ± 0.1707 | 0.7030 ± 0.1565 | 0.6833 ± 0.1427 | 0.6662 ± 0.1289 |
| | SSFT [Maini et al.] | 0.8495 ± 0.0002 | 0.8358 ± 0.0008 | 0.8203 ± 0.0016 | 0.8043 ± 0.0061 |
| | Loss Curvature [Garg et al.] | 0.9886 ± 0.0009 | 0.9887 ± 0.0013 | 0.9885 ± 0.0016 | 0.9888 ± 0.0004 |
| | CSL (Ours) | 0.9898 ± 0.0003 | 0.9897 ± 0.0003 | 0.9899 ± 0.0003 | 0.9899 ± 0.0002 |
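Table 2 does not restate the metric; the values are consistent with AUROC, so the sketch below, which is an assumption on our part rather than the exact evaluation code, ranks examples by CSL and scores detection with scikit-learn:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def mislabeled_detection_score(csl_scores, is_mislabeled):
    """AUROC of ranking examples by CSL for mislabeled-example detection.

    csl_scores    : (N,) per-sample CSL scores (higher = harder to learn,
                    hence more likely mislabeled).
    is_mislabeled : (N,) boolean array, True where the label was flipped.
    Note: AUROC is assumed here; Table 2 does not state the metric.
    """
    return roc_auc_score(np.asarray(is_mislabeled, dtype=int), csl_scores)

# Hypothetical usage: inject 20% symmetric label noise, train, accumulate
# per-sample losses into csl_scores, then evaluate:
# noise_mask = rng.random(len(train_set)) < 0.20
# print(mislabeled_detection_score(csl_scores, noise_mask))
```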
Fig 7: \( L_2 \) adversarial distance vs. memorization score (Feldman & Zhang, 2020) and CSL on CIFAR-100. Both memorization and CSL increase for less robust samples (i.e., lower adversarial distance \( \|x - x_p\|_2 \)); thus CSL captures properties similar to memorization.
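A sketch of the quantity on the x-axis of Fig. 7 and one way to quantify the trend, assuming adversarial examples \( x_p \) have already been generated by an \( L_2 \) attack of choice (the attack itself is not shown, and rank correlation is an illustrative summary; the figure reports the raw relationship):

```python
import numpy as np
from scipy.stats import spearmanr

def adv_distance_vs_csl(x, x_adv, csl_scores):
    """Correlate per-sample L2 adversarial distance with CSL (Fig. 7).

    x, x_adv   : (N, C, H, W) clean inputs and their adversarial
                 counterparts from a prior L2 attack run.
    csl_scores : (N,) per-sample CSL scores.
    Returns distances ||x - x_p||_2, the Spearman rank correlation, and
    its p-value; a negative correlation matches the trend in Fig. 7.
    """
    d = np.linalg.norm((x_adv - x).reshape(len(x), -1), axis=1)
    rho, p = spearmanr(d, csl_scores)
    return d, rho, p
```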