The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations; e.g., some numerals may be rendered as "XNUMX".
Copyrights notice
Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
Ryota HIGASHIMOTO
Kansai University
Soh YOSHIDA
Kansai University
Takashi HORIHATA
Kansai University
Mitsuji MUNEYASU
Kansai University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ryota HIGASHIMOTO, Soh YOSHIDA, Takashi HORIHATA, Mitsuji MUNEYASU, "Unbiased Pseudo-Labeling for Learning with Noisy Labels" in IEICE TRANSACTIONS on Information,
vol. E107-D, no. 1, pp. 44-48, January 2024, doi: 10.1587/transinf.2023MUL0002.
Abstract: Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2023MUL0002/_p
@ARTICLE{e107-d_1_44,
author={Ryota HIGASHIMOTO and Soh YOSHIDA and Takashi HORIHATA and Mitsuji MUNEYASU},
journal={IEICE TRANSACTIONS on Information},
title={Unbiased Pseudo-Labeling for Learning with Noisy Labels},
year={2024},
volume={E107-D},
number={1},
pages={44-48},
abstract={Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.},
doi={10.1587/transinf.2023MUL0002},
ISSN={1745-1361},
month={January},}
TY - JOUR
TI - Unbiased Pseudo-Labeling for Learning with Noisy Labels
T2 - IEICE TRANSACTIONS on Information
SP - 44
EP - 48
AU - Ryota HIGASHIMOTO
AU - Soh YOSHIDA
AU - Takashi HORIHATA
AU - Mitsuji MUNEYASU
PY - 2024
DO - 10.1587/transinf.2023MUL0002
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E107-D
IS - 1
JA - IEICE TRANSACTIONS on Information
Y1 - January 2024
AB - Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
ER -