The mean, median, and mode are usually calculated from univariate observations as the most basic representative values of a random variable. To measure the spread of the distribution, the standard deviation, interquartile range, and modal interval are also calculated. When we analyze continuous relations between a pair of random variables from bivariate observations, regression analysis is often used. By minimizing appropriate costs that evaluate regression errors, we estimate the conditional mean, median, and mode. The conditional standard deviation can be estimated if the bivariate observations are obtained from a Gaussian process. Moreover, the conditional interquartile range can be calculated for various distributions by quantile regression, which estimates any conditional quantile (percentile). Meanwhile, the study of modal interval regression is relatively new, and spline regression models, known as flexible models with optimal smoothness for bivariate data, have not yet been used. In this paper, we propose a modal interval regression method based on spline quantile regression. The proposed method consists of two steps. In the first step, we divide the bivariate observations into bins for one random variable, then detect the modal interval for the other random variable as the lower and upper quantiles in each bin. In the second step, we estimate the conditional modal interval by constructing both lower and upper quantile curves as spline functions. By using spline quantile regression, the proposed method is widely applicable to various distributions and is formulated as a convex optimization problem on the coefficient vectors of the lower and upper spline functions. Extensive experiments, including settings of the bin width, the smoothing parameter, and the weights in the cost function, show the effectiveness of the proposed modal interval regression in terms of accuracy and visual shape for synthetic data generated from various distributions. Experiments on real-world meteorological data also demonstrate good performance of the proposed method.
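The first step described in the abstract (binning one variable, then detecting a modal interval of the other variable in each bin) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the modal interval is the shortest interval containing a chosen fraction of the samples in a bin, which the abstract does not state. The function names, the `coverage` and `min_count` parameters, and the way endpoints are converted to quantile levels are all hypothetical. The `pinball_loss` function is the standard check loss of quantile regression, shown only to indicate the kind of convex cost a spline quantile fit would minimize in the second step; the abstract does not give the exact cost used in the paper.

```python
import numpy as np


def modal_interval(y, coverage=0.5):
    """Shortest interval holding at least `coverage` of the samples (assumed definition).

    Returns (low, high, q_low, q_high); q_low and q_high are the empirical
    quantile levels of the two endpoints within this sample.
    """
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    k = max(int(np.ceil(coverage * n)), 1)  # samples the interval must contain
    widths = y[k - 1:] - y[:n - k + 1]      # width of each window of k sorted samples
    i = int(np.argmin(widths))              # leftmost shortest window
    return y[i], y[i + k - 1], i / n, (i + k) / n


def binned_modal_intervals(x, y, bin_width, coverage=0.5, min_count=5):
    """Apply modal_interval to y inside equal-width bins of x.

    Each output row is (bin_center, low, high, q_low, q_high).
    Bins with fewer than `min_count` samples are skipped (hypothetical rule).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = x.min()
    idx = np.floor((x - x0) / bin_width).astype(int)
    rows = []
    for b in np.unique(idx):
        yb = y[idx == b]
        if yb.size < min_count:
            continue
        center = x0 + (b + 0.5) * bin_width
        rows.append((center, *modal_interval(yb, coverage)))
    return np.array(rows)


def pinball_loss(residual, tau):
    """Mean check (pinball) loss of residuals y - f(x) at quantile level tau."""
    r = np.asarray(residual, dtype=float)
    return float(np.mean(np.where(r >= 0, tau * r, (tau - 1.0) * r)))
```

In this sketch, the per-bin endpoints and quantile levels would be the inputs to the second step, where lower and upper spline curves are fitted; that fit is not shown here because the abstract does not specify the spline basis, the smoothing term, or the weights.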
Sai YAO
Ritsumeikan University
Daichi KITAHARA
Osaka University
Hiroki KURODA
Ritsumeikan University
Akira HIRABAYASHI
Ritsumeikan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Sai YAO, Daichi KITAHARA, Hiroki KURODA, Akira HIRABAYASHI, "Modal Interval Regression Based on Spline Quantile Regression" in IEICE TRANSACTIONS on Fundamentals,
vol. E106-A, no. 2, pp. 106-123, February 2023, doi: 10.1587/transfun.2022EAP1031.
Abstract: The mean, median, and mode are usually calculated from univariate observations as the most basic representative values of a random variable. To measure the spread of the distribution, the standard deviation, interquartile range, and modal interval are also calculated. When we analyze continuous relations between a pair of random variables from bivariate observations, regression analysis is often used. By minimizing appropriate costs evaluating regression errors, we estimate the conditional mean, median, and mode. The conditional standard deviation can be estimated if the bivariate observations are obtained from a Gaussian process. Moreover, the conditional interquartile range can be calculated for various distributions by the quantile regression that estimates any conditional quantile (percentile). Meanwhile, the study of the modal interval regression is relatively new, and spline regression models, known as flexible models having the optimality on the smoothness for bivariate data, are not yet used. In this paper, we propose a modal interval regression method based on spline quantile regression. The proposed method consists of two steps. In the first step, we divide the bivariate observations into bins for one random variable, then detect the modal interval for the other random variable as the lower and upper quantiles in each bin. In the second step, we estimate the conditional modal interval by constructing both lower and upper quantile curves as spline functions. By using the spline quantile regression, the proposed method is widely applicable to various distributions and formulated as a convex optimization problem on the coefficient vectors of the lower and upper spline functions. 
Extensive experiments, including settings of the bin width, the smoothing parameter and weights in the cost function, show the effectiveness of the proposed modal interval regression in terms of accuracy and visual shape for synthetic data generated from various distributions. Experiments for real-world meteorological data also demonstrate a good performance of the proposed method.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2022EAP1031/_p
@ARTICLE{e106-a_2_106,
author={Sai YAO and Daichi KITAHARA and Hiroki KURODA and Akira HIRABAYASHI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Modal Interval Regression Based on Spline Quantile Regression},
year={2023},
volume={E106-A},
number={2},
pages={106-123},
abstract={The mean, median, and mode are usually calculated from univariate observations as the most basic representative values of a random variable. To measure the spread of the distribution, the standard deviation, interquartile range, and modal interval are also calculated. When we analyze continuous relations between a pair of random variables from bivariate observations, regression analysis is often used. By minimizing appropriate costs evaluating regression errors, we estimate the conditional mean, median, and mode. The conditional standard deviation can be estimated if the bivariate observations are obtained from a Gaussian process. Moreover, the conditional interquartile range can be calculated for various distributions by the quantile regression that estimates any conditional quantile (percentile). Meanwhile, the study of the modal interval regression is relatively new, and spline regression models, known as flexible models having the optimality on the smoothness for bivariate data, are not yet used. In this paper, we propose a modal interval regression method based on spline quantile regression. The proposed method consists of two steps. In the first step, we divide the bivariate observations into bins for one random variable, then detect the modal interval for the other random variable as the lower and upper quantiles in each bin. In the second step, we estimate the conditional modal interval by constructing both lower and upper quantile curves as spline functions. By using the spline quantile regression, the proposed method is widely applicable to various distributions and formulated as a convex optimization problem on the coefficient vectors of the lower and upper spline functions. 
Extensive experiments, including settings of the bin width, the smoothing parameter and weights in the cost function, show the effectiveness of the proposed modal interval regression in terms of accuracy and visual shape for synthetic data generated from various distributions. Experiments for real-world meteorological data also demonstrate a good performance of the proposed method.},
keywords={},
doi={10.1587/transfun.2022EAP1031},
ISSN={1745-1337},
month={February},}
TY - JOUR
TI - Modal Interval Regression Based on Spline Quantile Regression
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 106
EP - 123
AU - Sai YAO
AU - Daichi KITAHARA
AU - Hiroki KURODA
AU - Akira HIRABAYASHI
PY - 2023
DO - 10.1587/transfun.2022EAP1031
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E106-A
IS - 2
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - February 2023
AB - The mean, median, and mode are usually calculated from univariate observations as the most basic representative values of a random variable. To measure the spread of the distribution, the standard deviation, interquartile range, and modal interval are also calculated. When we analyze continuous relations between a pair of random variables from bivariate observations, regression analysis is often used. By minimizing appropriate costs evaluating regression errors, we estimate the conditional mean, median, and mode. The conditional standard deviation can be estimated if the bivariate observations are obtained from a Gaussian process. Moreover, the conditional interquartile range can be calculated for various distributions by the quantile regression that estimates any conditional quantile (percentile). Meanwhile, the study of the modal interval regression is relatively new, and spline regression models, known as flexible models having the optimality on the smoothness for bivariate data, are not yet used. In this paper, we propose a modal interval regression method based on spline quantile regression. The proposed method consists of two steps. In the first step, we divide the bivariate observations into bins for one random variable, then detect the modal interval for the other random variable as the lower and upper quantiles in each bin. In the second step, we estimate the conditional modal interval by constructing both lower and upper quantile curves as spline functions. By using the spline quantile regression, the proposed method is widely applicable to various distributions and formulated as a convex optimization problem on the coefficient vectors of the lower and upper spline functions. Extensive experiments, including settings of the bin width, the smoothing parameter and weights in the cost function, show the effectiveness of the proposed modal interval regression in terms of accuracy and visual shape for synthetic data generated from various distributions. Experiments for real-world meteorological data also demonstrate a good performance of the proposed method.
ER -