Use of biplot technique for the comparison of the missing value imputation methods


Alkan B. B., Alkan N., Atakan C., Terzi Y.

Int. J. Data Analysis Techniques and Strategies, vol. 7, no. 3, pp. 217-230, 2015 (Scopus)

Abstract

This study assesses the effects of different imputation methods on the
performance of the biplot technique. We selected Fisher's iris data as our
reference dataset. Elements of the iris data were deleted at different rates
under the missing at random (MAR) assumption to generate incomplete datasets
with 3.5%, 7%, 15%, and 20% missing values. The datasets with missing values
were completed by four imputation methods: mean imputation, regression
imputation, the expectation maximisation (EM) algorithm, and multiple
imputation (MI). The imputed datasets were analysed by the biplot technique,
and the results were compared with the biplot of the original complete data.
Under the MAR assumption, the biplot results were similar across all
imputation methods when the missing rate was low. Even when the missing rate
was greater than 10%, the results of the EM and MI methods remained similar to
the real values and to the graphical representation of the original data. For
multivariate methods, we also propose filling in each missing value with the
arithmetic mean of the imputed estimates obtained by multiple imputation. This
paper also indicates that the biplot technique provides a useful visual tool
for comparing missing value imputation methods.
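
The workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions made here for illustration: synthetic data stands in for the iris measurements, entries are deleted uniformly at random (simpler than a true MAR mechanism), and draws from a normal distribution fitted to each column's observed values stand in for a proper multiple imputation procedure. All function names are hypothetical, and the discrepancy measure at the end is one possible way to compare biplots, not the criterion used in the paper.

```python
import numpy as np


def delete_at_random(X, rate, rng):
    """Return a float copy of X with roughly `rate` of its entries set to NaN.

    Assumption: uniform random deletion, a simplification of MAR.
    """
    X_miss = np.array(X, dtype=float)
    X_miss[rng.random(X_miss.shape) < rate] = np.nan
    return X_miss


def mean_impute(X_miss):
    """Replace each missing entry with the mean of the observed values in its column."""
    col_means = np.nanmean(X_miss, axis=0)
    return np.where(np.isnan(X_miss), col_means, X_miss)


def averaged_stochastic_impute(X_miss, m, rng):
    """Fill each missing entry with the arithmetic mean of m imputed estimates.

    Assumption: each estimate is a draw from a normal distribution with the
    column's observed mean and standard deviation, a crude stand-in for MI.
    """
    mu = np.nanmean(X_miss, axis=0)
    sd = np.nanstd(X_miss, axis=0)
    draws = mu + sd * rng.standard_normal((m,) + X_miss.shape)
    averaged = draws.mean(axis=0)
    return np.where(np.isnan(X_miss), averaged, X_miss)


def biplot_coordinates(X, k=2):
    """Return row and column markers of a rank-k biplot of column-centred X."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    row_markers = U[:, :k] * s[:k]   # observations
    col_markers = Vt[:k].T           # variables
    return row_markers, col_markers


def rank_k_discrepancy(X_ref, X_imp, k=2):
    """Frobenius distance between the rank-k biplot approximations of two datasets.

    The product of row and column markers is unaffected by SVD sign flips.
    """
    G_ref, H_ref = biplot_coordinates(X_ref, k)
    G_imp, H_imp = biplot_coordinates(X_imp, k)
    return float(np.linalg.norm(G_ref @ H_ref.T - G_imp @ H_imp.T))


rng = np.random.default_rng(0)
# Synthetic 150 x 4 data with correlated columns (stand-in for the iris data).
X_full = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))

X_miss = delete_at_random(X_full, rate=0.15, rng=rng)
X_mean = mean_impute(X_miss)
X_avg = averaged_stochastic_impute(X_miss, m=5, rng=rng)

for name, X_imp in [("mean", X_mean), ("averaged draws", X_avg)]:
    print(name, rank_k_discrepancy(X_full, X_imp))
```

Plotting the first two columns of the row markers as points and the column markers as arrows gives the usual biplot display; repeating the comparison at each missing rate would mirror the design described above.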