LI Honglie, XIA Dong, WANG Qian. A Sampled Data Cleaning Technology Based on Regression Model[J]. Electronics Optics & Control, 2022, 29(4): 117
Data cleaning is an important part of data preprocessing, but current data cleaning techniques suffer from problems such as missed outliers and outlier influence. A dynamic, fine-grained outlier identification algorithm based on a regression model is proposed, in which the regression values of the two data segments before and after the current position, computed after potential outliers have been eliminated, are taken as reference values; these are used together with limits on the parameter's rate of change to judge whether a point is an outlier. A data cleaning procedure based on the regression model is also given, in which the steps of coarse identification, fine identification, and regression estimation are adopted to improve the efficiency and effectiveness of data cleaning. A set of real sampled aeronautical data is used to verify the proposed method, and the processing results show that the data cleaning technique based on the regression model is able to identify and estimate outliers accurately.
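The abstract does not give the algorithm's details, so the following is only a minimal Python sketch of the three steps it names (coarse identification, fine identification, regression estimation), not the authors' implementation. The function names, the straight-line least-squares fit, the window length `window`, the rate limit `max_rate`, and the deviation tolerance `tol` are all assumptions introduced for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def coarse_candidates(t, y, max_rate):
    """Coarse step (assumed form): flag both ends of any jump whose
    rate of change exceeds max_rate. These are only potential outliers."""
    flagged = set()
    for i in range(1, len(y)):
        if abs(y[i] - y[i - 1]) > max_rate * (t[i] - t[i - 1]):
            flagged.update((i - 1, i))
    return flagged


def segment_prediction(t, y, idx, i):
    """Regress on the samples listed in idx and predict the value at t[i].
    Returns None when fewer than two usable samples remain."""
    if len(idx) < 2:
        return None
    a, b = fit_line([t[j] for j in idx], [y[j] for j in idx])
    return a + b * t[i]


def clean(t, y, max_rate, window=4, tol=1.0):
    """Return (outlier_indices, cleaned_values).

    Fine step (assumed form): for each candidate, fit one line to the
    segment before it and one to the segment after it, skipping other
    candidates, and use both predictions as reference values. The point
    is judged an outlier only if it deviates from every available
    reference by more than tol.
    Estimation step (assumed form): replace a confirmed outlier with
    the mean of the available reference values.
    """
    candidates = coarse_candidates(t, y, max_rate)
    cleaned = list(y)
    outliers = []
    for i in sorted(candidates):
        before = [j for j in range(max(0, i - window), i)
                  if j not in candidates]
        after = [j for j in range(i + 1, min(len(y), i + 1 + window))
                 if j not in candidates]
        refs = [p for p in (segment_prediction(t, y, before, i),
                            segment_prediction(t, y, after, i))
                if p is not None]
        if not refs:
            continue  # not enough clean neighbours to judge this point
        if all(abs(y[i] - r) > tol for r in refs):
            outliers.append(i)
            cleaned[i] = sum(refs) / len(refs)
    return outliers, cleaned


# Synthetic example (not the paper's aeronautical data): a linear ramp
# with a single spike injected at index 6.
t = list(range(12))
y = [2.0 * v for v in t]
y[6] = 40.0
outliers, cleaned = clean(t, y, max_rate=5.0)
```

In this sketch the coarse step also flags the two neighbours of the spike, because the jump into and out of it exceeds the rate limit; the fine step then clears them, since they agree with the regression values of the surrounding segments. Excluding candidates from the fitted segments is what keeps the spike from distorting its own reference values.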