Thursday August 8, 2024 12:15pm - 2:15pm IST
Authors - R. Usharani, P. Saranya, T. Johnpeter, D. Anandhababu
Abstract - Multicollinearity (MC) occurs when several predictor variables in a multiple linear regression analysis are substantially correlated not only with the response variable but also with each other. Because of MC, several of the relevant variables under investigation can appear statistically insignificant. Overfitting, a separate problem, occurs when the model fits the training set too closely and captures noise instead of the underlying patterns. This research examines three basic strategies for handling MC in a Thyroid Disease dataset: the ridge estimator (RE) is compared with Principal Component Analysis (PCA) and with the least absolute shrinkage and selection operator (Lasso), and the results are summarized. PCA represents the data through principal components instead of performing explicit Feature Selection (FS). Since these components are linear combinations of the original features, it can be difficult to see how each feature contributes. PCA selects a small number of factors, with the help of information gain, at an acceptable level of P values of 0.4 and below. The RE automatically chooses factors with P values of 0.5 and lower. The Lasso automatically chooses factors with more acceptable P values, such as 0.03, 0.02, and 0.007. RE and Lasso chose a log lambda of 0.4 and 0.3, respectively, to resolve MC. The resolution of MC for the RE and PCA is 0.49 and 0.51, respectively. The RE retains all of the model's features while penalizing the coefficients to keep them from growing excessively. The Lasso drives some coefficients exactly to zero and thereby carries out automatic FS. It yields a limited set of selected features with nonzero coefficients, making it possible to build a more interpretable model that concentrates on the most significant predictors. Ridge regression (RR) can be used to identify the model's more salient features, but a preferable alternative is to use the Lasso to select a minimal set of salient features. MC resolution and FS can be handled concurrently by the RE and the Lasso, whereas PCA was not the best choice for FS in parallel; it was only better at resolving MC.
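The contrast the abstract draws between ridge-style shrinkage and Lasso selection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic (not the Thyroid Disease dataset), the penalty value `lam=0.5` is arbitrary and unrelated to the log lambda values reported above, and the hypothetical helper `fit_penalized` uses a plain coordinate-descent solver, which may differ from the authors' implementation.

```python
import random


def standardize(values):
    """Center a column and scale it to unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_penalized(columns, y, lam, l1_ratio, n_iter=500):
    """Coordinate descent (illustrative helper, not the paper's code) for
    (1/2n)*||y - Xb||^2 + lam*(l1_ratio*|b|_1 + (1 - l1_ratio)/2*||b||^2).
    l1_ratio=0.0 gives a ridge fit, l1_ratio=1.0 gives a Lasso fit."""
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of column j with the residual that excludes its own term.
            rho = sum(x * (r + x * coefs[j]) for x, r in zip(col, residual)) / n
            z = sum(x * x for x in col) / n
            new = soft_threshold(rho, lam * l1_ratio) / (z + lam * (1.0 - l1_ratio))
            delta = new - coefs[j]
            residual = [r - x * delta for x, r in zip(col, residual)]
            coefs[j] = new
    return coefs


# Synthetic data (assumed): x2 nearly duplicates x1, x3 is pure noise.
rng = random.Random(0)
n_samples = 200
x1 = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
x2 = [a + 0.3 * rng.gauss(0.0, 1.0) for a in x1]  # strongly correlated with x1
x3 = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]  # unrelated to the response
y_raw = [3.0 * a + rng.gauss(0.0, 1.0) for a in x1]
y_mean = sum(y_raw) / n_samples
y = [v - y_mean for v in y_raw]

X = [standardize(c) for c in (x1, x2, x3)]
ridge_coefs = fit_penalized(X, y, lam=0.5, l1_ratio=0.0)  # keeps every predictor
lasso_coefs = fit_penalized(X, y, lam=0.5, l1_ratio=1.0)  # can zero predictors out
print("ridge:", [round(c, 3) for c in ridge_coefs])
print("lasso:", [round(c, 3) for c in lasso_coefs])
```

In this toy setting the ridge fit keeps a nonzero coefficient on every predictor, while the Lasso fit sets the coefficient of the unrelated predictor to exactly zero, which is the behavior the abstract describes as automatic FS.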
Virtual Room B Goa, India