Why standardization is good




















The z-score is one of the most popular methods to standardize data: for each feature, subtract the feature's mean from every value and divide by the feature's standard deviation, that is, z = (x - mean) / standard deviation. Once the standardization is done, every feature has a mean of zero and a standard deviation of one, so all features share the same scale. For distance-based models, standardization is performed to prevent features with wider ranges from dominating the distance metric.
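As a minimal sketch of the z-score calculation described above, the following uses only the Python standard library. The function name `zscore` and the sample values are made up for illustration, and the population standard deviation is an assumption (the sample standard deviation is also commonly used).

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize a list of numbers: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("a constant feature cannot be standardized")
    return [(v - mu) / sigma for v in values]

# Hypothetical features on very different scales.
incomes = [30_000, 45_000, 60_000, 75_000, 90_000]
ages = [25, 35, 45, 55, 65]

z_incomes = zscore(incomes)
z_ages = zscore(ages)

# After standardization both features have mean 0 and standard deviation 1.
print(round(mean(z_incomes), 6), round(pstdev(z_incomes), 6))
print(round(mean(z_ages), 6), round(pstdev(z_ages), 6))
```

Because both sample features are evenly spaced, their standardized values come out identical even though the raw values differ by three orders of magnitude.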

The reason we standardize data, however, differs from one machine learning model to another. Clustering models are distance-based algorithms: to measure similarities between observations and form clusters, they use a distance metric. Features with wide ranges therefore have a bigger influence on the clustering, so standardization is required before building a clustering model.
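To illustrate how a wide-range feature can dominate a distance metric, here is a small sketch with hypothetical (age, income) observations. The helper names `euclidean` and `standardize_columns` and all of the data are assumptions made for this example, not part of any particular clustering library.

```python
import math
from statistics import mean, pstdev

def euclidean(a, b):
    """Euclidean distance between two equal-length observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def standardize_columns(rows):
    """Z-score every column (feature) of a list of observations."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [tuple((v - mu) / sd for v, (mu, sd) in zip(row, stats)) for row in rows]

# Hypothetical observations: (age in years, income in dollars).
people = [(25, 50_000), (65, 51_000), (27, 56_000), (45, 90_000), (50, 30_000)]
a, b, c = people[0], people[1], people[2]

# Raw distances: income differences swamp the 40-year age gap between a and b.
raw_ab = euclidean(a, b)
raw_ac = euclidean(a, c)

# Standardized distances: both features now contribute on the same scale.
scaled = standardize_columns(people)
sa, sb, sc = scaled[0], scaled[1], scaled[2]
std_ab = euclidean(sa, sb)
std_ac = euclidean(sa, sc)

print("raw:          a-b < a-c ?", raw_ab < raw_ac)
print("standardized: a-b < a-c ?", std_ab < std_ac)
```

On the raw data, person a looks closest to person b, who is 40 years older but earns nearly the same. After standardization, a is closest to person c, who is similar in both age and income.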

Standardization makes all variables contribute equally to the similarity measures. A Support Vector Machine tries to maximize the distance between the separating plane and the support vectors. If one feature has very large values, it will dominate the other features when calculating the distance.

Standardization therefore gives all features the same influence on the distance metric.

You can measure variable importance in regression analysis by fitting a regression model on the standardized independent variables and comparing the absolute values of their standardized coefficients. If the independent variables are not standardized, comparing their coefficients is meaningless, because each coefficient depends on the units of its variable.
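The effect on regression coefficients can be sketched without fitting anything. If a predictor x has standard deviation s and raw coefficient b, then rewriting x as mean + s * z shows that the coefficient on the standardized predictor z is b * s. The raw coefficients and data below are hypothetical, chosen only to show how the ranking can change.

```python
from statistics import pstdev

# Hypothetical predictors on very different scales.
ages = [25, 65, 27, 45, 50]
incomes = [50_000, 51_000, 56_000, 90_000, 30_000]

# Hypothetical raw coefficients from a model fitted on the unscaled data.
raw_coef = {"age": 0.5, "income": 0.001}

# Standardizing a predictor multiplies its coefficient by the predictor's
# standard deviation, because b * x = b * (mean + sd * z) = (b * sd) * z + constant.
scale = {"age": pstdev(ages), "income": pstdev(incomes)}
standardized_coef = {name: raw_coef[name] * scale[name] for name in raw_coef}

# Rank predictors by the absolute value of the standardized coefficient.
ranking = sorted(standardized_coef, key=lambda n: abs(standardized_coef[n]), reverse=True)
print(ranking)
```

Judged by raw coefficients, age (0.5) appears far more important than income (0.001). On the standardized scale the order reverses, because income varies over a much wider range.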

LASSO and Ridge regressions place a penalty on the magnitude of the coefficients associated with each variable. The scale of a variable therefore affects how much penalty is applied to its coefficient.
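The scale dependence of the penalty can be seen in a one-feature Ridge regression without an intercept, where the closed-form solution is sum(x*y) / (sum(x*x) + lambda). The sketch below uses made-up data and a hypothetical helper `ridge_1d`, and compares how strongly the same penalty shrinks the coefficient when the same feature is expressed in kilometres versus metres.

```python
def ridge_1d(x, y, lam):
    """Closed-form Ridge coefficient for one centered feature, no intercept."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)

def shrinkage(x, y, lam):
    """Ratio of the penalized coefficient to the unpenalized (lam = 0) one."""
    return ridge_1d(x, y, lam) / ridge_1d(x, y, 0.0)

# Hypothetical centered feature in kilometres, and the same feature in metres.
x_km = [-2.0, -1.0, 0.0, 1.0, 2.0]
x_m = [v * 1000.0 for v in x_km]
y = [3.0 * v for v in x_km]

lam = 10.0
shrink_km = shrinkage(x_km, y, lam)  # 0.5: the coefficient is halved
shrink_m = shrinkage(x_m, y, lam)    # very close to 1: almost no shrinkage

print(shrink_km, shrink_m)
```

The data and the penalty are identical in both cases; only the unit changed. The large-scale version needs a tiny coefficient to fit the data, so the penalty barely touches it.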

Coefficients of variables with large variance tend to be small and are therefore less penalized, so standardization is required before fitting either regression. Logistic regression (when fitted without a penalty) and tree-based algorithms such as decision trees, random forests, and gradient boosting are not sensitive to the magnitude of variables.

Standards don't preclude extensions: The fact that my product adheres to a standard doesn't prevent me from adding extra features that are unique to my product.

Often, these extras deal with cutting-edge technology, as I explained in item 1. Other times, however, they are just something the designers thought was a better way to do the same thing. Opinions vary among programmers (a fact I'm sure isn't lost on ZDNet readers or Talkback participants), which is why standards rarely are a trump card in battles over the "right" way to do something. It's not just proprietary companies seeking advantage who do this.

This isn't something to condemn. A standard ensures interoperability, but it is not supposed to put a muzzle on creativity and prevent programmers from building a better mousetrap.

Programmers often find these innovative features interesting. To generate economies of scale around these features, the decision is often made to standardize them. Standards are an essential part of modern software ecosystems. In an ideal world, everything could and would be standardized. We do not live in that ideal world, and thus standards rarely manage to account for more than a small fraction of the computing surface area.

Standardization is often the only way to achieve the goals of standards committees, which is why standardization is so common in the computing industry.
