Fact Based List:

Vincent Granville: The 8 Worst Predictive Modeling Techniques

Submitted by Anonymous on Mon, 10/01/2012 - 15:13


  1. Linear regression. Relies on normality, homoscedasticity, and other assumptions; does not capture highly non-linear, chaotic patterns. Prone to over-fitting. Parameters difficult to interpret.
  2. Traditional decision trees. Very large decision trees are unstable, impossible to interpret, and prone to over-fitting.
  3. Linear discriminant analysis. Used for supervised clustering (classification). Bad technique because it assumes that clusters do not overlap and are well separated by hyperplanes.
  4. K-means clustering. Used for clustering. Tends to produce circular (spherical) clusters. Does not work well with data points that are not a mixture of Gaussian distributions.
  5. Neural networks. Difficult to interpret, unstable, subject to over-fitting.
  6. Maximum likelihood estimation. Requires data to fit a prespecified probability distribution. Not data-driven. In many cases the prespecified Gaussian distribution is a terrible fit for the data.
  7. Density estimation in high dimensions. Subject to what is referred to as the curse of dimensionality.
  8. Naive Bayes. Used e.g. in fraud and spam detection, and for scoring. Assumes that variables are independent; if they are not, it will fail miserably.
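
The independence assumption in item 8 can be illustrated with a small sketch. The probabilities below (a 0.5 prior, and a word that appears in 80% of spam and 40% of non-spam) are made-up values for illustration, and `naive_bayes_posterior` is a hypothetical helper, not part of any library. It shows what happens when one piece of evidence is counted several times as if each copy were an independent variable.

```python
def naive_bayes_posterior(prior_spam, p_word_given_spam, p_word_given_ham, copies):
    """Posterior P(spam | evidence) when one observed feature is counted `copies` times.

    Naive Bayes multiplies per-feature likelihoods as if the features were
    independent, so a perfectly duplicated feature contributes its likelihood
    once per copy.
    """
    spam_score = prior_spam * p_word_given_spam ** copies
    ham_score = (1.0 - prior_spam) * p_word_given_ham ** copies
    return spam_score / (spam_score + ham_score)


# Illustrative (assumed) numbers: the true posterior is the copies=1 value,
# because the extra copies carry no new information.
for copies in (1, 2, 5, 10):
    print(copies, round(naive_bayes_posterior(0.5, 0.8, 0.4, copies), 4))
```

With one copy the posterior is 2/3. With five perfectly correlated copies the naive product pushes it to 32/33, even though the evidence has not changed.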

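The curse of dimensionality in item 7 can be made concrete with a short calculation. Assuming data spread uniformly over a unit hypercube (an assumption made here only for illustration), a sub-cube that captures a given fraction of the points must have edge length equal to that fraction raised to the power 1/d. The function name `edge_length_for_fraction` is hypothetical.

```python
def edge_length_for_fraction(fraction, dim):
    """Edge length of a sub-cube holding `fraction` of uniform data in [0, 1]^dim.

    The sub-cube has volume edge**dim, so edge = fraction ** (1 / dim).
    """
    return fraction ** (1.0 / dim)


# How wide must a "local" neighborhood be to contain 1% of the data?
for dim in (1, 2, 10, 100):
    print(dim, round(edge_length_for_fraction(0.01, dim), 3))
```

In one dimension the neighborhood covers 1% of the axis. In ten dimensions it must span roughly 63% of every axis, so it is no longer local, which is why local density estimates degrade as dimension grows.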

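The claim in item 1 that linear regression misses non-linear patterns can be checked on a toy case. The sketch below fits an ordinary least-squares line to points on the parabola y = x^2; the data and the helper `fit_line` are illustrative assumptions, not taken from the original post.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# A perfectly deterministic but non-linear relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
slope, intercept, r_squared = fit_line(xs, ys)
print(slope, intercept, r_squared)
```

Because the x values are symmetric around zero, the fitted slope is zero and the line explains none of the variance, even though y is an exact function of x.
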
Source: AnalyticBridge
Source URL: http://www.analyticbridge.com/profiles/blogs/the-8-worst-pre...


