Discriminant Analysis

Few businesses today can run without predicting or forecasting some variable or data point. Prediction and forecasting have become core activities in capital and stock markets, medical and laboratory research, the aerospace and mechanical industries, educational services, recruitment and many other corporate functions. Each of these uses prediction and forecasting to a different degree, in different forms and with different requirements.

One important aspect of prediction is discriminating a ‘Good’ sample from a ‘Bad’ sample or, more generally, categorizing a given list of samples into definite groups based on the characteristics that each group displays. A typical example is identifying a suitable candidate from a group of candidates who applied for a job, based on educational background, marks obtained, assessment scores, personality traits, family background and other parameters.

A variety of statistical tools are available, such as Discriminant Analysis and Factor Analysis. Discriminant analysis, as the name suggests, is a method used to categorize samples into two or more groups, given a set of known samples in each group along with their characteristic variables.

The functions of a Discriminant Analysis are:

  1. Categorize each sample into one of the given groups
  2. Build a model/equation to predict each category
  3. Identify the features/parameters that influence the samples to fall in different categories
  4. Identify how distinct and different the categories are from each other
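
To make item 2 concrete, here is a sketch of the form such an equation usually takes in the linear case. The symbols are notation introduced here for illustration and do not come from any particular software package: x_1 ... x_p are the measured parameters of a sample, K is the number of groups, and the b values are the coefficients estimated from the known samples.

```latex
D_k(x) = b_{k0} + b_{k1} x_1 + b_{k2} x_2 + \dots + b_{kp} x_p,
\qquad k = 1, \dots, K
```

Under this form, a new sample is assigned to the group k whose score D_k(x) is largest. The size of each coefficient b_{kj} also gives a first indication of how strongly parameter j influences the categorization (item 3), provided the parameters are measured on comparable scales.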

How does a Discriminant Analysis do all this?

  1. Collect data on the different categories under study
  2. Collect known sample values for the underlying parameters that drive the categorization
  3. Calculate the distance between groups to check whether they are distinct
  4. Calculate the function value (also called the coefficient) for each parameter and develop the equation for each category
  5. Identify and filter only the significant parameters
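
The steps above can be sketched in a few lines of Python for the simplest case: two groups and two parameters. This is a minimal illustration, not how any particular package works internally. The candidate data, the parameter names (assessment score and interview score) and the function names are hypothetical. The sketch assumes equal prior probabilities and a shared covariance matrix for both groups, and it uses a single discriminant equation with a cutoff, which is a common shortcut when there are only two groups. Step 5 (significance testing of parameters) is left out.

```python
# Steps 1-2: hypothetical known samples, (assessment score, interview score).
good = [(78.0, 8.0), (85.0, 7.5), (90.0, 9.0), (82.0, 8.5)]
bad = [(60.0, 5.0), (65.0, 6.5), (58.0, 4.5), (70.0, 6.0)]


def mean_vector(rows):
    """Mean of each of the two parameters within one group."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """2x2 matrix of summed squared deviations and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_two_group_lda(g1, g0):
    """Return (coefficients, cutoff, squared distance between groups)."""
    m1, m0 = mean_vector(g1), mean_vector(g0)
    s1, s0 = scatter(g1, m1), scatter(g0, m0)
    dof = len(g1) + len(g0) - 2
    # Pooled within-group covariance (assumes both groups share it).
    pooled = [[(s1[i][j] + s0[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Step 4: one coefficient per parameter.
    coef = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    # Step 3: squared Mahalanobis distance between the two group means.
    d2 = coef[0] * diff[0] + coef[1] * diff[1]
    # Midpoint of the two group mean scores (equal priors assumed).
    cutoff = 0.5 * sum(coef[i] * (m1[i] + m0[i]) for i in range(2))
    return coef, cutoff, d2


def classify(x, coef, cutoff):
    """Assign a sample to 'good' or 'bad' from its discriminant score."""
    score = coef[0] * x[0] + coef[1] * x[1]
    return "good" if score > cutoff else "bad"


coef, cutoff, d2 = fit_two_group_lda(good, bad)
print("coefficients:", coef)
print("cutoff:", cutoff)
print("squared distance between groups:", d2)
print("new candidate (88, 8.0):", classify((88.0, 8.0), coef, cutoff))
```

A larger squared distance means the two groups are further apart relative to the spread within each group, so the equation can separate them more reliably. With more than two groups or parameters, the same idea applies but the matrix arithmetic is better left to a statistics package.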

One need not worry about how to perform these calculations by hand. Software reports these values, and the main task is to interpret them for real-world problems. There are plenty of applications, such as SPSS, Minitab and Excel-based tools, that handle the internal calculations.