Machine Learning Optimization based on Genetic Algorithm for Market Share Maximization
Every business is driven by a set a factors, which directly or indirectly contributes to the sale of the product/service and hence the market share. This article talks about how we can use the optimum values for those factors so that sales/market share can be maximized.
Steps involved in this process are the following
A) Identify the features impacting market share: Some level of business knowledge or domain based input would be needed to do the analysis. Understand the features which contribute to sales and market share of the product. We can use regression analysis, correlation analysis etc to find the impact of these factors on the market share. If there are too many features which impact the sales, it will be a good idea to reduce the features based on feature importance. Some of the features which can impact market share are the following
1. Pricing and Promotion parameters
2. Ratings and Reviews
3. Advertising information
4. Sales channel specific data
5. Different domain specific metrics which impacts market share / sales
6. Availability of the product/service
7. Competitor pricing
8. Location and Demographic Information
The above list is not complete. There could be many other factors specific to the nature of the business.
There are different feature selection technique which can be used to select the most important features to reduce the complexity of the model.
B) Prepare the dataframe: Once the feature identification is done, prepare a dataframe, df, with all the features and the achieved market share at a suitable granularity.
C) Define the constraints: Carefully choose the lower and upper bound of the features based on the domain / feasible range.
D) Scale the feature values using Min-Max Scaler: Scale the feature values in the 0 to 1 scale with 0 for the lower bound and the 1 for the upper bound and all other values in between
E) Define the objective function: The quantity we need to maximize. The sum of individual market share/sales could be a fitness function.
F) Relate the factors with market share: Fit a machine learning model, which learns the historical pattern and then predict the market share for a given combination of these factors. There is an array of regression models from we choose an appropriate machine learning model for predicting the sales/market share.
G) Using the Genetic Algorithm(GA) Optimizer: GA is an evolutionary based optimization technique inspired by the theory of natural selection. It simulates a set of values with in the constraints specified and evolve based on the ‘fitness’ function defined. The fitness function could be the sum of market share( for the dataframe,df, which is the population in GA terminology) or any other appropriate evaluation function which we need to maximize/minimize. There are operators like selection, which gives higher probability to ‘fitter’ individuals to be selected for mating and then operators like cross over, which swaps a set of values between ‘parents’ and mutation which introduces random changes to feature values. More about GA can be read in the url:
https://en.wikipedia.org/wiki/Genetic_algorithm
Over a period of ‘generations’ different feature values would be tried out and fitness be evaluated. Over successive generations, we reach a stage in which improvement in market share(fitness) will become negligible or nil. That time we can stop the iterations(or until we reach the desired improvement in fitness). At this stage the population will comprises of the fittest individuals, meaning the best set of feature values, which maximized the objective function.
From the population fitness(Sum of individual fitness) of the original and optimal populations, we can see the percentage improvement in the market share we can expect. Also, by comparing the mean of the feature values of the original and optimal dataframes, we can find the mean percentage change needed for the feature values.
Python has DEAP library which automates a lot of these operations making it easier to implement GA.