Solution to Management Science Series #198: Using K-Means Clustering for Customer Segmentation

Even Excel, clumsy as it sometimes is, can serve as a data analytics tool.

Try to solve the following problem in Excel:

A wine retailer, Porto, ran 32 different promotion campaigns. For each campaign you know the promotion month, the minimum quantity that must be purchased to qualify, the discount percentage, the varietal, the wine’s origin, and whether the wine has passed its peak.

All of this information is stored in a single Excel spreadsheet.

You also have customer transaction data in a separate spreadsheet, which records 324 completed transactions.

Segment the customers using the k-means clustering method. Use Excel to do so!
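
The exercise asks for Excel, but the same idea can be sketched in a few lines of Python as a cross-check. The block below is only an illustration: it assumes each customer has been turned into a row of 0/1 flags, one per promotion (1 if the customer took that offer), and it runs plain Lloyd-style k-means with Euclidean distance. The customer rows, the number of offers, and k = 2 are made-up toy values, not the Porto data, and the function name `kmeans` is hypothetical.

```python
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style k-means with squared Euclidean distance (illustrative)."""
    rng = random.Random(seed)
    # Start from k randomly chosen customers as initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each customer goes to the nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids

# Toy customer-by-offer matrix (hypothetical): rows are customers,
# columns are four promotions, 1 means the customer bought on that promotion.
customers = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

labels, centroids = kmeans(customers, k=2)
print(labels)     # cluster index per customer
print(centroids)  # share of each cluster's customers that took each offer
```

Because the rows are 0/1 flags, each centroid coordinate reads as the fraction of that cluster's customers who took the corresponding promotion, which is what makes the segments easy to describe afterwards. In the real problem the matrix would have 32 columns, one per campaign, and the choice of k would need its own justification.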

On Machine Learning Techniques: Interpretability vs More Insight

⬇️

Many machine learning techniques aim to predict a categorical response more reliably.

A data scientist may rely on classification and regression trees when the focus is on prediction rather than on how each descriptor contributes to the response.

However, in some cases, you want to see how, and by how much, a change in a descriptor changes the response.

Classification and regression trees are easy to interpret, but that clarity often comes at the expense of insight into how much each descriptor contributes to the response you are trying to predict.

In this article, I analyze which method to use and the conditions under which each method outperforms the other.

Using hypothesis testing in important marketing and e-commerce decisions

This time, the focus is on e-commerce and digital marketing.

⬇️

Hypothesis testing is underutilized in many sectors, even though it is cheap, effective, and quick.

From manufacturing and medicine to marketing and e-commerce, its applications are abundant.

A quick experiment follows ⬇️

You are contemplating adding a new feature to your e-commerce website.

You are wondering whether this new feature will lift your sales.

You can randomly route each customer to one of two versions of the site: one without the feature and one with it.

You will launch the new feature provided that you are 95% confident that the website with the feature will yield more sales than the one without it.

How would you design this experiment so that you can make a decision based on it?

What do you need to measure? Which metrics?

In this article, I show how to design this experiment and solve this problem in a cost- and time-efficient way.
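
One common way to frame the decision rule above, not necessarily the approach taken in the article, is a one-sided two-proportion z-test on conversion rate. The sketch below rests on several assumptions: "more sales" is measured as the share of visitors who buy, "95% confident" is read as a one-sided significance level of 0.05, and the visitor and purchase counts are invented for illustration. The function name `one_sided_two_proportion_z_test` is hypothetical.

```python
from math import erf, sqrt

def one_sided_two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Test H0: p_b <= p_a against H1: p_b > p_a using the pooled z statistic."""
    p_a = conv_a / n_a  # conversion rate without the feature
    p_b = conv_b / n_b  # conversion rate with the feature
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))
    return z, p_value

# Invented counts: 5,000 visitors per version, 200 vs 250 purchases.
z, p_value = one_sided_two_proportion_z_test(conv_a=200, n_a=5000,
                                             conv_b=250, n_b=5000)

# Assumed decision rule: launch only if the one-sided p-value is below 0.05.
launch = p_value < 0.05
print(f"z = {z:.2f}, p-value = {p_value:.4f}, launch = {launch}")
```

If the metric of interest were revenue per visitor rather than conversion rate, a comparison of means (for example a t-test) would be the more natural choice. In either case the sample size should be fixed before the experiment starts.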