Marketing modelling for large-scale assortments



Consumers who shop (online) are often overloaded with information, choice options, websites, links to other web pages, etc. Managers of (online) stores need to make many decisions: which products to sell, what prices to charge, which promotions to organize, which customers to target, etc. In principle, a lot of data is available to optimize such decisions. However, both the dimension and the level of detail of the data make it challenging to actually use the data. In this project you will work on the development and application of econometric marketing models that (i) can help guide practical decision making; (ii) are scalable (that is, can be estimated in a reasonable time frame); (iii) combine different data sources; and (iv) work at the individual product and/or customer level, such that customization of prices and promotions is feasible.

In this project we will seek active collaboration with Dutch or international (online) retailers. These contacts will help us target practically useful research questions and provide access to detailed data. In the ideal case we will be able to test the developed methodology in real life.


Marketing, econometrics, machine learning, prediction, recommendation, online retailing, dynamic pricing


Research questions
Various concrete research questions fall in this domain. We discuss some examples below. Note that this is merely a first inventory of the possibilities.

1. Dynamic pricing
The optimal pricing of products is a key challenge for all retailers. In an online setting, the pricing problem is especially complicated as customers can easily check the prices of competitors. To set prices optimally for many products in a category, one needs to know not only the own- and cross-price effects, but also the relevance of competitor prices. To complicate things even further, the store may take its own (or competitors') inventory into account and adapt prices accordingly. Finally, prices that are chosen today will have an impact on demand in the future. This means that price setting is not a static but a dynamic problem. Effective models that can capture the relevant dynamics for a large number of products simultaneously are not available yet.
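To make the notion of own- and cross-price effects concrete, the sketch below fits a static log-log demand system by least squares on simulated data. It is only an illustration under strong assumptions: the function names, the simulated prices and the "true" elasticities are hypothetical, and the sketch ignores competitor prices, inventory and the dynamics discussed above.

```python
import numpy as np


def simulate_log_demand(log_prices, elasticities, intercepts, noise_sd, rng):
    """Simulate log sales for J products over T periods (hypothetical data).

    elasticities[j, k] is the effect of the log price of product k on the
    log sales of product j: own-price effects sit on the diagonal,
    cross-price effects off the diagonal.
    """
    n_periods, n_products = log_prices.shape
    noise = rng.normal(0.0, noise_sd, size=(n_periods, n_products))
    return intercepts + log_prices @ elasticities.T + noise


def estimate_elasticities(log_prices, log_sales):
    """Estimate intercepts and the J x J elasticity matrix by OLS."""
    n_periods = log_prices.shape[0]
    design = np.column_stack([np.ones(n_periods), log_prices])
    coef, *_ = np.linalg.lstsq(design, log_sales, rcond=None)
    # coef[0] holds the intercepts; coef[1 + k, j] is the effect of the
    # price of product k on product j, so we transpose to get [j, k].
    return coef[0], coef[1:].T


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_elasticities = np.array(
        [[-2.0, 0.3, 0.1],
         [0.4, -1.5, 0.2],
         [0.1, 0.2, -2.5]]
    )
    log_prices = rng.normal(0.0, 0.2, size=(500, 3))
    log_sales = simulate_log_demand(
        log_prices, true_elasticities, np.array([3.0, 2.5, 4.0]), 0.1, rng
    )
    _, estimated = estimate_elasticities(log_prices, log_sales)
    print(np.round(estimated, 2))
```

With J products this system already has J × J price coefficients, which illustrates why estimation for a large assortment quickly becomes demanding.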

2. Keyword advertising
A major source of website traffic to stores is (sponsored) keyword advertising at search engines such as Google and Bing. Retailers can bid on keywords such that their site is ranked high among the sponsored search results. However, it is not directly clear what the added value of such a bid is to the retailer, nor is it clear what the optimal bid should be. An interesting challenge is to model the impact of keyword advertising as part of the consumers’ path to purchase (the conversion funnel).

3. Targeted promotions
Online retailers can promote specific products to a selected group of customers through, for example, email marketing. How can we figure out which products to promote to which customers? It does not make sense to promote a product in which the customer is not interested. On the other hand, it is not smart to promote products that are already well known to the customer (the customer would buy them without the promotion anyway). How can we strike an optimal balance here? How can we model this problem for many customers and many products simultaneously? And, possibly even more complicated, how can we account for situational influences in determining the optimal promotion strategy?
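One simple way to formalize this balance is to look at the incremental purchase probability of a promotion: the probability of buying with the promotion minus the probability of buying without it. The sketch below assumes these two probabilities are already available for every customer-product pair (in practice they would have to come from an estimated model); the function name and the numbers are hypothetical.

```python
import numpy as np


def best_promotion_per_customer(p_buy_with_promo, p_buy_without_promo):
    """Pick, per customer, the product with the largest incremental effect.

    Both inputs are (customers x products) arrays of purchase
    probabilities. These are assumed inputs, not estimates from real data.
    """
    uplift = p_buy_with_promo - p_buy_without_promo
    best_product = uplift.argmax(axis=1)
    best_uplift = uplift[np.arange(uplift.shape[0]), best_product]
    return best_product, best_uplift


if __name__ == "__main__":
    # Customer 0 buys product 1 almost surely anyway, and barely cares
    # about product 0, so product 2 gains most from a promotion.
    with_promo = np.array([[0.05, 0.90, 0.30],
                           [0.40, 0.10, 0.20]])
    without_promo = np.array([[0.04, 0.88, 0.10],
                              [0.10, 0.09, 0.15]])
    products, gains = best_promotion_per_customer(with_promo, without_promo)
    print(products, np.round(gains, 2))
```

In this toy example neither the product the customer ignores nor the product the customer buys anyway is selected, which mirrors the trade-off described above.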

Research field
The research falls in two of the main research domains of Erasmus School of Economics: Marketing and Econometrics. It corresponds to a key research area as defined by the marketing group (quantitative analysis of customer behaviour) and to a key area as defined by the Econometric Institute (discrete choice analysis).


Bayesian modelling is the ideal tool for dealing with complex, heterogeneous data. However, the typical estimation methodology for such models requires time-consuming simulations. In a large-scale setting (many products and/or many consumers) such simulation-based methodology is not feasible. Alternative methods are available that allow researchers to remain in the Bayesian paradigm, but remove the need for simulation. One such method is variational inference (VI). VI is an approach to obtain approximate inference in Bayesian models using optimization techniques.

A major component of this project will be the development of scalable estimation techniques. VI is one of the promising candidates.
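As a minimal illustration of how VI replaces simulation by optimization, the sketch below applies coordinate-ascent variational inference to a textbook model: normally distributed data with unknown mean and precision under a conjugate normal-gamma prior, approximated by a factorized posterior q(mu) q(tau). The model, the prior settings and the function name are assumptions chosen for illustration only; they are not the models to be developed in this project.

```python
import numpy as np


def cavi_normal(x, mu0=0.0, lambda0=1.0, a0=1.0, b0=1.0, n_iter=50):
    """Mean-field VI for x_i ~ N(mu, 1/tau).

    Assumed prior: mu | tau ~ N(mu0, 1/(lambda0 * tau)), tau ~ Gamma(a0, b0)
    with b0 a rate parameter. The approximation is
    q(mu) = N(mu_n, 1/lambda_n) and q(tau) = Gamma(a_n, b_n).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # These two quantities do not change across iterations.
    mu_n = (lambda0 * mu0 + n * x.mean()) / (lambda0 + n)
    a_n = a0 + (n + 1) / 2.0
    e_tau = a0 / b0  # initial guess for E_q[tau]
    for _ in range(n_iter):
        # Update q(mu) given the current expectation of tau.
        lambda_n = (lambda0 + n) * e_tau
        var_mu = 1.0 / lambda_n
        # Update q(tau) given the current mean and variance of mu.
        expected_ss = np.sum((x - mu_n) ** 2) + n * var_mu
        expected_prior = lambda0 * ((mu_n - mu0) ** 2 + var_mu)
        b_n = b0 + 0.5 * (expected_ss + expected_prior)
        e_tau = a_n / b_n
    return {
        "mu_mean": mu_n,
        "mu_precision": (lambda0 + n) * e_tau,
        "tau_shape": a_n,
        "tau_rate": b_n,
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(2.0, 0.5, size=2000)
    q = cavi_normal(data)
    print(q["mu_mean"], q["tau_shape"] / q["tau_rate"])
```

Each iteration is a closed-form update rather than a draw from a sampler, so the procedure is deterministic and converges in a handful of passes for this toy model. Whether similar gains carry over to large-scale marketing models is exactly the kind of question this project addresses.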

PhD candidate profile

This project requires a candidate with a strong background in econometrics and/or statistics. Furthermore, some affinity with large-scale data is useful. Ideally, the candidate already has experience with Bayesian statistics. Next to these technical skills, a keen interest in marketing is needed.

Expected output

Within this project, at least three high-quality papers are expected. Each of the papers will be targeted at one of the top journals in (quantitative) marketing, that is, Journal of Marketing Research or Marketing Science, or at a top journal in statistics or econometrics, for example, Journal of Business and Economic Statistics or Journal of Econometrics.


No specific research groups are identified for collaboration. However, it is the aim to have the PhD candidate visit a top US (or European) university for a couple of months during the second or third year of the project.

Societal relevance

The societal relevance of this project is mainly reflected in the collaboration with industry. The supervisory team already has various contacts with (online) retailers in the Netherlands and has worked with them in the past on academic research projects. These (and other) companies will clearly benefit from this research. For them, the benefit lies in being able to optimize their everyday decision making using model-based tools.

Scientific relevance

In the academic marketing literature there is a strong interest in the development and application of quantitative methods. This is especially true if the methods help solve actual decision problems that marketing managers confront.

The added value of this project will be in the following aspects:

  • Development of new econometric methodology to deal with large-scale data
  • Providing guidelines for various product-level decisions faced by (online) retailers: pricing, promotion, assortment decisions, etc.
  • Actual implementation and testing of methodology in real life

Contact information

For academic questions only. For procedural questions, contact the Doctoral Office.


Sunday, 1 April 2018

Literature references

A good example of the type of research in this project is:

This paper is one of the results of the PhD thesis of Bruno Jacobs (to be defended in December 2017).

Data will be obtained through cooperation with companies. Some data is also publicly available, for example the Instacart data, or can be obtained through the Wharton Customer Analytics Initiative.

Supervisory Team

Dennis Fok
Professor of Applied Econometrics
  • Promotor
Bas Donkers
Professor of Marketing Research
  • Copromotor