Authors:    Sihem Amer-Yahia, Anh Tho Le and Eric Simon

Published in:   BIAS 2020

Abstract:    Rated datasets combine user demographics, such as age and occupation, with user actions, such as rating a movie or reviewing a book. Exploring them can greatly benefit end-users in their daily lives. As data consumers become more empowered, they need a tool for expressing end-to-end data pipelines for the personalized exploration of rated datasets. Such a tool must be easy to use, because end-users need to test several strategies to find relevant information. In this work, we develop a framework based on mining labeled segments of interest to the data consumer. The difficulty lies in finding segments whose demographics and rating behaviour are both relevant to the data consumer. The variety of ways to express that task justifies the need for a productive and effective programming environment in which various data pipelines can be expressed at a logical level. We examine how to provide such an environment and validate our findings with experiments on real rated datasets.