

Correcting Sampling Bias for Recommender Systems

What sampling bias is in recommendation, and how to correct it

Photo by NordWood Themes on Unsplash

Introduction

Recommendations are ubiquitous in our digital lives, from e-commerce giants to streaming services. However, hidden beneath every large recommender system lies a challenge that can significantly impact its effectiveness: sampling bias.

In this article, I’ll explain how sampling bias arises when training recommendation models and how we can address it in practice.

Let’s dive in!

In general, we can formulate the recommendation problem as follows: given a query x (which might contain user information, context, previously clicked items, etc.), find the set of items {y1, ..., yk} that the user will most likely be interested in.
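Conceptually, this means scoring every candidate item against the query and keeping the top k. Below is a minimal sketch of that brute-force formulation, assuming a hypothetical learned relevance function score(query, item) that is not part of the original article; the next paragraph explains why this naive approach does not scale.

```python
import heapq

def recommend_naive(query, catalog, score, k=10):
    """Brute-force recommendation: score every item and keep the top k.

    query   -- user/context features for the current request
    catalog -- iterable of all candidate items (millions in practice)
    score   -- a learned relevance function score(query, item) -> float
    """
    # heapq.nlargest scans the whole catalog but only keeps the k best-scoring items in memory
    return heapq.nlargest(k, catalog, key=lambda item: score(query, item))
```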

One of the main challenges for large-scale recommender systems is the low-latency requirement. The user and item pools are vast and dynamic, so scoring every candidate and greedily picking the best ones is infeasible. Therefore, to meet the latency requirement, recommender systems are generally broken down into two main stages: retrieval and ranking.

Multi-stage recommender systems (Image by the author)

Retrieval is a cheap and efficient way to quickly capture the top item candidates (a few hundred) from the vast candidate pool (millions or billions). Retrieval optimization is mainly about two objectives:

  • During the training phase, we want to encode users and items into embeddings that capture the user’s behaviour and preferences.
  • During inference, we want to quickly retrieve relevant items through Approximate Nearest Neighbors (ANN), as sketched right after this list.
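For the second objective, here is a minimal sketch of ANN retrieval, assuming FAISS with an HNSW index; the embedding dimension, index parameters, and random embeddings are purely illustrative and not from the original article.

```python
import numpy as np
import faiss  # Facebook AI Similarity Search library

d = 64  # embedding dimension (illustrative)
# Stand-in for the trained item-tower embeddings (100k items here, millions in practice)
item_embeddings = np.random.rand(100_000, d).astype("float32")

# Build an HNSW graph index that approximates maximum inner-product search
index = faiss.IndexHNSWFlat(d, 32, faiss.METRIC_INNER_PRODUCT)  # 32 = graph neighbours per node
index.add(item_embeddings)

# At serving time: embed the query with the query tower, then search the index
query_embedding = np.random.rand(1, d).astype("float32")
scores, item_ids = index.search(query_embedding, 100)  # top-100 candidate item ids
```

In production, the item embeddings come from the trained item tower and the index is rebuilt or updated as the catalogue changes; the specific ANN library (FAISS, ScaNN, HNSWlib, etc.) is an implementation choice.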

Two-tower Recommendation systems

For the first objective, one of the common approaches is the two-tower neural network. The model gained popularity for tackling cold-start problems by incorporating item content features.

In detail, queries and items are encoded by corresponding DNN towers so that the embeddings of relevant (query, item) pairs stay close to each other in the embedding space.
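To make that concrete, here is a minimal two-tower sketch in PyTorch; the feature dimensions, layer sizes, and normalization choice are my own illustrative assumptions rather than the article’s exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tower(nn.Module):
    """A simple MLP encoder that maps raw features to a normalized embedding."""
    def __init__(self, input_dim: int, embedding_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Linear(256, embedding_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalize so the dot product between towers equals cosine similarity
        return F.normalize(self.net(x), dim=-1)

class TwoTowerModel(nn.Module):
    """Encodes queries and items with separate towers; relevance = dot product."""
    def __init__(self, query_dim: int, item_dim: int, embedding_dim: int = 64):
        super().__init__()
        self.query_tower = Tower(query_dim, embedding_dim)
        self.item_tower = Tower(item_dim, embedding_dim)

    def forward(self, query_features: torch.Tensor, item_features: torch.Tensor) -> torch.Tensor:
        q = self.query_tower(query_features)   # (batch, embedding_dim)
        v = self.item_tower(item_features)     # (batch, embedding_dim)
        return (q * v).sum(dim=-1)             # similarity score per (query, item) pair
```

At serving time, the item tower’s embeddings are precomputed and stored in the ANN index, while the query tower runs online for each incoming request.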
