Contextual Bandit

Mar 01, 2012

recommenders

Notable References

Reinforcement learning with immediate rewards and linear hypotheses
Explore/exploit schemes for web content optimization
Online models for content optimization
Sample mean based index policies with O(log n) regret for the multi-armed bandit problem
Just-in-time contextual advertising
Using confidence bounds for exploitation-exploration trade-offs
Finite-time analysis of the multi-armed bandit problem
The non-stochastic multi-armed bandit problem
Bandit Problems: Sequential Allocation of Experiments
The Adaptive Web — Methods and Strategies of Web Personalization
Hybrid systems for personalized recommendations
Personalized recommendation on dynamic content using predictive bilinear models
A case study of behavior-driven conjoint analysis on Yahoo!: Front Page Today Module
Google news personalization: scalable online collaborative filtering
Bandit processes and dynamic allocation indices
Efficient bandit algorithms for online multi-class prediction
Asymptotically efficient adaptive allocation rules
The epoch-greedy algorithm for contextual multi-armed bandits
Information Theory, Inference, and Learning Algorithms
Text-learning and related intelligent agents: A survey
Naïve filterbots for robust cold-start recommendations
Simulation studies of multi-armed bandits with covariates
Eligibility traces for off-policy policy evaluation
Some aspects of the sequential design of experiments
Recommender systems in e-commerce
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
Exploring compact reinforcement-learning representations with linear regression
