Strategic Experimentation with Erlang Bandits
Risks associated with randomly arriving events play an important role in many real-life decisions. Models of learning and experimentation based on two-armed Poisson bandits have addressed several important aspects of strategic and motivational learning when events arrive at the jump times of a standard Poisson process. At the same time, these models remain mostly abstract theoretical constructs with few direct economic applications. We suggest a new class of models of strategic experimentation that are almost as tractable as exponential models but incorporate realistic features: the expected rate of news arrival depends on the time elapsed since the start of an experiment, and the quality of a "risky" arm is judged on the evidence of a series of trials rather than on a single success or failure, as in exponential models with conclusive experiments. We demonstrate that, unlike in the exponential models, players may stop experimentation before the first breakdown happens. Moreover, ceteris paribus, experimentation in a model with breakthroughs may last longer than experimentation in the corresponding model with breakdowns.