CS246: Mining Massive Data Sets, Winter 2020

CS246 is the first part of a two-part sequence, CS246–CS341. To support deeper exploration, most of the chapters are supplemented with further reading references. The importance of data to business decisions, strategy and behavior has proven unparalleled in recent years. The book is organised so that a student can learn the fundamental ideas of probability from the first three chapters without relying on calculus.

The setting:
- A set of k choices (arms)
- Each choice a is associated with an unknown probability distribution P_a supported on [0,1]
- We play the game for T rounds
- In each round t:
  (1) We pick some arm a
  (2) We obtain a random sample X_t from P_a
  Note: the reward is independent of previous draws
- Our goal is to maximize ∑_{t=1}^{T} X_t
- Problem: we don't know the means μ_a! But every time we …

Companies place a high value on individuals who can understand and manipulate large data sets to produce informative results. The book is based on the Stanford Computer Science course CS246: Mining Massive Datasets (and CS345A: Data Mining).
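
The k-arm, T-round setting described earlier (pick an arm, observe a sample from its unknown distribution, try to maximize the total reward) can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: each P_a is taken to be a Bernoulli distribution with mean μ_a, and the arm is chosen with an epsilon-greedy rule, a common heuristic that the source text does not specify. The function name `play_bandit` and its parameters `true_means`, `rounds`, `epsilon`, and `seed` are hypothetical.

```python
import random


def play_bandit(true_means, rounds, epsilon=0.1, seed=0):
    """Simulate `rounds` plays of a k-armed bandit with Bernoulli rewards.

    true_means: hypothetical list of mu_a values, hidden from the player.
    Returns (total_reward, pull_counts, estimated_means).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k        # number of times each arm has been pulled
    estimates = [0.0] * k   # running empirical mean reward of each arm
    total_reward = 0.0

    for _ in range(rounds):
        # (1) Pick an arm. Assumption: epsilon-greedy. Explore uniformly
        #     with probability epsilon, otherwise take the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda a: estimates[a])

        # (2) Draw X_t from P_a. Assumption: P_a is Bernoulli(mu_a),
        #     drawn independently of all previous rounds.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Each pull refines the estimate of that arm's mean.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return total_reward, counts, estimates


if __name__ == "__main__":
    total, counts, estimates = play_bandit([0.2, 0.5, 0.8], rounds=1000)
    print("total reward:", total)
    print("pulls per arm:", counts)
    print("estimated means:", [round(m, 3) for m in estimates])
```

With a larger `epsilon` the player spends more rounds exploring and estimates every arm more accurately, at the cost of pulling arms it already believes are worse; a smaller `epsilon` does the opposite.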