Online Optimization
Learning with Partial Information
A learning paradigm in which the algorithm observes feedback, such as a loss or reward, only for the action it chose (bandit feedback) rather than for every possible action (full information).
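The difference between the two feedback models can be illustrated with a small simulation. The sketch below is not taken from this entry: it assumes exponential-weights learners (a Hedge-style update for full information and an EXP3-style importance-weighted update for bandit feedback), Bernoulli losses, and hypothetical names such as `run`, `loss_means`, and `eta`.

```python
import math
import random


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def full_information_update(weights, losses, eta):
    """Full information: the loss of every action is observed and used."""
    return [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]


def bandit_update(weights, probs, chosen, observed_loss, eta):
    """Bandit feedback: only the chosen action's loss is observed.

    Dividing by the probability of choosing the action gives an
    importance-weighted estimate that is unbiased for that action's loss.
    """
    updated = list(weights)
    estimate = observed_loss / probs[chosen]
    updated[chosen] *= math.exp(-eta * estimate)
    return updated


def run(feedback, loss_means, rounds=2000, eta=0.05, seed=0):
    """Simulate a learner and return its final distribution over actions.

    feedback: "full" or "bandit" (illustrative labels, an assumption).
    loss_means: per-action probability of incurring loss 1 in a round.
    """
    rng = random.Random(seed)
    k = len(loss_means)
    weights = [1.0] * k
    for _ in range(rounds):
        probs = normalize(weights)
        chosen = rng.choices(range(k), weights=probs)[0]
        # The environment draws a loss for every action each round ...
        losses = [1.0 if rng.random() < m else 0.0 for m in loss_means]
        if feedback == "full":
            # ... and the learner sees all of them.
            weights = full_information_update(weights, losses, eta)
        else:
            # ... but the learner sees only losses[chosen].
            weights = bandit_update(weights, probs, chosen, losses[chosen], eta)
        weights = normalize(weights)  # renormalize to avoid drift toward zero
    return weights


if __name__ == "__main__":
    means = [0.2, 0.5, 0.8]  # hypothetical loss rates; action 0 is best
    print("full  :", run("full", means))
    print("bandit:", run("bandit", means))
```

Both learners draw their action from the same kind of distribution; the only change is how much of the loss vector reaches the update step. The bandit learner has to explore to learn about actions it has not tried, which is why its estimates are noisier under these assumptions.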