AI Glossary
The complete dictionary of Artificial Intelligence
Distributional Correction
Technique that corrects the mismatch between the distribution of state-action pairs visited in the offline data and the distribution induced by the learned policy once it is transferred online.
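One common way to express such a correction is a density-ratio weight w(s, a) = d_target(s, a) / d_offline(s, a) applied to each offline sample. The sketch below is a minimal tabular illustration, not a reference implementation: the function names, the count-based estimate of the offline distribution, and the assumption that the target distribution `target_probs` is already known are all hypothetical choices made for the example.

```python
from collections import Counter


def density_ratio_weights(offline_pairs, target_probs):
    """Return w(s, a) = d_target(s, a) / d_offline(s, a) for each pair seen offline.

    d_offline is estimated from empirical counts (an assumption of this sketch).
    """
    counts = Counter(offline_pairs)
    total = len(offline_pairs)
    weights = {}
    for pair, count in counts.items():
        d_offline = count / total
        weights[pair] = target_probs.get(pair, 0.0) / d_offline
    return weights


def corrected_mean_reward(offline_pairs, rewards, weights):
    """Average the offline rewards after reweighting each sample."""
    weighted = [weights[pair] * r for pair, r in zip(offline_pairs, rewards)]
    return sum(weighted) / len(weighted)


# Toy data: the offline data over-represents "left", the target policy is uniform.
offline_pairs = [("s0", "left"), ("s0", "left"), ("s0", "left"), ("s0", "right")]
rewards = [0.0, 0.0, 0.0, 1.0]
target_probs = {("s0", "left"): 0.5, ("s0", "right"): 0.5}

weights = density_ratio_weights(offline_pairs, target_probs)
estimate = corrected_mean_reward(offline_pairs, rewards, weights)
```

In this toy case the unweighted mean reward is 0.25, while the reweighted estimate matches the expected reward under the uniform target distribution.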
Fitted Q-Iteration
Iterative offline learning algorithm that approximates the optimal Q-function by repeatedly fitting a regressor to Bellman targets computed from a fixed batch of experience data.
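The loop can be sketched in a few lines. This is a minimal illustration under stated assumptions: the regressor is replaced by a tabular stand-in that simply averages targets per (state, action) pair, and the transition format, function names, and toy batch are hypothetical. In practice any supervised regressor (trees, neural networks) would take the place of `fit_average_regressor`.

```python
from collections import defaultdict


def fit_average_regressor(inputs, targets):
    """Stand-in regressor: predicts the mean target seen for each (state, action)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(inputs, targets):
        sums[x] += y
        counts[x] += 1
    return {x: sums[x] / counts[x] for x in sums}


def fitted_q_iteration(batch, actions, gamma=0.9, n_iterations=50):
    """batch: list of (state, action, reward, next_state, done) transitions."""
    q = {}  # Q_0 is zero everywhere (missing keys are read as 0.0)
    for _ in range(n_iterations):
        inputs, targets = [], []
        for state, action, reward, next_state, done in batch:
            # Bellman target: r + gamma * max_a' Q_k(s', a'), with no bootstrap at terminals
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in actions
            )
            inputs.append((state, action))
            targets.append(reward + gamma * best_next)
        q = fit_average_regressor(inputs, targets)  # Q_{k+1}
    return q


# Toy two-state chain: "go" moves forward and pays 1.0 on leaving s1.
batch = [
    ("s0", "go", 0.0, "s1", False),
    ("s0", "stay", 0.0, "s0", False),
    ("s1", "go", 1.0, "end", True),
    ("s1", "stay", 0.0, "s1", False),
]
q = fitted_q_iteration(batch, actions=["go", "stay"])
```

The batch is never extended during training, which is what makes the procedure offline: every iteration reuses the same transitions with updated targets.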
Safe Policy Transfer
Strategy ensuring that policies transferred from offline training to online use maintain at least a minimum acceptable level of performance during the initial adaptation phase.
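One simple way to enforce such a performance floor is a fallback guard: monitor the transferred policy's recent returns and revert to a known baseline policy when they drop too low. The sketch below is only an illustration of that idea; the function name, the rolling-window rule, and the labels `"transferred"` and `"baseline"` are assumptions, not an established method.

```python
def select_policy(transferred_returns, performance_floor, window=5):
    """Decide which policy should act in the next episode.

    Falls back to the baseline when the rolling average return of the
    transferred policy drops below the required floor.
    """
    recent = transferred_returns[-window:]
    if recent and sum(recent) / len(recent) < performance_floor:
        return "baseline"
    return "transferred"


# Returns degrade after deployment, so the guard triggers a fallback.
degrading = select_policy([1.0, 0.9, 0.2, 0.1, 0.1], performance_floor=0.5)
# Returns stay above the floor, so the transferred policy keeps running.
healthy = select_policy([1.0, 0.9, 0.8], performance_floor=0.5)
```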
Dataset Aggregation
Iterative method that collects successive batches of offline data and aggregates them to progressively improve policy performance before online deployment.
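The aggregate-then-refit loop can be sketched as follows. This is a toy illustration under assumptions: the data format `(state, action, reward)`, the helper names, and the greedy "highest mean reward" fitting rule are hypothetical stand-ins for whatever learner is actually used.

```python
from collections import defaultdict


def fit_greedy_policy(dataset):
    """Pick, for each state, the action with the highest mean observed reward."""
    totals = defaultdict(lambda: [0.0, 0])  # (state, action) -> [reward sum, count]
    for state, action, reward in dataset:
        totals[(state, action)][0] += reward
        totals[(state, action)][1] += 1
    best = {}
    for (state, action), (reward_sum, count) in totals.items():
        mean = reward_sum / count
        if state not in best or mean > best[state][1]:
            best[state] = (action, mean)
    return {state: action for state, (action, _) in best.items()}


def aggregate_and_refit(batches):
    """Refit the policy on the union of all batches collected so far."""
    dataset, policies = [], []
    for batch in batches:
        dataset.extend(batch)  # aggregate: earlier data is kept, not replaced
        policies.append(fit_greedy_policy(dataset))
    return dataset, policies


batches = [
    [("s0", "left", 0.2)],
    [("s0", "right", 0.8), ("s0", "left", 0.4)],
]
dataset, policies = aggregate_and_refit(batches)
```

After the first batch the policy can only choose `"left"`; once the second batch is aggregated, the refit policy switches to `"right"`.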
Offline Policy Evaluation
Evaluation of a policy's performance without direct interaction with the environment, using only previously collected data; crucial for selecting the best policies to transfer online.
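A classic estimator for this setting is ordinary (trajectory-wise) importance sampling, which reweights each logged return by the ratio of target-policy to behavior-policy action probabilities. The sketch below assumes both probabilities are available as functions; the names `target_prob` and `behavior_prob`, the trajectory format, and the toy data are hypothetical.

```python
def importance_sampling_estimate(trajectories, target_prob, behavior_prob, gamma=1.0):
    """Ordinary importance sampling estimate of the target policy's expected return.

    trajectories: list of trajectories, each a list of (state, action, reward).
    target_prob / behavior_prob: callables returning pi(action | state).
    """
    estimates = []
    for trajectory in trajectories:
        ratio, discounted_return = 1.0, 0.0
        for t, (state, action, reward) in enumerate(trajectory):
            ratio *= target_prob(state, action) / behavior_prob(state, action)
            discounted_return += (gamma ** t) * reward
        estimates.append(ratio * discounted_return)
    return sum(estimates) / len(estimates)


def behavior_prob(state, action):
    return 0.5  # logging policy: uniform over two actions


def target_prob(state, action):
    return 0.9 if action == "right" else 0.1  # policy being evaluated


# One-step logged episodes: "right" pays 1.0, "left" pays 0.0.
logged = [[("s0", "left", 0.0)], [("s0", "right", 1.0)]]
value = importance_sampling_estimate(logged, target_prob, behavior_prob)
```

The estimator is unbiased when the behavior policy gives nonzero probability to every action the target policy can take, but its variance grows quickly with trajectory length.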
Transfer Learning Gap
Quantitative measure of the difference between the performance estimated for an offline-trained policy and the performance it initially achieves in an online environment.
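As a worked illustration, the gap can be computed as the offline performance estimate minus the mean return over the first few online episodes. The exact definition varies; the subtraction order, the function name, and the `initial_episodes` window are assumptions made for this sketch.

```python
def transfer_gap(offline_estimate, online_returns, initial_episodes=10):
    """Offline-estimated return minus mean return over the first online episodes."""
    initial = online_returns[:initial_episodes]
    if not initial:
        raise ValueError("at least one online episode is required")
    return offline_estimate - sum(initial) / len(initial)


# Offline evaluation predicted 0.9; the first three online episodes average 0.6.
gap = transfer_gap(0.9, [0.5, 0.7, 0.6], initial_episodes=3)
```

Under this convention a positive gap means the policy performs worse online than the offline estimate suggested.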