Section 3.4: Proximal Policy Optimization (PPO)