policy-gradient-methods
Master REINFORCE, PPO, TRPO - direct policy optimization with trust regions
$ Install
git clone https://github.com/tachyon-beep/hamlet /tmp/hamlet && cp -r /tmp/hamlet/.claude/skills/yzmir-deep-rl/skills/policy-gradient-methods ~/.claude/skills/hamlet
// tip: Run this command in your terminal to install the skill
Repository: tachyon-beep/hamlet/.claude/skills/yzmir-deep-rl/skills/policy-gradient-methods
Author: tachyon-beep
Stars: 0
Forks: 0
Updated 1w ago
Added 1w ago