policy-gradient-methods

Master REINFORCE, PPO, TRPO - direct policy optimization with trust regions

$ Install

git clone https://github.com/tachyon-beep/hamlet /tmp/hamlet && cp -r /tmp/hamlet/.claude/skills/yzmir-deep-rl/skills/policy-gradient-methods ~/.claude/skills/hamlet

// tip: Run this command in your terminal to install the skill

Repository

tachyon-beep
Author
tachyon-beep/hamlet/.claude/skills/yzmir-deep-rl/skills/policy-gradient-methods
0 Stars
0 Forks
Updated 1w ago
Added 1w ago