2 posts tagged with "Reinforcement Learning"

Reinforcement Learning - Basic Components

September 7, 2024 · 8 min read

AI, CVer, Pythoner, Half-stack Developer

The main characters of RL are the agent and the environment. The environment is the world that the agent lives in and interacts with. At every step of interaction, the agent sees a (possibly partial) observation of the state of the world, and then decides on an action to take. The environment changes when the agent acts on it, but may also change on its own.

The agent also perceives a reward signal from the environment, a number that tells it how good or bad the current world state is. The goal of the agent is to maximize its cumulative reward, called return. Reinforcement learning methods are ways that the agent can learn behaviors to achieve its goal.
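
The interaction loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CoinFlipEnv`, `run_episode`, and the `always_heads` policy are hypothetical names invented here, not part of any RL library, and the environment is a toy in which the agent is rewarded for guessing a biased coin.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a biased coin flip."""

    def __init__(self, p_heads=0.7, horizon=10, seed=0):
        self.p_heads = p_heads
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        # Start a new episode and return the first observation (the step index).
        self.t = 0
        return self.t

    def step(self, action):
        # The world changes on its own: the coin lands regardless of the action.
        outcome = 1 if self.rng.random() < self.p_heads else 0
        # Reward signal: 1.0 for a correct guess, 0.0 otherwise.
        reward = 1.0 if action == outcome else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return self.t, reward, done


def run_episode(env, policy):
    """Run one episode and return the cumulative reward (the return)."""
    obs = env.reset()
    total = 0.0
    done = False
    while not done:
        action = policy(obs)                  # agent decides from its observation
        obs, reward, done = env.step(action)  # environment responds
        total += reward
    return total


def always_heads(obs):
    # A fixed, non-learning policy used only to exercise the loop.
    return 1


episode_return = run_episode(CoinFlipEnv(), always_heads)
print(episode_return)
```

A learning method would replace `always_heads` with a policy that is updated from the observed rewards so that the return increases over episodes.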

Reparameterization Trick

November 5, 2023 · 5 min read

PuQing

AI, CVer, Pythoner, Half-stack Developer

Motivation

Suppose we have a normal distribution $q$ parameterized by $\theta$. We want to solve the following problem:

$$\min_{\theta} E_{q}[f(x)]$$

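Here $E_{q}[f(x)]$ denotes the expectation of the function $f(x)$ when the random variable $x$ is distributed according to $q$, and the outer $\min_{\theta}$ seeks the value of $\theta$ that minimizes this expectation.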
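
To make the objective concrete, here is a minimal Monte Carlo sketch of estimating $E_{q}[f(x)]$ for a fixed $\theta$. The choices below are assumptions made purely for illustration: $\theta$ is taken to be the pair `(mu, sigma)`, $f(x) = x^2$ is picked because its expectation has the closed form $\mu^2 + \sigma^2$, and `estimate_objective` is a hypothetical helper name.

```python
import random


def f(x):
    # Example integrand (an assumption): E_q[x^2] = mu^2 + sigma^2 for a normal q.
    return x * x


def estimate_objective(mu, sigma, n=100_000, seed=0):
    """Monte Carlo estimate of E_q[f(x)] with q = N(mu, sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, sigma)  # draw x ~ q
        total += f(x)
    return total / n


estimate = estimate_objective(mu=1.0, sigma=2.0)
closed_form = 1.0 ** 2 + 2.0 ** 2  # mu^2 + sigma^2
print(estimate, closed_form)
```

The estimate approaches the closed-form value as `n` grows. Minimizing over $\theta$ additionally requires knowing how this estimate changes with `mu` and `sigma`, which is harder because the samples themselves depend on $\theta$.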