Problem 1
Experiment objective
Generate MNIST handwritten digits with a variational autoencoder, meeting the following requirements:
It is recommended to randomly initialize the model parameters from a Gaussian distribution, which can mitigate part of the mode-collapse problem.
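The Gaussian initialization recommended above can be sketched as follows in PyTorch. The layer sizes (784 → 400 → 20-dimensional latent space) and the standard deviation 0.02 are illustrative assumptions, not values specified in the notes:

```python
import torch
import torch.nn as nn

# Minimal VAE for 28x28 MNIST images (hypothetical sizes:
# 784 inputs -> 400 hidden units -> 20-dimensional latent space).
class VAE(nn.Module):
    def __init__(self, latent_dim=20):
        super().__init__()
        self.fc1 = nn.Linear(784, 400)
        self.fc_mu = nn.Linear(400, latent_dim)
        self.fc_logvar = nn.Linear(400, latent_dim)
        self.fc2 = nn.Linear(latent_dim, 400)
        self.fc3 = nn.Linear(400, 784)

    def encode(self, x):
        h = torch.relu(self.fc1(x))
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps with eps ~ N(0, I).
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        return torch.sigmoid(self.fc3(torch.relu(self.fc2(z))))

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def gaussian_init(module):
    # Draw every weight from N(0, 0.02^2) and zero the biases;
    # the std value is an assumption for illustration.
    if isinstance(module, nn.Linear):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)
        nn.init.zeros_(module.bias)

model = VAE()
model.apply(gaussian_init)  # recursively initialize all Linear submodules
```

`nn.Module.apply` walks the module tree, so a single call covers every `nn.Linear` layer without listing them by hand.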
"With frameworks this complete, using a model is the common move; designing a model is the solid move; but constructing the training dataset, my friend, that is the brilliant move."
The end goal is a knowledge-graph-style RAG model; the data for training the knowledge base will consist mainly of papers.
If the actor being trained and the actor interacting with the environment are the same, we call it on-policy. In other words, if the actor learns from its own experience, it is on-policy; if the actor learns by watching other actors' experience, it is off-policy.
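The on-policy/off-policy distinction above can be sketched as two data flows. All function bodies here (`collect_episode`, `policy_update`) are hypothetical placeholders standing in for a real RL algorithm:

```python
import random
from collections import deque

def collect_episode(policy):
    # Placeholder: a real implementation would roll out `policy`
    # in an environment and record (state, action, reward) tuples.
    return [("state", "action", 1.0)]

def policy_update(policy, transitions):
    # Placeholder: a real implementation would apply a gradient step.
    return policy

# On-policy: train only on data the current policy just generated,
# then discard that data.
def on_policy_step(policy):
    transitions = collect_episode(policy)
    return policy_update(policy, transitions)

# Off-policy: train on transitions sampled from a replay buffer that
# may have been filled by older or different (behavior) policies.
replay_buffer = deque(maxlen=10_000)

def off_policy_step(policy, behavior_policy):
    replay_buffer.extend(collect_episode(behavior_policy))
    batch = random.sample(replay_buffer, min(len(replay_buffer), 32))
    return policy_update(policy, batch)
```

The key difference is visible in the arguments: the on-policy step takes a single policy, while the off-policy step separates the policy being trained from the policy that generated the experience.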
Meta-learning means "learning to learn": while ordinary deep learning learns a function f, meta-learning learns how to obtain the function f.
Reinforcement learning does not rely on labeled answers; it can tackle problems that have no standard answer (or where even humans don't know the best solution).
The whole purpose of a machine learning problem is to find a function, and reinforcement learning is no exception.
This is a SOTA model for binary classification and regression on AMPs (antimicrobial peptides), built on PyTorch.