Alexey Gorbatovski

Myashka

AI & ML interests

NLP Alignment

Recent Activity

commented on a paper 9 days ago

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

new activity 21 days ago

agentica-org/DeepScaleR-Preview-Dataset:There are no answers for 6 samples

updated a model 2 months ago

Myashka/Qwen2.5-7B-UltraChat200K_EMA_SFT-Lr_3e_6-Alpha_0.01

Organizations

None yet

Myashka's models (37)

Myashka/125M_GPTneo-ppo_tuned-max_reward

Text Generation • Updated Apr 30, 2023

Myashka/125M_GPTneo-ppo_tuned-last_epoch

Text Generation • Updated Apr 30, 2023 • 1

Myashka/125M_GPTneo_reward_gen

Text Classification • Updated Apr 24, 2023 • 1

Myashka/125M_GPTneo_reward_base

Text Classification • Updated Apr 22, 2023 • 1

Myashka/125M_GPTneo_sft_tuned

Text Generation • Updated Apr 22, 2023 • 2

Myashka/GPT_neo_python_QA

Updated Sep 30, 2022

Myashka/emotion-hi

Updated May 22, 2022