crossentropy-ai/rlcube

imwithye committed on Sep 21

Commit 67ab4fa (1 parent: 1aa9298)

format

Files changed (2)

README.md CHANGED

@@ -14,21 +14,21 @@ Solve the Rubik's Cube using Reinforcement Learning! 🚀
 ## 🏋️‍♂️ Train the Model
 1. Navigate to the `rlcube` directory:
-    ```
-    cd rlcube
-    ```
 2. Install dependencies:
-    ```
-    uv sync
-    ```
 3. Activate the virtual environment:
-    ```
-    source .venv/bin/activate
-    ```
 4. Start training:
-    ```
-    python -m rlcube.train.train
-    ```
 After training, your model will be saved in the `models` folder.
 Please rename the trained file to `model_final.pth` so it can be used by the API. 🎯

 ## 🏋️‍♂️ Train the Model
 1. Navigate to the `rlcube` directory:
+   ```
+   cd rlcube
+   ```
 2. Install dependencies:
+   ```
+   uv sync
+   ```
 3. Activate the virtual environment:
+   ```
+   source .venv/bin/activate
+   ```
 4. Start training:
+   ```
+   python -m rlcube.train.train
+   ```
 After training, your model will be saved in the `models` folder.
 Please rename the trained file to `model_final.pth` so it can be used by the API. 🎯

rlcube/rlcube/train/train.py CHANGED

@@ -59,6 +59,7 @@ def train(epochs: int = 100):
                 target_values = target_values.detach()
                 indices = indices.reshape(-1)
                 weights = 1 / D.reshape(-1).detach()
             loss_v = value_loss_fn(values, target_values).reshape(-1) * weights

                 target_values = target_values.detach()
                 indices = indices.reshape(-1)
+                indices = indices * masks.reshape(-1)
                 weights = 1 / D.reshape(-1).detach()
             loss_v = value_loss_fn(values, target_values).reshape(-1) * weights
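
The added line multiplies the flattened `indices` by a flattened `masks` tensor. A minimal sketch of what that does, assuming `masks` is an integer 0/1 tensor with the same number of elements as `indices` (the values below are made up for illustration and are not taken from the repository):

```python
import torch

# Hypothetical batch of 4 target indices and a 0/1 mask (assumed shapes and values).
indices = torch.tensor([[3], [7], [11], [5]])
masks = torch.tensor([[1], [0], [1], [0]])

# Same two steps as in the diff: flatten, then zero out the masked entries.
flat = indices.reshape(-1)
masked = flat * masks.reshape(-1)

print(masked.tolist())  # [3, 0, 11, 0]
print(masked.dtype)     # torch.int64 (stays integer because the mask is integer)
```

Masked entries become index `0` rather than being dropped, so if `0` is also a valid class index, the masked rows will still contribute to any loss that consumes `indices` unless they are weighted out separately. If `masks` were a float tensor, the product would be promoted to float and would likely need a cast back to an integer dtype before being used as an index.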