Dongwei
/

Rationalyst_reasoning_datasets

Text Generation

feature-extraction

text-generation-inference


Dongwei committed on Jul 11, 2024

Commit 4d16990 · verified · 1 Parent(s): 74d3f8c

Update README.md

Files changed (1)

README.md +23 -1

@@ -6,6 +6,28 @@ license: apache-2.0
 # Rationalyst (with rationales extracted from reasoning datasets)
-This model is a distilled version of the [LLaMa-3-Instruct-8B](https://huggingface.co/bert-base-uncased). It was
+This model is a fine-tuned version of the [LLaMa-3-Instruct-8B](https://huggingface.co/bert-base-uncased). It was
  introduced in RATIONALYST: Supervising Reasoning via Self-Supervised Rationale Extraction. The code for the rationale extraction, model training and
  inference can be found [here](https://github.com/JHU-CLSP/reasoning_world_model).
+## Model description
+Implicit rationales are often embedded in the unlabelled text, reflecting the natural thought processes behind speech and writing.
+RATIONALYST is a self-supervised approach to extract and filter these implicit rationales from unlabelled text and apply
+them to supervise reasoning.
+## How to use
+To use it, input a question and a partial reasoning trajectory; the model outputs a rationale to supervise the next reasoning step.
+## Training data
+This Rationalyst model was trained on 17,566 rationales from GSM8K and 19,669 rationales from ECQA. The data can be found
+[here](https://huggingface.co/datasets/Dongwei/reasoning_world_model).
+## Evaluation results
+When evaluated on downstream tasks, this model achieves the following results:
+| Task | GSM8K | MATH  | ECQA | HellaSwag | ProofWriter  | ARC | MMLU-Pro |
+|:----:|:----:|:----:|:----:|:-----:|:----:|:-----:|:----:|
+|      | 76.2 | 32.5   | 76.2 | 59.4     | 90.1 | 79.3        | 32.1     |
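
The "How to use" section added in this commit describes the input (a question plus a partial reasoning trajectory) but gives no code. A minimal sketch of what that could look like with the `transformers` library follows. The repository ID is taken from this page's header; the prompt layout in `build_prompt`, the helper names, and the choice to load the checkpoint as a causal language model are all assumptions, not something the README documents.

```python
from typing import List

# Repository ID as shown in this page's header.
MODEL_ID = "Dongwei/Rationalyst_reasoning_datasets"


def build_prompt(question: str, steps: List[str]) -> str:
    """Join a question and the reasoning steps so far into one prompt.

    The layout is a hypothetical template; the README does not specify one.
    """
    lines = [f"Question: {question}"]
    lines += [f"Step {i}: {s}" for i, s in enumerate(steps, start=1)]
    lines.append("Rationale for the next step:")
    return "\n".join(lines)


def generate_rationale(question: str, steps: List[str], max_new_tokens: int = 64) -> str:
    """Ask the model for a rationale to supervise the next reasoning step."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the checkpoint loads as a standard causal language model.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question, steps), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_rationale("What is 2 + 3?", ["Start with 2."])` downloads the 8B checkpoint on first use, so it needs substantial memory; the prompt template should be replaced with whatever format the training code in the linked repository actually uses.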