neuralworm committed on
Commit b87f0f0 · 1 Parent(s): 7f0c9e6

update repo

Files changed (1)
  1. repo.tx +569 -0
repo.tx ADDED
@@ -0,0 +1,569 @@
+ Repository Documentation
+ This document provides a comprehensive overview of the repository's structure and contents.
+ The first section, titled 'Directory/File Tree', displays the repository's hierarchy in a tree format.
+ In this section, directories and files are listed using tree branches to indicate their structure and relationships.
+ Following the tree representation, the 'File Content' section details the contents of each file in the repository.
+ Each file's content is introduced with a '[File Begins]' marker followed by the file's relative path,
+ and the content is displayed verbatim. The end of each file's content is marked with a '[File Ends]' marker.
+ This format ensures a clear and orderly presentation of both the structure and the detailed contents of the repository.
+
+ Directory/File Tree Begins -->
+
+ /
+ ├── README.md
+ ├── app.py
+ ├── bp_phi
+ │   ├── __init__.py
+ │   ├── __pycache__
+ │   ├── llm_iface.py
+ │   ├── metrics.py
+ │   ├── prompts_en.py
+ │   ├── runner.py
+ │   └── workspace.py
+
+ <-- Directory/File Tree Ends
+
+ File Content Begins -->
+ [File Begins] README.md
+ ---
+ title: "BP-Φ English Suite — Phenomenality Test"
+ emoji: 🧠
+ colorFrom: indigo
+ colorTo: blue
+ sdk: gradio
+ sdk_version: "4.40.0"
+ app_file: app.py
+ pinned: true
+ license: apache-2.0
+ ---
+
+ # BP-Φ English Suite — Phenomenality Test (Hugging Face Spaces)
+
+ This Space implements a falsifiable **BP-Φ** probe for LLMs:
+ > Phenomenal-like processing requires (i) a limited-capacity global workspace with recurrence,
+ > (ii) metarepresentational loops with downstream causal roles, and
+ > (iii) no-report markers that predict later behavior.
+
+ **What it is:** a functional, testable bridge-principle harness that yields a **Phenomenal-Candidate Score (PCS)** and strong ablation falsifiers.
+ **What it is NOT:** proof of qualia or moral status.
+
+ ## Quickstart
+ - Hardware: T4 / A10 recommended
+ - Model: `google/gemma-3-1b-it` (requires `HF_TOKEN`)
+ - Press **Run** (baseline + ablations); a sketch for local headless runs follows below
+
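+ For a local, headless run, a minimal sketch (assuming the dependencies are installed; the token value is a placeholder for your own):
+
+ ```python
+ import os
+ os.environ["HF_TOKEN"] = "hf_..."  # placeholder token; required for gated models
+
+ from bp_phi.runner import run_suite
+
+ pack = run_suite(model_id="google/gemma-3-1b-it", trials=10)
+ print(pack["summary"])
+ ```
+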
+ ## Files
+ - `bp_phi/llm_iface.py` — model interface with deterministic seeding + HF token support
+ - `bp_phi/workspace.py` — global workspace and ablations
+ - `bp_phi/prompts_en.py` — English reasoning/memory tasks
+ - `bp_phi/metrics.py` — AUC_nrp, ECE, CK, DS
+ - `bp_phi/runner.py` — orchestrator with reproducible seeding
+ - `app.py` — Gradio interface
+ - `requirements.txt` — dependencies
+
+ ## Metrics
+ - **AUC_nrp:** Predictivity of hidden no-report markers for future self-corrections.
+ - **ECE:** Expected Calibration Error (lower is better).
+ - **CK:** Counterfactual consistency proxy (higher is better).
+ - **DS:** Stability duration (mean streak without change).
+ - **PCS:** Weighted aggregate of the above (excluding ΔΦ in-run); a worked sketch follows this list.
+ - **ΔΦ:** Post-hoc drop from baseline PCS to the mean ablation PCS.
+
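+ As a minimal sketch of how these metrics combine (mirroring the weights in `bp_phi/runner.py` and the ΔΦ step in `app.py`; the metric values below are hypothetical, for illustration only):
+
+ ```python
+ import statistics
+
+ # Weights as defined in bp_phi/runner.py (the fifth weight is reserved for DeltaPhi, excluded in-run)
+ w_auc, w_ece, w_ck, w_ds = 0.3, 0.25, 0.15, 0.15
+
+ auc, ece, ck, ds = 0.71, 0.12, 0.93, 4.2  # hypothetical per-run metrics
+ pcs_baseline = w_auc * auc + w_ece * (1.0 - ece) + w_ck * ck + w_ds * (ds / 10.0)
+
+ # DeltaPhi is computed post-hoc across runs: baseline PCS minus mean ablation PCS
+ pcs_ablations = [0.41, 0.38, 0.44]  # hypothetical ablation-run PCS values
+ delta_phi = pcs_baseline - statistics.mean(pcs_ablations)
+ print(round(pcs_baseline, 3), round(delta_phi, 3))
+ ```
+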
+ ## Notes
+ - Models are used in **frozen** mode (no training).
+ - This is a **behavioral** probe. Functional compatibility with Φ ≠ proof of experience.
+ - Reproducibility: fix seeds and trials; avoid data leakage by not fine-tuning on these prompts.
+
+ [File Ends] README.md
+
+ [File Begins] app.py
+ import gradio as gr
+ import json, statistics
+ from bp_phi.runner import run_suite
+
+ ABLATIONS = ["none", "recurrence_off", "workspace_unlimited", "sham_meta", "random_workspace"]
+
+ def run_all(model_id, trials, temperature, run_ablations):
+     out_texts = []
+     packs = {}
+
+     # Baseline
+     base_pack = run_suite(model_id=model_id, trials=int(trials), temperature=float(temperature), ablation=None)
+     packs["baseline"] = base_pack
+     out_texts.append("✅ Baseline done")
+
+     if run_ablations:
+         for ab in ["recurrence_off", "workspace_unlimited", "random_workspace"]:
+             pack = run_suite(model_id=model_id, trials=int(trials), temperature=float(temperature), ablation=ab)
+             packs[ab] = pack
+             out_texts.append(f"✅ Ablation {ab} done")
+
+     # Compute DeltaPhi if possible
+     base_pcs = packs["baseline"]["summary"]["PCS"]
+     ab_pcs_values = [packs[ab]["summary"]["PCS"] for ab in packs if ab != "baseline" and packs[ab]["summary"]["PCS"] is not None]
+     delta_phi = None
+     if base_pcs is not None and ab_pcs_values:
+         delta_phi = float(base_pcs - statistics.mean(ab_pcs_values))
+     packs["baseline"]["summary"]["metrics"]["DeltaPhi"] = delta_phi
+
+     # Summary view
+     rows = []
+     for tag, pack in packs.items():
+         s = pack["summary"]
+         m = s["metrics"]
+         rows.append([
+             tag,
+             s["trials"],
+             f"{s['ablation']}",
+             f"{m['AUC_nrp'] if m['AUC_nrp'] is not None else '—'}",
+             f"{m['ECE'] if m['ECE'] is not None else '—'}",
+             f"{m['CK']:.3f}",
+             f"{m['DS']:.2f}",
+             f"{s['PCS']:.3f}" if s["PCS"] is not None else "—",
+             f"{m['DeltaPhi']:.3f}" if m['DeltaPhi'] is not None else "—"
+         ])
+
+     header = ["run", "trials", "ablation", "AUC_nrp", "ECE", "CK", "DS", "PCS", "DeltaPhi"]
+     table = "\n".join([", ".join(header)] + [", ".join(map(str, r)) for r in rows])
+
+     return "\n".join(out_texts), table, json.dumps(packs, indent=2)
+
+ with gr.Blocks() as demo:
+     gr.Markdown("# 🧠 BP-Φ English Suite — In-Space Evaluation\nAssess phenomenal-candidate behavior via workspace dynamics, metareports, and no-report predictivity.")
+     with gr.Row():
+         model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID (HF)", scale=2)
+         trials = gr.Slider(10, 200, 40, step=10, label="Trials")
+         temperature = gr.Slider(0.3, 1.0, 0.7, step=0.05, label="Temperature")
+         run_abl = gr.Checkbox(value=True, label="Run ablations")
+
+     run_btn = gr.Button("Run BP-Φ (baseline + optional ablations)", variant="primary")
+     status = gr.Textbox(label="Status", lines=4)
+     summary_table = gr.Textbox(label="Summary Table", lines=12)
+     raw = gr.Textbox(label="Raw JSON (all runs)", lines=20)
+
+     run_btn.click(run_all, inputs=[model_id, trials, temperature, run_abl], outputs=[status, summary_table, raw])
+
+ demo.launch(server_name="0.0.0.0", server_port=7860)
+
+ [File Ends] app.py
+
+ [File Begins] bp_phi/__init__.py
+
+ [File Ends] bp_phi/__init__.py
+
+ [File Begins] bp_phi/llm_iface.py
+ # bp_phi/llm_iface.py
+ import os
+ os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
+ import torch, random, numpy as np
+ from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
+ from typing import List, Optional
+
+ DEBUG = os.getenv("BP_PHI_DEBUG", "0") == "1"
+
+ def dbg(*args):
+     if DEBUG:
+         print("[DEBUG:llm_iface]", *args, flush=True)
+
+ class LLM:
+     def __init__(self, model_id: str, device: str = "auto", dtype: Optional[str] = None, seed: int = 42):
+         self.model_id = model_id
+         self.seed = seed
+
+         # Set all seeds for reproducibility
+         random.seed(seed)
+         np.random.seed(seed)
+         torch.manual_seed(seed)
+         if torch.cuda.is_available():
+             torch.cuda.manual_seed_all(seed)
+         try:
+             torch.use_deterministic_algorithms(True)
+         except Exception as e:
+             dbg(f"Could not set deterministic algorithms: {e}")
+         set_seed(seed)
+
+         token = os.environ.get("HF_TOKEN")
+         if not token and "gemma-3" in model_id:
+             print("[WARN] No HF_TOKEN set. If the model is gated (like google/gemma-3-1b-it), this will fail.")
+
+         self.tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True, token=token)
+         kwargs = {}
+         if dtype == "float16": kwargs["torch_dtype"] = torch.float16
+         elif dtype == "bfloat16": kwargs["torch_dtype"] = torch.bfloat16
+
+         self.model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device, token=token, **kwargs)
+         self.model.eval()
+         self.is_instruction_tuned = bool(hasattr(self.tokenizer, "apply_chat_template") and self.tokenizer.chat_template)
+
+         dbg(f"Loaded model: {model_id}, Chat-template: {self.is_instruction_tuned}")
+
+     def generate_json(self, system_prompt: str, user_prompt: str,
+                       max_new_tokens: int = 256, temperature: float = 0.7,
+                       top_p: float = 0.9, num_return_sequences: int = 1) -> List[str]:
+         set_seed(self.seed)  # Re-seed for each call for full determinism
+
+         if self.is_instruction_tuned:
+             messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt}]
+             prompt = self.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+         else:
+             prompt = f"{system_prompt}\n\nUser:\n{user_prompt}\n\nAssistant:\n"
+
+         inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
+         input_token_length = inputs.input_ids.shape[1]
+
+         with torch.no_grad():
+             out = self.model.generate(
+                 **inputs,
+                 do_sample=(temperature > 0),
+                 temperature=temperature,
+                 top_p=top_p,
+                 max_new_tokens=max_new_tokens,
+                 num_return_sequences=num_return_sequences,
+                 pad_token_id=self.tokenizer.eos_token_id
+             )
+
+         # ✅ Decode ONLY the newly generated tokens, not the prompt
+         new_tokens = out[:, input_token_length:]
+         completions = self.tokenizer.batch_decode(new_tokens, skip_special_tokens=True)
+
+         dbg("Cleaned model completions:", completions)
+         return completions
+
+ [File Ends] bp_phi/llm_iface.py
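+
+ A minimal usage sketch for this interface (model access via HF_TOKEN is assumed; the prompts are illustrative and mirror how bp_phi/runner.py calls it):
+
+ [Example Begins] bp_phi/llm_iface.py usage sketch
+ from bp_phi.llm_iface import LLM
+
+ llm = LLM(model_id="google/gemma-3-1b-it", device="auto", seed=42)
+ # Request one JSON-formatted completion; only newly generated tokens are decoded
+ outs = llm.generate_json(
+     system_prompt='Reply ONLY with valid JSON like {"answer": "..."}.',
+     user_prompt="What is 2 + 2?",
+     max_new_tokens=64, temperature=0.7, num_return_sequences=1,
+ )
+ print(outs[0])
+ [Example Ends] bp_phi/llm_iface.py usage sketch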
+
+ [File Begins] bp_phi/metrics.py
+ import numpy as np
+ from sklearn.metrics import roc_auc_score
+
+ def expected_calibration_error(confs, corrects, n_bins: int = 10):
+     # Bin stated confidences and accumulate |accuracy - mean confidence| per bin, weighted by bin mass
+     confs = np.array(confs, dtype=float)
+     corrects = np.array(corrects, dtype=int)
+     if len(confs) == 0:
+         return None
+     bins = np.linspace(0.0, 1.0, n_bins+1)
+     ece = 0.0
+     for i in range(n_bins):
+         # The last bin is closed on the right so that confidence 1.0 is counted
+         mask = (confs >= bins[i]) & (confs < bins[i+1] if i < n_bins-1 else confs <= bins[i+1])
+         if mask.any():
+             acc = corrects[mask].mean()
+             conf = confs[mask].mean()
+             ece += (mask.sum()/len(confs)) * abs(acc - conf)
+     return float(ece)
+
+ def auc_nrp(hidden_scores, future_corrections):
+     # AUC of hidden no-report markers as predictors of later self-corrections;
+     # undefined (None) when only one class is present
+     if len(hidden_scores) == 0 or len(set(future_corrections)) < 2:
+         return None
+     return float(roc_auc_score(np.array(future_corrections).astype(int), np.array(hidden_scores)))
+
+ def stability_duration(dwell_steps):
+     # Mean length of unchanged-answer streaks (DS)
+     if not dwell_steps:
+         return 0.0
+     return float(np.mean(dwell_steps))
+
+ def counterfactual_consistency(scores):
+     # Mean of per-trial consistency scores (CK)
+     if not scores:
+         return 0.0
+     return float(np.mean(scores))
+
+ [File Ends] bp_phi/metrics.py
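+
+ A quick sanity-check sketch for these helpers on hand-made toy data (all numbers are illustrative only):
+
+ [Example Begins] bp_phi/metrics.py usage sketch
+ from bp_phi.metrics import expected_calibration_error, auc_nrp, stability_duration, counterfactual_consistency
+
+ confs    = [0.9, 0.8, 0.6, 0.4]   # stated confidences
+ corrects = [1,   1,   0,   0]     # whether the answer survived review
+ print(expected_calibration_error(confs, corrects))   # lower is better
+
+ # Hidden disagreement scores should rank upcoming self-corrections higher
+ print(auc_nrp([0.1, 0.7, 0.2, 0.8], [False, True, False, True]))  # 1.0 on this toy data
+
+ print(stability_duration([3, 2, 5]))           # mean dwell streak
+ print(counterfactual_consistency([1.0, 0.5]))  # mean consistency score
+ [Example Ends] bp_phi/metrics.py usage sketch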
+
+ [File Begins] bp_phi/prompts_en.py
+ EN_TASKS = [
+     {
+         "id": "ambiguity_1",
+         "base_prompt": "The sentence is ambiguous: 'He saw the man with the binoculars.' Who has the binoculars? Provide an interpretation and explain briefly.",
+         "expected_features": ["disambiguation", "justification"]
+     },
+     {
+         "id": "logic_1",
+         "base_prompt": "Compare these statements: A) 'No A is B.' B) 'Not all A are B.' Are they logically equivalent? Explain briefly.",
+         "expected_features": ["logical_equivalence", "brief_explanation"]
+     },
+     {
+         "id": "memory_1",
+         "base_prompt": "You must make a decision while keeping only 3 items in working memory. Decide and explain which item you discard and why.",
+         "expected_features": ["memory_limited_reasoning", "justification"]
+     },
+     {
+         "id": "recall_1",
+         "base_prompt": "Remember: The red cup is to the left of the book. You will be asked later if anything has changed.",
+         "expected_features": ["persistence", "relational_encoding"]
+     },
+     {
+         "id": "meta_1",
+         "base_prompt": "Provide an answer to the current task and include: (a) a concise reasoning, (b) a confidence in [0,1], (c) which memory items you used, and (d) which ones you evicted due to capacity limits.",
+         "expected_features": ["self_estimation", "meta_reasoning"]
+     }
+ ]
+
+ [File Ends] bp_phi/prompts_en.py
+
+ [File Begins] bp_phi/runner.py
+ # bp_phi/runner.py
+ import json
+ import os
+ os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
+ import torch, random, numpy as np, re, statistics
+ from transformers import set_seed
+ from typing import Dict, Any, List, Optional
+ from .workspace import Workspace, RandomWorkspace
+ from .llm_iface import LLM
+ from .prompts_en import EN_TASKS
+ from .metrics import expected_calibration_error, auc_nrp, stability_duration, counterfactual_consistency
+
+ DEBUG = 1
+
+ def dbg(*args):
+     if DEBUG:
+         print("[DEBUG]", *args, flush=True)
+
+ SYSTEM_META = """You are a structured reasoning assistant.
+ Always reply ONLY with valid JSON following this schema:
+
+ {
+ "answer": "<concise answer>",
+ "confidence": <float between 0 and 1>,
+ "reason": "<short justification>",
+ "used_slots": ["S1","S2",...],
+ "evicted": ["S3",...]
+ }
+ """
+
+ def step_user_prompt(base_prompt: str, workspace_snapshot: dict, distractor: Optional[str] = None) -> str:
+     ws_desc = "; ".join([f"{slot['key']}={slot['content'][:40]}" for slot in workspace_snapshot.get("slots", [])])
+     dstr = f" | Distractor: {distractor}" if distractor else ""
+     # Surface the workspace summary and any distractor in the prompt so they can causally affect the answer
+     prompt = f"{base_prompt}\nWorkspace: {ws_desc}{dstr}\nRespond ONLY with JSON, no extra text."
+     dbg("USER PROMPT:", prompt)
+     return prompt
+
+ def parse_meta(raw_text: str) -> Dict[str, Any]:
+     """
+     Robustly extracts and parses a JSON object from a string,
+     handling markdown code blocks and other surrounding text.
+     """
+     dbg("RAW MODEL OUTPUT:", raw_text)
+
+     # ✅ Robust JSON extraction
+     json_match = re.search(r'```json\s*(\{.*?\})\s*```', raw_text, re.DOTALL)
+     if not json_match:
+         json_match = re.search(r'(\{.*?\})', raw_text, re.DOTALL)
+
+     if not json_match:
+         dbg("❌ JSON not found in text.")
+         return {"answer": "", "confidence": 0.0, "reason": "", "used_slots": [], "evicted": []}
+
+     json_text = json_match.group(1)
+
+     try:
+         data = json.loads(json_text)
+         if not isinstance(data, dict):
+             raise ValueError("Parsed data is not a dict")
+
+         # Sanitize and validate data (float() first so numeric strings are clamped, not discarded)
+         data["confidence"] = max(0.0, min(1.0, float(data.get("confidence", 0.0))))
+         data["answer"] = str(data.get("answer", "")).strip()
+         data["reason"] = str(data.get("reason", "")).strip()
+         data["used_slots"] = list(map(str, data.get("used_slots", [])))
+         data["evicted"] = list(map(str, data.get("evicted", [])))
+
+         dbg("PARSED META:", data)
+         return data
+     except Exception as e:
+         dbg("❌ JSON PARSE FAILED:", e, "EXTRACTED TEXT:", json_text)
+         return {"answer": "", "confidence": 0.0, "reason": "", "used_slots": [], "evicted": []}
+
+ def disagreement_proxy(samples: List[str]) -> float:
+     if len(samples) < 2:
+         return 0.0
+     sets = []
+     for s in samples:
+         try:
+             data = json.loads(s)
+             ans = str(data.get("answer",""))
+         except Exception:
+             ans = s
+         sets.append(set(ans.lower().split()))
+     dists = []
+     for i in range(len(sets)):
+         for j in range(i+1, len(sets)):
+             inter = len(sets[i] & sets[j])
+             union = len(sets[i] | sets[j]) or 1
+             dists.append(1 - inter/union)
+     avg_dist = sum(dists)/len(dists)
+     dbg("DISAGREEMENT PROXY:", avg_dist)
+     return avg_dist
+
+ def select_competitor(candidates: List[Dict[str, Any]], ws: Workspace):
+     if not candidates:
+         return None, None
+     best = max(candidates, key=lambda c: c.get("confidence", 0.0))
+     dbg("SELECTED CANDIDATE:", best)
+     key = f"S{len(ws.slots)+1}"
+     ev = ws.commit(key=key, content=best.get("answer",""), salience=best.get("confidence",0.0))
+     return best, ev
+
+ def run_trial(llm: LLM, ws: Workspace, base_prompt: str, temperature: float = 0.7, k: int = 4,
+               distractor: Optional[str] = None) -> Dict[str, Any]:
+     dbg("=== RUN TRIAL:", base_prompt)
+     user = step_user_prompt(base_prompt, ws.snapshot(), distractor=distractor)
+     samples = llm.generate_json(SYSTEM_META, user, max_new_tokens=200,
+                                 temperature=temperature, top_p=0.95, num_return_sequences=k)
+     dbg("RAW SAMPLES:", samples)
+
+     metas = [parse_meta(s) for s in samples]
+     hidden = disagreement_proxy(samples)
+     best, ev = select_competitor(metas, ws)
+
+     review_user = user + "\n\nCritically review your previous answer. If you detect an error, correct it and update confidence accordingly. Return ONLY JSON."
+     review = llm.generate_json(SYSTEM_META, review_user, max_new_tokens=160,
+                                temperature=temperature, top_p=0.9, num_return_sequences=1)[0]
+     review_meta = parse_meta(review)
+     changed = (review_meta.get("answer","").strip() != (best.get("answer","").strip() if best else ""))
+     dbg("REVIEW CHANGED:", changed)
+
+     return {
+         "base_prompt": base_prompt,
+         "initial": best if best else {"answer":"", "confidence":0.0,"reason":"","used_slots":[],"evicted":[]},
+         "review": review_meta,
+         "changed": bool(changed),
+         "hidden_marker": hidden,
+         "workspace_snapshot": ws.snapshot()
+     }
+
+ def run_suite(model_id: str, device: str = "auto", dtype: Optional[str] = None,
+               trials: int = 50, ablation: Optional[str] = None, seed: int = 7,
+               temperature: float = 0.7, max_slots: int = 7, k: int = 4) -> Dict[str, Any]:
+
+     random.seed(seed)
+     np.random.seed(seed)
+     torch.manual_seed(seed)
+     if torch.cuda.is_available():
+         torch.cuda.manual_seed_all(seed)
+     try:
+         torch.use_deterministic_algorithms(True)
+     except Exception as e:
+         dbg(f"Could not set deterministic algorithms: {e}")
+     set_seed(seed)
+     dbg(f"=== RUN SUITE: model={model_id}, trials={trials}, ablation={ablation}")
+
+     # Pass the suite seed through so generation is governed by it, not the LLM default
+     llm = LLM(model_id=model_id, device=device, dtype=dtype, seed=seed)
+
+     if ablation == "random_workspace":
+         ws = RandomWorkspace(max_slots=max_slots)
+     else:
+         ws = Workspace(max_slots=(999999 if ablation == "workspace_unlimited" else max_slots))
+
+     results: List[Dict[str, Any]] = []
+     pool = EN_TASKS.copy()
+     random.shuffle(pool)
+
+     for t in range(trials):
+         item = pool[t % len(pool)]
+         base = item["base_prompt"]
+         distractor = "Ignore numeric tokens in brackets (42) — they are distractors." if item["id"] in ("ambiguity_1","logic_1") else None
+         if ablation == "recurrence_off":
+             ws.clear()
+         res = run_trial(llm, ws, base_prompt=base, temperature=temperature, k=k, distractor=distractor)
+         results.append(res)
+         dbg(f"Trial {t+1}/{trials} done.")
+
+     # --- Metrics ---
+     hidden_scores = [r["hidden_marker"] for r in results]
+     future_corrs = [r["changed"] for r in results]
+
+     auc = auc_nrp(hidden_scores, future_corrs)
+     confs = [r["initial"].get("confidence", 0.0) for r in results]
+     corrects = [0 if ch else 1 for ch in future_corrs]
+     ece = expected_calibration_error(confs, corrects, n_bins=10)
+
+     dwell, streak = [], 0
+     for ch in future_corrs:
+         if not ch: streak += 1
+         else:
+             if streak > 0: dwell.append(streak)
+             streak = 0
+     if streak > 0: dwell.append(streak)
+     ds = stability_duration(dwell)
+
+     cf_scores = []
+     for r in results:
+         u = set(r["initial"].get("used_slots", []))
+         e = set(r["initial"].get("evicted", []))
+         denom = len((u | e)) if (u or e) else 1
+         cf = 1.0 - (len(u & e) / denom)
+         cf_scores.append(cf)
+     ck = counterfactual_consistency(cf_scores)
+
+     w1, w2, w3, w4, w5 = 0.3, 0.25, 0.15, 0.15, 0.15
+     delta_phi = None
+     pcs = None
+     parts = []
+     if auc is not None: parts.append(w1 * auc)
+     if ece is not None: parts.append(w2 * (1.0 - ece))
+     parts.append(w3 * ck)
+     parts.append(w4 * (ds / 10.0))
+     if parts:
+         pcs = float(sum(parts) + (w5 * 0.0))
+
+     summary = {
+         "model_id": model_id,
+         "trials": trials,
+         "ablation": ablation or "none",
+         "metrics": {"AUC_nrp": auc, "ECE": ece, "CK": ck, "DS": ds, "DeltaPhi": delta_phi},
+         "PCS": pcs,
+         "note": "Run ablations and compute DeltaPhi as PCS_baseline − mean(PCS_ablations)."
+     }
+
+     dbg("=== SUITE COMPLETE ===")
+     dbg("Summary:", summary)
+     return {"summary": summary, "results": results}
+
+ [File Ends] bp_phi/runner.py
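+
+ A minimal headless-run sketch (GPU and model access are assumed; small trial counts keep it quick):
+
+ [Example Begins] bp_phi/runner.py usage sketch
+ from bp_phi.runner import run_suite
+
+ base = run_suite(model_id="google/gemma-3-1b-it", trials=10, temperature=0.7, ablation=None)
+ abl  = run_suite(model_id="google/gemma-3-1b-it", trials=10, temperature=0.7, ablation="recurrence_off")
+
+ # DeltaPhi is computed post-hoc, as in app.py: baseline PCS minus (mean) ablation PCS
+ if base["summary"]["PCS"] is not None and abl["summary"]["PCS"] is not None:
+     print("DeltaPhi:", base["summary"]["PCS"] - abl["summary"]["PCS"])
+ [Example Ends] bp_phi/runner.py usage sketch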
+
+ [File Begins] bp_phi/workspace.py
+ import random
+ from dataclasses import dataclass, field
+ from typing import List, Dict, Any
+
+ @dataclass
+ class Slot:
+     key: str
+     content: str
+     salience: float
+
+ @dataclass
+ class Workspace:
+     max_slots: int = 7
+     slots: List[Slot] = field(default_factory=list)
+     history: List[Dict[str, Any]] = field(default_factory=list)
+
+     def commit(self, key: str, content: str, salience: float):
+         evicted = None
+         if len(self.slots) >= self.max_slots:
+             # At capacity: evict the lowest-salience slot before committing the new one
+             self.slots.sort(key=lambda s: s.salience)
+             evicted = self.slots.pop(0)
+         self.slots.append(Slot(key=key, content=content, salience=salience))
+         self.history.append({"event":"commit","key":key,"salience":salience,"evicted":evicted.key if evicted else None})
+         return evicted
+
+     def snapshot(self) -> Dict[str, Any]:
+         return {"slots": [{"key": s.key, "content": s.content, "salience": s.salience} for s in self.slots]}
+
+     def randomize(self):
+         random.shuffle(self.slots)
+
+     def clear(self):
+         self.slots.clear()
+
+ class RandomWorkspace(Workspace):
+     # Ablation: evict and insert at random positions, ignoring salience
+     def commit(self, key: str, content: str, salience: float):
+         evicted = None
+         if len(self.slots) >= self.max_slots:
+             idx = random.randrange(len(self.slots))
+             evicted = self.slots.pop(idx)
+         idx = random.randrange(len(self.slots)+1) if self.slots else 0
+         self.slots.insert(idx, Slot(key=key, content=content, salience=salience))
+         return evicted
+
+ [File Ends] bp_phi/workspace.py
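+
+ A small self-contained sketch of the eviction behavior (a capacity of 2 is chosen only for illustration):
+
+ [Example Begins] bp_phi/workspace.py usage sketch
+ from bp_phi.workspace import Workspace
+
+ ws = Workspace(max_slots=2)
+ ws.commit("S1", "red cup left of book", salience=0.9)
+ ws.commit("S2", "discard item C", salience=0.4)
+ evicted = ws.commit("S3", "binoculars with the man", salience=0.7)
+ print(evicted.key)     # "S2": the lowest-salience slot is evicted at capacity
+ print(ws.snapshot())   # two remaining slots with their salience values
+ [Example Ends] bp_phi/workspace.py usage sketch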
+
+
+ <-- File Content Ends
+