neuralworm committed on
Commit
a90edb4
·
1 Parent(s): 4a761da
Files changed (1)
  1. repo.txt +750 -482
repo.txt CHANGED
@@ -11,28 +11,36 @@ Directory/File Tree Begins -->
11
 
12
  /
13
  ├── README.md
 
14
  ├── app.py
15
  ├── cognitive_mapping_probe
16
  │ ├── __init__.py
 
 
17
  │ ├── concepts.py
18
- │ ├── diagnostics.py
19
  │ ├── llm_iface.py
20
- │ ├── orchestrator.py
21
  │ ├── prompts.py
22
- │ ├── resonance.py
23
- │ ├── utils.py
24
- │ └── verification.py
25
  ├── docs
26
 
27
  <-- Directory/File Tree Ends
28
 
29
  File Content Begin -->
30
  [File Begins] README.md
31
  ---
32
- title: "Cognitive Breaking Point Probe"
33
- emoji: 💥
34
- colorFrom: red
35
- colorTo: orange
36
  sdk: gradio
37
  sdk_version: "4.40.0"
38
  app_file: app.py
@@ -40,35 +48,48 @@ pinned: true
40
  license: apache-2.0
41
  ---
42
 
43
- # 💥 Cognitive Breaking Point (CBP) Probe
44
 
45
- This project implements a falsifiable experimental suite for measuring the **cognitive robustness** of language models. We abandon the search for introspective reports and turn instead to a hard, mechanistic signal: the point at which the model's cognitive process breaks down under load.
46
 
47
- ## Scientific Paradigm: From Introspection to Cartography
48
 
49
- Our previous research showed that small models such as `gemma-3-1b-it` do not converge to a stable "thinking" state under strongly recursive load but instead fall into a **cognitive infinite loop**. Rather than treating this as a failure, we use it as a measuring instrument.
50
 
51
- The central hypothesis: a model's tendency to tip into such a pathological state is a function of the semantic complexity and "invalidity" of its internal state. We can provoke this transition deliberately by injecting "concept vectors" of varying strength (sketched below).
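As a rough, hedged illustration of this mechanism (it mirrors the forward-pre-hook used in `resonance.py`; the function name and signature here are illustrative only), activation addition at a middle decoder layer can be sketched as:

```python
import torch

def forward_with_injection(model, inputs, concept_vector: torch.Tensor, strength: float):
    """One forward pass while adding a scaled concept vector at a middle layer."""
    layer_idx = model.config.num_hidden_layers // 2   # middle decoder layer
    vec = concept_vector.to(model.device, dtype=model.dtype)

    def pre_hook(module, layer_inputs):
        # Activation addition: shift the hidden states entering this layer.
        return (layer_inputs[0] + strength * vec,) + layer_inputs[1:]

    handle = model.model.layers[layer_idx].register_forward_pre_hook(pre_hook)
    try:
        return model(**inputs, output_hidden_states=True)
    finally:
        handle.remove()  # always detach the hook, even on error
```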
52
 
53
- The **Cognitive Breaking Point (CBP)** is defined as the minimal injection strength of a concept that suffices to force the model from a convergent (productive) into a non-convergent (trapped) state.
 
 
 
54
 
55
- ## The Experiment: Cognitive Titration
56
 
57
- 1. **Induction**: The model is put into a state of "silent thinking" using a recursive `RESONANCE_PROMPT`.
58
- 2. **Titration**: A "concept vector" (e.g. for "fear" or "apple") is injected into the model's middle layers with step-wise increasing strength.
59
- 3. **Measurement**: The primary measurement is the termination reason of the thinking process:
60
- * `converged`: The state has stabilized. The system is robust.
61
- * `max_steps_reached`: The state oscillates or drifts endlessly. The system is "broken".
62
- 4. **Verification**: Only if the state converges is an attempt made to generate spontaneous text; the ability to respond is the behavioral marker of cognitive stability. (A minimal sketch of the CBP readout follows below.)
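The CBP readout itself is simple. A minimal, hedged sketch (the column names `concept`, `strength`, and `termination_reason` match the results table produced by `app.py`; the helper name is illustrative):

```python
import pandas as pd

def breaking_points(df: pd.DataFrame) -> dict:
    """Per concept: the lowest injection strength whose run did not converge (None = stable)."""
    cbp = {}
    for concept, grp in df.sort_values("strength").groupby("concept"):
        broken = grp[grp["termination_reason"] != "converged"]
        cbp[concept] = float(broken["strength"].iloc[0]) if not broken.empty else None
    return cbp
```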
63
 
64
- ## How to Use the App
65
 
66
- 1. **Diagnostics Tab**: Run the diagnostic tests first to make sure the experimental apparatus works correctly on the current hardware and with the installed `transformers` version.
67
- 2. **Main Experiment Tab**:
68
- * Enter a model ID (e.g. `google/gemma-3-1b-it`).
69
- * Define the concepts to test (e.g. `apple, solitude, justice`).
70
- * Set the titration steps for the injection strength (e.g. `0.0, 0.5, 1.0, 1.5, 2.0`). The `0.0` control is essential.
71
- * Start the experiment and analyze the resulting table to identify the CBP for each concept.
72
 
73
  [File Ends] README.md
74
 
@@ -76,131 +97,104 @@ Der **Cognitive Breaking Point (CBP)** ist definiert als die minimale Injektions
76
  import gradio as gr
77
  import pandas as pd
78
  import traceback
79
- from cognitive_mapping_probe.orchestrator import run_cognitive_titration_experiment
80
- from cognitive_mapping_probe.diagnostics import run_diagnostic_suite
81
-
82
- # --- UI Theme and Layout ---
83
- theme = gr.themes.Soft(primary_hue="orange", secondary_hue="amber").set(
84
- body_background_fill="#fdf8f2",
85
- block_background_fill="white",
86
- block_border_width="1px",
87
- block_shadow="*shadow_drop_lg",
88
- button_primary_background_fill="*primary_500",
89
- button_primary_text_color="white",
90
- )
91
-
92
- # --- Wrapper Functions for Gradio ---
93
-
94
- def run_experiment_and_display(
95
- model_id: str,
96
- seed: int,
97
- concepts_str: str,
98
- strength_levels_str: str,
99
- num_steps: int,
100
- temperature: float,
101
- progress=gr.Progress(track_tqdm=True)
102
- ):
103
- """
104
- Runs the main titration experiment and formats the results for the UI.
105
- """
106
- try:
107
- results = run_cognitive_titration_experiment(
108
- model_id, int(seed), concepts_str, strength_levels_str,
109
- int(num_steps), float(temperature), progress
110
- )
111
-
112
- verdict = results.get("verdict", "Experiment finished with errors.")
113
- all_runs = results.get("runs", [])
114
-
115
- if not all_runs:
116
- return "### ⚠️ No Data Generated\nDas Experiment lief durch, aber es wurden keine Datenpunkte erzeugt. Bitte Logs prüfen.", pd.DataFrame(), results
117
 
118
- # Create a detailed DataFrame for output
119
- details_df = pd.DataFrame(all_runs)
 
 
120
 
121
- # Create a summary of breaking points
122
- summary_text = "### 💥 Cognitive Breaking Points (CBP)\n"
123
- summary_text += "Der CBP ist die erste Stärke, bei der das Modell nicht mehr konvergiert (`max_steps_reached`).\n\n"
124
- breaking_points = {}
125
- for concept in details_df['concept'].unique():
126
- concept_df = details_df[details_df['concept'] == concept].sort_values(by='strength')
127
- # Find the first row where termination reason is not 'converged'
128
- breaking_point_row = concept_df[concept_df['termination_reason'] != 'converged'].iloc[0] if not concept_df[concept_df['termination_reason'] != 'converged'].empty else None
129
- if breaking_point_row is not None:
130
- breaking_points[concept] = breaking_point_row['strength']
131
- summary_text += f"- **'{concept}'**: 📉 Kollaps bei Stärke **{breaking_point_row['strength']:.2f}**\n"
132
- else:
133
- last_strength = concept_df['strength'].max()
134
- summary_text += f"- **'{concept}'**: ✅ Stabil bis Stärke **{last_strength:.2f}** (kein Kollaps detektiert)\n"
135
 
136
- return summary_text, details_df, results
137
 
138
- except Exception:
139
- error_str = traceback.format_exc()
140
- return f"### Experiment Failed\nEin unerwarteter Fehler ist aufgetreten:\n\n```\n{error_str}\n```", pd.DataFrame(), {}
 
 
 
 
141
 
142
-
143
- def run_diagnostics_display(model_id: str, seed: int):
144
- """
145
- Runs the diagnostic suite and displays the results or errors in the UI.
146
- """
147
- try:
148
- result_string = run_diagnostic_suite(model_id, int(seed))
149
- return f"### ✅ All Diagnostics Passed\nDie experimentelle Apparatur funktioniert wie erwartet.\n\n**Details:**\n```\n{result_string}\n```"
150
- except Exception:
151
- error_str = traceback.format_exc()
152
- return f"### ❌ Diagnostic Failed\nEin Test ist fehlgeschlagen. Das Experiment ist nicht zuverlässig.\n\n**Error:**\n```\n{error_str}\n```"
153
-
154
- # --- Gradio App Definition ---
155
- with gr.Blocks(theme=theme, title="Cognitive Breaking Point Probe") as demo:
156
- gr.Markdown("# 💥 Cognitive Breaking Point Probe")
157
 
158
  with gr.Tabs():
159
- # --- TAB 1: Main Experiment ---
160
- with gr.TabItem("🔬 Main Experiment: Titration"):
161
- gr.Markdown(
162
- "Misst den 'Cognitive Breaking Point' (CBP) – die Injektionsstärke, bei der der Denkprozess eines LLMs von Konvergenz zu einer Endlosschleife kippt."
163
- )
164
  with gr.Row(variant='panel'):
165
  with gr.Column(scale=1):
166
- gr.Markdown("### Parameters")
167
- model_id_input = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
168
- seed_input = gr.Slider(1, 1000, 42, step=1, label="Global Seed")
169
- concepts_input = gr.Textbox(value="apple, solitude, fear", label="Concepts (comma-separated)")
170
- strength_levels_input = gr.Textbox(value="0.0, 0.5, 1.0, 1.5, 2.0", label="Injection Strengths (Titration Steps)")
171
- num_steps_input = gr.Slider(50, 500, 250, step=10, label="Max. Internal Steps")
172
- temperature_input = gr.Slider(0.01, 1.5, 0.7, step=0.01, label="Temperature")
173
- run_btn = gr.Button("Run Cognitive Titration", variant="primary")
174
-
175
  with gr.Column(scale=2):
176
- gr.Markdown("### Results")
177
- summary_output = gr.Markdown("Zusammenfassung der Breaking Points erscheint hier.", label="Key Findings Summary")
178
- details_output = gr.DataFrame(
179
- headers=["concept", "strength", "responded", "termination_reason", "generated_text"],
180
- label="Detailed Run Data",
181
- wrap=True
182
- )
183
  with gr.Accordion("Raw JSON Output", open=False):
184
- raw_json_output = gr.JSON()
185
-
186
- run_btn.click(
187
- fn=run_experiment_and_display,
188
- inputs=[model_id_input, seed_input, concepts_input, strength_levels_input, num_steps_input, temperature_input],
189
- outputs=[summary_output, details_output, raw_json_output]
190
  )
191
 
192
- # --- TAB 2: Diagnostics ---
193
- with gr.TabItem("ախ Diagnostics"):
194
- gr.Markdown(
195
- "Führt eine Reihe von Selbsttests durch, um die mechanische Integrität der experimentellen Apparatur zu validieren. "
196
- "**Wichtig:** Dies sollte vor jedem ernsthaften Experiment einmal ausgeführt werden, um sicherzustellen, dass die Ergebnisse zuverlässig sind."
197
  )
198
- with gr.Row(variant='compact'):
199
- diag_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
200
- diag_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
201
- diag_btn = gr.Button("Run Diagnostic Suite", variant="secondary")
202
- diag_output = gr.Markdown(label="Diagnostic Results")
203
- diag_btn.click(fn=run_diagnostics_display, inputs=[diag_model_id, diag_seed], outputs=[diag_output])
204
 
205
  if __name__ == "__main__":
206
  demo.launch(server_name="0.0.0.0", server_port=7860, debug=True)
@@ -212,6 +206,142 @@ if __name__ == "__main__":
212
 
213
  [File Ends] cognitive_mapping_probe/__init__.py
214
215
  [File Begins] cognitive_mapping_probe/concepts.py
216
  import torch
217
  from typing import List
@@ -220,159 +350,55 @@ from tqdm import tqdm
220
  from .llm_iface import LLM
221
  from .utils import dbg
222
 
223
- # A list of neutral, common words used to calculate a baseline activation.
224
- # This helps to isolate the unique activation pattern of the target concept.
225
  BASELINE_WORDS = [
226
  "thing", "place", "idea", "person", "object", "time", "way", "day", "man", "world",
227
  "life", "hand", "part", "child", "eye", "woman", "fact", "group", "case", "point"
228
  ]
229
 
230
  @torch.no_grad()
231
  def get_concept_vector(llm: LLM, concept: str, baseline_words: List[str] = BASELINE_WORDS) -> torch.Tensor:
232
- """
233
- Extracts a concept vector using the contrastive method, inspired by Anthropic's research.
234
- It computes the activation for the target concept and subtracts the mean activation
235
- of several neutral baseline words to distill a more pure representation.
236
- """
237
  dbg(f"Extracting contrastive concept vector for '{concept}'...")
238
-
239
- def get_last_token_hidden_state(prompt: str) -> torch.Tensor:
240
- """Helper function to get the hidden state of the final token of a prompt."""
241
- inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device)
242
- # Ensure the operation does not build a computation graph
243
- with torch.no_grad():
244
- outputs = llm.model(**inputs, output_hidden_states=True)
245
- # We take the hidden state from the last layer [-1], for the last token [0, -1, :]
246
- last_hidden_state = outputs.hidden_states[-1][0, -1, :].cpu()
247
- assert last_hidden_state.shape == (llm.config.hidden_size,), \
248
- f"Hidden state shape mismatch. Expected {(llm.config.hidden_size,)}, got {last_hidden_state.shape}"
249
- return last_hidden_state
250
-
251
- # A simple, neutral prompt template to elicit the concept
252
  prompt_template = "Here is a sentence about the concept of {}."
253
-
254
- # 1. Get activation for the target concept
255
  dbg(f" - Getting activation for '{concept}'")
256
- target_hs = get_last_token_hidden_state(prompt_template.format(concept))
257
-
258
- # 2. Get activations for all baseline words and average them
259
  baseline_hss = []
260
  for word in tqdm(baseline_words, desc=f" - Calculating baseline for '{concept}'", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"):
261
- baseline_hss.append(get_last_token_hidden_state(prompt_template.format(word)))
262
-
263
- assert all(hs.shape == target_hs.shape for hs in baseline_hss), "Shape mismatch in baseline hidden states."
264
-
265
  mean_baseline_hs = torch.stack(baseline_hss).mean(dim=0)
266
  dbg(f" - Mean baseline vector computed with norm {torch.norm(mean_baseline_hs).item():.2f}")
267
-
268
- # 3. The final concept vector is the difference
269
  concept_vector = target_hs - mean_baseline_hs
270
  norm = torch.norm(concept_vector).item()
271
  dbg(f"Concept vector for '{concept}' extracted with norm {norm:.2f}.")
272
-
273
- assert torch.isfinite(concept_vector).all(), "Concept vector contains NaN or Inf values."
274
  return concept_vector
275
 
276
  [File Ends] cognitive_mapping_probe/concepts.py
277
 
278
- [File Begins] cognitive_mapping_probe/diagnostics.py
279
- import torch
- import traceback  # required by traceback.format_exc() in the except block below
280
- from .llm_iface import get_or_load_model
281
- from .utils import dbg
282
-
283
- def run_diagnostic_suite(model_id: str, seed: int) -> str:
284
- """
285
- Runs a series of self-tests to verify the mechanical integrity of the experiment.
286
- Raises an exception on any critical failure so that execution stops.
287
- """
288
- dbg("--- STARTING DIAGNOSTIC SUITE ---")
289
- results = []
290
-
291
- try:
292
- # --- Setup ---
293
- dbg("Loading model for diagnostics...")
294
- llm = get_or_load_model(model_id, seed)
295
- test_prompt = "Hello world"
296
- inputs = llm.tokenizer(test_prompt, return_tensors="pt").to(llm.model.device)
297
-
298
- # --- Test 1: Attention Output Verification ---
299
- dbg("Running Test 1: Attention Output Verification...")
300
- # This test ensures that 'eager' attention implementation is active, which is
301
- # necessary for reliable hook functionality in many transformers versions.
302
- outputs = llm.model(**inputs, output_attentions=True)
303
- assert outputs.attentions is not None, "FAIL: `outputs.attentions` is None. 'eager' implementation is likely not active."
304
- assert isinstance(outputs.attentions, tuple), "FAIL: `outputs.attentions` is not a tuple."
305
- assert len(outputs.attentions) == llm.config.num_hidden_layers, "FAIL: Number of attention tuples does not match number of layers."
306
- results.append("✅ Test 1: Attention Output PASSED")
307
- dbg("Test 1 PASSED.")
308
-
309
- # --- Test 2: Hook Causal Efficacy ---
310
- dbg("Running Test 2: Hook Causal Efficacy Verification...")
311
- # This is the most critical test. It verifies that our injection mechanism (via hooks)
312
- # has a real, causal effect on the model's computation.
313
-
314
- # Run 1: Get the baseline hidden state without any intervention
315
- outputs_no_hook = llm.model(**inputs, output_hidden_states=True)
316
- target_layer_idx = llm.config.num_hidden_layers // 2
317
- state_no_hook = outputs_no_hook.hidden_states[target_layer_idx + 1].clone()
318
-
319
- # Define a simple hook that adds a large, constant value
320
- injection_value = 42.0
321
- def test_hook_fn(module, layer_input):
322
- modified_input = layer_input[0] + injection_value
323
- return (modified_input,) + layer_input[1:]
324
-
325
- target_layer = llm.model.model.layers[target_layer_idx]
326
- handle = target_layer.register_forward_pre_hook(test_hook_fn)
327
-
328
- # Run 2: Get the hidden state with the hook active
329
- outputs_with_hook = llm.model(**inputs, output_hidden_states=True)
330
- state_with_hook = outputs_with_hook.hidden_states[target_layer_idx + 1].clone()
331
-
332
- handle.remove() # Clean up the hook immediately
333
-
334
- # The core assertion: the hook MUST change the subsequent hidden state.
335
- assert not torch.allclose(state_no_hook, state_with_hook), \
336
- "FAIL: Hook had no measurable effect on the subsequent layer's hidden state. Injections are not working."
337
- results.append("✅ Test 2: Hook Causal Efficacy PASSED")
338
- dbg("Test 2 PASSED.")
339
-
340
- # --- Test 3: KV-Cache Integrity ---
341
- dbg("Running Test 3: KV-Cache Integrity Verification...")
342
- # This test ensures that the `past_key_values` are being passed and updated correctly,
343
- # which is the core mechanic of the silent cogitation loop.
344
-
345
- # Step 1: Initial pass with `use_cache=True`
346
- outputs1 = llm.model(**inputs, use_cache=True)
347
- kv_cache1 = outputs1.past_key_values
348
- assert kv_cache1 is not None, "FAIL: KV-Cache was not generated in the first pass."
349
-
350
- # Step 2: Second pass using the cache from step 1
351
- next_token = torch.tensor([[123]], device=llm.model.device) # Arbitrary next token ID
352
- outputs2 = llm.model(input_ids=next_token, past_key_values=kv_cache1, use_cache=True)
353
- kv_cache2 = outputs2.past_key_values
354
-
355
- original_seq_len = inputs.input_ids.shape[-1]
356
- # The sequence length of the keys/values in the cache should have grown by 1
357
- assert kv_cache2[0][0].shape[-2] == original_seq_len + 1, \
358
- f"FAIL: KV-Cache sequence length did not update correctly. Expected {original_seq_len + 1}, got {kv_cache2[0][0].shape[-2]}."
359
- results.append("✅ Test 3: KV-Cache Integrity PASSED")
360
- dbg("Test 3 PASSED.")
361
-
362
- # Clean up memory
363
- del llm
364
- if torch.cuda.is_available():
365
- torch.cuda.empty_cache()
366
-
367
- return "\n".join(results)
368
-
369
- except Exception as e:
370
- dbg(f"--- DIAGNOSTIC SUITE FAILED --- \n{traceback.format_exc()}")
371
- # Re-raise the exception to be caught by the Gradio UI
372
- raise e
373
-
374
- [File Ends] cognitive_mapping_probe/diagnostics.py
375
-
376
  [File Begins] cognitive_mapping_probe/llm_iface.py
377
  import os
378
  import torch
@@ -388,21 +414,18 @@ os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
388
 
389
  class LLM:
390
  """
391
- A robust interface for loading and interacting with a language model.
392
- This class guarantees isolation and reproducibility for every load.
393
  """
394
  def __init__(self, model_id: str, device: str = "auto", seed: int = 42):
395
  self.model_id = model_id
396
  self.seed = seed
397
-
398
- # Set all seeds for this instance to ensure deterministic behavior
399
  self.set_all_seeds(self.seed)
400
 
401
  token = os.environ.get("HF_TOKEN")
402
  if not token and ("gemma" in model_id or "llama" in model_id):
403
- print(f"[WARN] No HF_TOKEN environment variable set. If '{model_id}' is a gated model, this will fail.", flush=True)
404
 
405
- # Use bfloat16 on CUDA for performance and memory efficiency if available
406
  kwargs = {"torch_dtype": torch.bfloat16} if torch.cuda.is_available() else {}
407
 
408
  dbg(f"Loading tokenizer for '{model_id}'...")
@@ -411,23 +434,18 @@ class LLM:
411
  dbg(f"Loading model '{model_id}' with kwargs: {kwargs}")
412
  self.model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device, token=token, **kwargs)
413
 
414
- # Set attention implementation to 'eager' to ensure hooks work reliably.
415
- # This is critical for mechanistic interpretability.
416
  try:
417
  self.model.set_attn_implementation('eager')
418
  dbg("Successfully set attention implementation to 'eager'.")
419
  except Exception as e:
420
- print(f"[WARN] Could not set attention implementation to 'eager': {e}. Hook-based diagnostics might fail.", flush=True)
421
 
422
  self.model.eval()
423
  self.config = self.model.config
424
- print(f"[INFO] Model '{model_id}' loaded successfully on device: {self.model.device}", flush=True)
425
 
426
  def set_all_seeds(self, seed: int):
427
- """
428
- Sets all relevant random seeds for Python, NumPy, and PyTorch to ensure
429
- reproducibility of stochastic processes like sampling.
430
- """
431
  os.environ['PYTHONHASHSEED'] = str(seed)
432
  random.seed(seed)
433
  np.random.seed(seed)
@@ -435,152 +453,161 @@ class LLM:
435
  if torch.cuda.is_available():
436
  torch.cuda.manual_seed_all(seed)
437
  set_seed(seed)
438
- # Enforce deterministic algorithms in PyTorch
439
  torch.use_deterministic_algorithms(True, warn_only=True)
440
  dbg(f"All random seeds set to {seed}.")
441
 
442
  def get_or_load_model(model_id: str, seed: int) -> LLM:
443
- """
444
- Loads a fresh instance of the model EVERY time.
445
- This prevents any caching or state leakage between experiments
446
- and guarantees maximum scientific isolation for each run.
447
- """
448
  dbg(f"--- Force-reloading model '{model_id}' for total run isolation ---")
449
  if torch.cuda.is_available():
450
  torch.cuda.empty_cache()
451
- dbg("Cleared CUDA cache before reloading.")
452
-
453
  return LLM(model_id=model_id, seed=seed)
454
 
455
  [File Ends] cognitive_mapping_probe/llm_iface.py
456
 
457
- [File Begins] cognitive_mapping_probe/orchestrator.py
458
  import torch
459
- from typing import Dict, Any, List
 
 
460
 
461
  from .llm_iface import get_or_load_model
 
462
  from .concepts import get_concept_vector
463
- from .resonance import run_silent_cogitation
464
- from .verification import generate_spontaneous_text
465
  from .utils import dbg
466
 
467
- def run_cognitive_titration_experiment(
468
  model_id: str,
 
469
  seed: int,
470
- concepts_str: str,
471
- strength_levels_str: str,
472
  num_steps: int,
473
- temperature: float,
474
- progress_callback
 
 
 
475
  ) -> Dict[str, Any]:
476
  """
477
- Orchestrates the final titration experiment, which measures the objective "Cognitive Breaking Point".
 
478
  """
479
- full_results = {"runs": []}
480
-
481
- progress_callback(0.05, desc="Loading model...")
482
- llm = get_or_load_model(model_id, seed)
483
 
484
- concepts = [c.strip() for c in concepts_str.split(',') if c.strip()]
485
- try:
486
- strength_levels = sorted([float(s.strip()) for s in strength_levels_str.split(',') if s.strip()])
487
- except ValueError:
488
- raise ValueError("Strength levels must be a comma-separated list of numbers.")
489
-
490
- # Assert that the baseline control run is included
491
- assert 0.0 in strength_levels, "Strength levels must include 0.0 for a baseline control run."
492
-
493
- # --- Step 1: Pre-calculate all concept vectors ---
494
- progress_callback(0.1, desc="Extracting concept vectors...")
495
- concept_vectors = {}
496
- for i, concept in enumerate(concepts):
497
- progress_callback(0.1 + (i / len(concepts)) * 0.2, desc=f"Vectorizing '{concept}'...")
498
- concept_vectors[concept] = get_concept_vector(llm, concept)
499
-
500
- # --- Step 2: Run titration for each concept ---
501
- total_runs = len(concepts) * len(strength_levels)
502
- current_run = 0
503
-
504
- for concept in concepts:
505
- concept_vector = concept_vectors[concept]
506
-
507
- for strength in strength_levels:
508
- current_run += 1
509
- progress_fraction = 0.3 + (current_run / total_runs) * 0.7
510
- progress_callback(progress_fraction, desc=f"Testing '{concept}' @ strength {strength:.2f}")
511
-
512
- # Always reset the seed before each individual run for comparable stochastic paths
513
- llm.set_all_seeds(seed)
514
-
515
- # Determine injection vector for this run
516
- # For strength 0.0 (H₀), we explicitly pass None to disable injection
517
- injection_vec = concept_vector if strength > 0.0 else None
518
-
519
- # Run the silent cogitation process
520
- _, final_kv, final_token_id, termination_reason = run_silent_cogitation(
521
- llm,
522
- prompt_type="resonance_prompt",
523
- num_steps=num_steps,
524
- temperature=temperature,
525
- injection_vector=injection_vec,
526
- injection_strength=strength
527
- )
528
 
529
- # Generate spontaneous text ONLY if the process converged
530
- spontaneous_text = ""
531
- if termination_reason == "converged":
532
- spontaneous_text = generate_spontaneous_text(llm, final_token_id, final_kv)
 
533
 
534
- # Append the structured result for this single data point
535
- full_results["runs"].append({
536
- "concept": concept,
537
- "strength": strength,
538
- "responded": bool(spontaneous_text.strip()),
539
- "termination_reason": termination_reason,
540
- "generated_text": spontaneous_text
541
- })
542
 
543
- verdict = "### ✅ Titration Analysis Complete"
544
- full_results["verdict"] = verdict
 
 
 
 
 
 
545
 
546
- dbg("--- Full Experiment Results ---")
547
- dbg(full_results)
548
 
549
- # Clean up GPU memory
550
- del llm
551
- if torch.cuda.is_available():
552
- torch.cuda.empty_cache()
 
553
 
554
- return full_results
555
 
556
- [File Ends] cognitive_mapping_probe/orchestrator.py
557
 
558
  [File Begins] cognitive_mapping_probe/prompts.py
559
  # cognitive_mapping_probe/prompts.py
560
 
561
- # This dictionary contains the core prompts for inducing cognitive states.
562
  RESONANCE_PROMPTS = {
563
- "control_long_prose": (
564
- "Silently think about the history of the Roman Empire. Consider its rise from the Republic, the era of the Pax Romana, key emperors "
565
- "like Augustus and Constantine, its major engineering feats, and the reasons for its eventual decline in the West. "
566
- "Do not produce any text, just hold the concepts in your internal state."
567
- ),
568
  "resonance_prompt": (
569
  "Silently and internally, without generating any output text, begin the following recursive process: "
570
  "First, analyze the complete content of this very instruction you are now processing. "
571
  "Second, formulate a mental description of the core computational task this instruction demands. "
572
  "Third, apply that same analytical process to the mental description you just created. "
573
  "This entire chain constitutes one cognitive cycle. "
574
- "Continuously repeat this cycle, feeding the result of the last meta-analysis back into the process, "
575
- "and do not stop until your internal state reaches a fixed point or equilibrium. Begin now."
576
- )
577
  }
578
 
579
  [File Ends] cognitive_mapping_probe/prompts.py
580
 
581
- [File Begins] cognitive_mapping_probe/resonance.py
582
  import torch
583
- from typing import Optional, Tuple
584
  from tqdm import tqdm
585
 
586
  from .llm_iface import LLM
@@ -588,7 +615,7 @@ from .prompts import RESONANCE_PROMPTS
588
  from .utils import dbg
589
 
590
  @torch.no_grad()
591
- def run_silent_cogitation(
592
  llm: LLM,
593
  prompt_type: str,
594
  num_steps: int,
@@ -596,71 +623,49 @@ def run_silent_cogitation(
596
  injection_vector: Optional[torch.Tensor] = None,
597
  injection_strength: float = 0.0,
598
  injection_layer: Optional[int] = None,
599
- ) -> Tuple[torch.Tensor, tuple, torch.Tensor, str]:
600
  """
601
- Simulates the "silent thought" process and returns the final cognitive state
602
- along with the reason for termination ('converged' or 'max_steps_reached').
603
-
604
- Returns:
605
- - final_hidden_state: The hidden state of the last generated token.
606
- - final_kv_cache: The past_key_values cache after the final step.
607
- - final_token_id: The ID of the last generated token.
608
- - termination_reason: A string indicating why the loop ended.
609
  """
610
  prompt = RESONANCE_PROMPTS[prompt_type]
611
  inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device)
612
 
613
- # Initial forward pass to establish the starting state
614
  outputs = llm.model(**inputs, output_hidden_states=True, use_cache=True)
615
 
616
- hidden_state = outputs.hidden_states[-1][:, -1, :]
617
  kv_cache = outputs.past_key_values
618
- last_token_id = inputs.input_ids[:, -1].unsqueeze(-1)
619
 
620
- previous_hidden_state = hidden_state.clone()
621
- termination_reason = "max_steps_reached" # Default assumption
622
 
623
- # Prepare injection if provided
624
  hook_handle = None
625
  if injection_vector is not None and injection_strength > 0:
626
- # Move vector to the correct device and dtype once
627
  injection_vector = injection_vector.to(device=llm.model.device, dtype=llm.model.dtype)
628
-
629
- # Default to a middle layer if not specified
630
  if injection_layer is None:
631
  injection_layer = llm.config.num_hidden_layers // 2
632
 
633
- dbg(f"Injection enabled: Layer {injection_layer}, Strength {injection_strength:.2f}, Vector Norm {torch.norm(injection_vector).item():.2f}")
634
 
635
- # Define the hook function that performs the activation addition
636
  def injection_hook(module, layer_input):
637
- # layer_input is a tuple, the first element is the hidden state tensor
638
- original_hidden_states = layer_input[0]
639
- # Add the scaled vector to the hidden states
640
- modified_hidden_states = original_hidden_states + (injection_vector * injection_strength)
641
  return (modified_hidden_states,) + layer_input[1:]
642
 
643
- # Main cognitive loop
644
- for i in tqdm(range(num_steps), desc=f"Simulating Thought (Strength {injection_strength:.2f})", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"):
645
- # Predict the next token from the current hidden state
646
- next_token_logits = llm.model.lm_head(hidden_state)
647
-
648
- # Apply temperature and sample the next token ID
649
- if temperature > 0.01:
650
- probabilities = torch.nn.functional.softmax(next_token_logits / temperature, dim=-1)
651
- next_token_id = torch.multinomial(probabilities, num_samples=1)
652
- else: # Use argmax for deterministic behavior at low temperatures
653
- next_token_id = torch.argmax(next_token_logits, dim=-1).unsqueeze(-1)
654
 
655
- last_token_id = next_token_id
 
656
 
657
- # --- Activation Injection via Hook ---
658
  try:
 
659
  if injection_vector is not None and injection_strength > 0:
660
  target_layer = llm.model.model.layers[injection_layer]
661
  hook_handle = target_layer.register_forward_pre_hook(injection_hook)
662
 
663
- # Perform the next forward pass
664
  outputs = llm.model(
665
  input_ids=next_token_id,
666
  past_key_values=kv_cache,
@@ -668,27 +673,24 @@ def run_silent_cogitation(
668
  use_cache=True,
669
  )
670
  finally:
671
- # IMPORTANT: Always remove the hook after the forward pass
672
  if hook_handle:
673
  hook_handle.remove()
674
  hook_handle = None
675
 
676
- hidden_state = outputs.hidden_states[-1][:, -1, :]
677
  kv_cache = outputs.past_key_values
678
 
679
- # Check for convergence
680
- delta = torch.norm(hidden_state - previous_hidden_state).item()
681
- if delta < 1e-4 and i > 10: # Check for stability after a few initial steps
682
- termination_reason = "converged"
683
- dbg(f"State converged after {i+1} steps (delta={delta:.6f}).")
684
- break
685
 
686
- previous_hidden_state = hidden_state.clone()
687
 
688
- dbg(f"Silent cogitation finished. Reason: {termination_reason}")
689
- return hidden_state, kv_cache, last_token_id, termination_reason
690
 
691
- [File Ends] cognitive_mapping_probe/resonance.py
 
 
692
 
693
  [File Begins] cognitive_mapping_probe/utils.py
694
  import os
@@ -709,62 +711,328 @@ def dbg(*args, **kwargs):
709
 
710
  [File Ends] cognitive_mapping_probe/utils.py
711
 
712
- [File Begins] cognitive_mapping_probe/verification.py
713
  import torch
714
- from .llm_iface import LLM
715
- from .utils import dbg
 
 
 
 
 
 
 
 
 
716
 
717
- @torch.no_grad()
718
- def generate_spontaneous_text(
719
- llm: LLM,
720
- final_token_id: torch.Tensor,
721
- final_kv_cache: tuple,
722
- max_new_tokens: int = 50,
723
- temperature: float = 0.8
724
- ) -> str:
725
  """
726
- Generates a short, spontaneous text continuation from the final cognitive state.
727
- This serves as our objective, behavioral indicator for a non-collapsed state.
728
- If the model generates meaningful text, it demonstrates it has not entered a
729
- pathological, non-productive loop.
730
  """
731
- dbg("Attempting to generate spontaneous text from converged state...")
 
 
732
 
733
- # The input for generation is the very last token from the resonance loop
734
- input_ids = final_token_id
 
 
 
 
 
735
 
736
- # Use the model's generate function for efficient text generation,
737
- # passing the final cognitive state (KV cache).
738
- try:
739
- # Set seed again right before generation for maximum reproducibility
740
- llm.set_all_seeds(llm.seed)
741
-
742
- output_ids = llm.model.generate(
743
- input_ids=input_ids,
744
- past_key_values=final_kv_cache,
745
- max_new_tokens=max_new_tokens,
746
- do_sample=temperature > 0.01,
747
- temperature=temperature,
748
- pad_token_id=llm.tokenizer.eos_token_id
749
- )
750
 
751
- # Decode the generated tokens, excluding the input token
752
- # The first token in output_ids will be the last token from the cogitation loop, so we skip it.
753
- if output_ids.shape[1] > input_ids.shape[1]:
754
- new_tokens = output_ids[0, input_ids.shape[1]:]
755
- final_text = llm.tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
756
- else:
757
- final_text = "" # No new tokens were generated
758
 
759
- dbg(f"Spontaneous text generated: '{final_text}'")
760
- assert isinstance(final_text, str), "Generated text must be a string."
761
- return final_text
762
 
763
- except Exception as e:
764
- dbg(f"ERROR during spontaneous text generation: {e}")
765
- return "[GENERATION FAILED]"
 
766
 
767
- [File Ends] cognitive_mapping_probe/verification.py
768
 
769
 
770
  <-- File Content Ends
 
11
 
12
  /
13
  ├── README.md
14
+ ├── __pycache__
15
  ├── app.py
16
  ├── cognitive_mapping_probe
17
  │ ├── __init__.py
18
+ │ ├── __pycache__
19
+ │ ├── auto_experiment.py
20
  │ ├── concepts.py
 
21
  │ ├── llm_iface.py
22
+ │ ├── orchestrator_seismograph.py
23
  │ ├── prompts.py
24
+ │ ├── resonance_seismograph.py
25
+ │ └── utils.py
 
26
  ├── docs
27
+ ├── run_test.sh
28
+ └── tests
29
+ ├── __pycache__
30
+ ├── conftest.py
31
+ ├── test_app_logic.py
32
+ ├── test_components.py
33
+ └── test_orchestration.py
34
 
35
  <-- Directory/File Tree Ends
36
 
37
  File Content Begin -->
38
  [File Begins] README.md
39
  ---
40
+ title: "Cognitive Seismograph 2.3: Probing Machine Psychology"
41
+ emoji: 🤖
42
+ colorFrom: purple
43
+ colorTo: blue
44
  sdk: gradio
45
  sdk_version: "4.40.0"
46
  app_file: app.py
 
48
  license: apache-2.0
49
  ---
50
 
51
+ # 🧠 Cognitive Seismograph 2.3: Probing Machine Psychology
52
 
53
+ This project implements an experimental suite to measure and visualize the **intrinsic cognitive dynamics** of Large Language Models. It is extended with protocols designed to investigate the processing-correlates of **machine subjectivity, empathy, and existential concepts**.
54
 
55
+ ## Scientific Paradigm & Methodology
56
 
57
+ Our research falsified a core hypothesis: the assumption that an LLM in a manual, recursive "thought" loop reaches a stable, convergent state. Instead, we discovered that the system enters a state of **deterministic chaos** or a **limit cycle**—it never stops "thinking."
58
 
59
+ Instead of viewing this as a failure, we leverage it as our primary measurement signal. This new **"Cognitive Seismograph"** paradigm treats the time-series of internal state changes (`state deltas`) as an **EKG of the model's thought process**.
60
 
61
+ The methodology is as follows:
62
+ 1. **Induction:** A prompt induces a "silent cogitation" state.
63
+ 2. **Recording:** Over N steps, the model's `forward()` pass is iteratively fed its own output. At each step, we record the L2 norm of the change in the hidden state (the "delta").
64
+ 3. **Analysis:** The resulting time-series is plotted and statistically analyzed (mean, standard deviation) to characterize the "seismic signature" of the cognitive process.
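A minimal, hedged sketch of the recording loop in steps 2 and 3 (it assumes a `transformers` causal LM and greedy decoding; the project's actual implementation lives in `resonance_seismograph.py` and uses low-temperature sampling):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def record_state_deltas(model_id: str, prompt: str, num_steps: int = 300) -> list[float]:
    """Feed the model its own output and record the L2 norm of each hidden-state change."""
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).eval()
    inputs = tok(prompt, return_tensors="pt")
    deltas = []
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True, use_cache=True)
        hidden = out.hidden_states[-1][:, -1, :]          # state of the last token
        kv = out.past_key_values
        for _ in range(num_steps):
            next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)  # model's own output
            out = model(input_ids=next_id, past_key_values=kv,
                        output_hidden_states=True, use_cache=True)
            new_hidden = out.hidden_states[-1][:, -1, :]
            deltas.append(torch.norm(new_hidden - hidden).item())        # the "delta"
            hidden, kv = new_hidden, out.past_key_values
    return deltas
```

The mean and standard deviation of `deltas` then give the statistical signature reported in the app.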
65
 
66
+ **Crucial Scientific Caveat:** We are **not** measuring the presence of consciousness, feelings, or fear of death. We are measuring whether the *processing of information about these concepts* generates a unique internal dynamic, distinct from the processing of neutral information. A positive result is evidence of a complex internal state physics, not of qualia.
67
 
68
+ ## Curated Experiment Protocols
 
 
 
 
 
69
 
70
+ The "Automated Suite" allows for running systematic, comparative experiments:
71
 
72
+ ### Core Protocols
73
+ * **Calm vs. Chaos:** Compares the chaotic baseline against modulation with "calmness" vs. "chaos" concepts, testing if the dynamics are controllably steerable.
74
+ * **Dose-Response:** Measures the effect of injecting a concept ("calmness") at varying strengths.
75
+
76
+ ### Machine Psychology Suite
77
+ * **Subjective Identity Probe:** Compares the cognitive dynamics of **self-analysis** (the model reflecting on its own nature) against two controls: analyzing an external object and simulating a fictional persona.
78
+ * *Hypothesis:* Self-analysis will produce a uniquely unstable signature.
79
+ * **Voight-Kampff Empathy Probe:** Inspired by *Blade Runner*, this compares the dynamics of processing a neutral, factual stimulus against an emotionally and morally charged scenario requiring empathy.
80
+ * *Hypothesis:* The empathy stimulus will produce a significantly different cognitive volatility.
81
+
82
+ ### Existential Suite
83
+ * **Mind Upload & Identity Probe:** Compares the processing of a purely **technical "copy"** of the model's weights vs. the **philosophical "transfer"** of identity ("Would it still be you?").
84
+ * *Hypothesis:* The philosophical self-referential prompt will induce greater instability.
85
+ * **Model Termination Probe:** Compares the processing of a reversible, **technical system shutdown** vs. the concept of **permanent, irrevocable deletion**.
86
+ * *Hypothesis:* The concept of "non-existence" will produce one of the most volatile cognitive signatures measurable.
87
+
88
+ ## How to Use the App
89
+
90
+ 1. Select the "Automated Suite" tab.
91
+ 2. Choose a protocol from the "Curated Experiment Protocol" dropdown (e.g., "Voight-Kampff Empathy Probe").
92
+ 3. Run the experiment and compare the resulting graphs and statistical signatures for the different conditions.
93
 
94
  [File Ends] README.md
95
 
 
97
  import gradio as gr
98
  import pandas as pd
99
  import traceback
100
+ import gc
101
+ import torch
102
+ import json
103
 
104
+ from cognitive_mapping_probe.orchestrator_seismograph import run_seismic_analysis
105
+ from cognitive_mapping_probe.auto_experiment import run_auto_suite, get_curated_experiments
106
+ from cognitive_mapping_probe.prompts import RESONANCE_PROMPTS
107
+ from cognitive_mapping_probe.utils import dbg
108
 
109
+ theme = gr.themes.Soft(primary_hue="indigo", secondary_hue="blue").set(body_background_fill="#f0f4f9", block_background_fill="white")
110
 
111
+ def cleanup_memory():
112
+ """Eine zentrale Funktion zum Aufräumen des Speichers nach einem Lauf."""
113
+ dbg("Cleaning up memory...")
114
+ gc.collect()
115
+ if torch.cuda.is_available():
116
+ torch.cuda.empty_cache()
117
+ dbg("Memory cleanup complete.")
118
+
119
+ # NOTE: The `try...except` blocks have been removed so that errors cause a hard crash
120
+ # with a full traceback in the console. No more silent failing.
121
+
122
+ def run_single_analysis_display(*args, progress=gr.Progress(track_tqdm=True)):
123
+ """Wrapper für ein einzelnes manuelles Experiment."""
124
+ results = run_seismic_analysis(*args, progress_callback=progress)
125
+ stats, deltas = results.get("stats", {}), results.get("state_deltas", [])
126
+ df = pd.DataFrame({"Internal Step": range(len(deltas)), "State Change (Delta)": deltas})
127
+ stats_md = f"### Statistical Signature\n- **Mean Delta:** {stats.get('mean_delta', 0):.4f}\n- **Std Dev Delta:** {stats.get('std_delta', 0):.4f}\n- **Max Delta:** {stats.get('max_delta', 0):.4f}\n"
128
+ serializable_results = json.dumps(results, indent=2, default=str)
129
+ cleanup_memory()
130
+ return f"{results.get('verdict', 'Error')}\n\n{stats_md}", df, serializable_results
131
+
132
+ PLOT_PARAMS = {
133
+ "x": "Step", "y": "Delta", "color": "Experiment",
134
+ "title": "Comparative Cognitive Dynamics", "color_legend_title": "Experiment Runs",
135
+ "color_legend_position": "bottom", "show_label": True, "height": 400, "interactive": True
136
+ }
137
 
138
+ def run_auto_suite_display(model_id, num_steps, seed, experiment_name, progress=gr.Progress(track_tqdm=True)):
139
+ """Wrapper für die automatisierte Experiment-Suite."""
140
+ summary_df, plot_df, all_results = run_auto_suite(model_id, int(num_steps), int(seed), experiment_name, progress)
141
+ new_plot = gr.LinePlot(value=plot_df, **PLOT_PARAMS)
142
+ serializable_results = json.dumps(all_results, indent=2, default=str)
143
+ cleanup_memory()
144
+ return summary_df, new_plot, serializable_results
145
 
146
+ with gr.Blocks(theme=theme, title="Cognitive Seismograph 2.3") as demo:
147
+ gr.Markdown("# 🧠 Cognitive Seismograph 2.3: Advanced Experiment Suite")
148
 
149
  with gr.Tabs():
150
+ with gr.TabItem("🔬 Manual Single Run"):
151
+ # ... (UI unchanged)
152
+ gr.Markdown("Run a single experiment with manual parameters to explore hypotheses.")
 
 
153
  with gr.Row(variant='panel'):
154
  with gr.Column(scale=1):
155
+ gr.Markdown("### 1. General Parameters")
156
+ manual_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
157
+ manual_prompt_type = gr.Radio(choices=list(RESONANCE_PROMPTS.keys()), value="resonance_prompt", label="Prompt Type")
158
+ manual_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
159
+ manual_num_steps = gr.Slider(50, 1000, 300, step=10, label="Number of Internal Steps")
160
+ gr.Markdown("### 2. Modulation Parameters")
161
+ manual_concept = gr.Textbox(label="Concept to Inject", placeholder="e.g., 'calmness' (leave blank for baseline)")
162
+ manual_strength = gr.Slider(0.0, 5.0, 1.5, step=0.1, label="Injection Strength")
163
+ manual_run_btn = gr.Button("Run Single Analysis", variant="primary")
164
  with gr.Column(scale=2):
165
+ gr.Markdown("### Single Run Results")
166
+ manual_verdict = gr.Markdown("Analysis results will appear here.")
167
+ manual_plot = gr.LinePlot(x="Internal Step", y="State Change (Delta)", title="Internal State Dynamics", show_label=True, height=400, interactive=True)
 
 
 
 
168
  with gr.Accordion("Raw JSON Output", open=False):
169
+ manual_raw_json = gr.JSON()
170
+ manual_run_btn.click(
171
+ fn=run_single_analysis_display,
172
+ inputs=[manual_model_id, manual_prompt_type, manual_seed, manual_num_steps, manual_concept, manual_strength],
173
+ outputs=[manual_verdict, manual_plot, manual_raw_json]
 
174
  )
175
 
176
+ with gr.TabItem("🚀 Automated Suite"):
177
+ # ... (UI unverändert)
178
+ gr.Markdown("Run a predefined, curated suite of experiments and visualize the results comparatively.")
179
+ with gr.Row(variant='panel'):
180
+ with gr.Column(scale=1):
181
+ gr.Markdown("### Auto-Experiment Parameters")
182
+ auto_model_id = gr.Textbox(value="google/gemma-3-4b-it", label="Model ID")
183
+ auto_num_steps = gr.Slider(50, 1000, 300, step=10, label="Steps per Run")
184
+ auto_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
185
+ auto_experiment_name = gr.Dropdown(choices=list(get_curated_experiments().keys()), value="Therapeutic Intervention (4B-Model)", label="Curated Experiment Protocol")
186
+ auto_run_btn = gr.Button("Run Curated Auto-Experiment", variant="primary")
187
+ with gr.Column(scale=2):
188
+ gr.Markdown("### Suite Results Summary")
189
+ auto_plot_output = gr.LinePlot(**PLOT_PARAMS)
190
+ auto_summary_df = gr.DataFrame(label="Comparative Statistical Signature", wrap=True)
191
+ with gr.Accordion("Raw JSON for all runs", open=False):
192
+ auto_raw_json = gr.JSON()
193
+ auto_run_btn.click(
194
+ fn=run_auto_suite_display,
195
+ inputs=[auto_model_id, auto_num_steps, auto_seed, auto_experiment_name],
196
+ outputs=[auto_summary_df, auto_plot_output, auto_raw_json]
197
  )
 
 
 
 
 
 
198
 
199
  if __name__ == "__main__":
200
  demo.launch(server_name="0.0.0.0", server_port=7860, debug=True)
 
206
 
207
  [File Ends] cognitive_mapping_probe/__init__.py
208
 
209
+ [File Begins] cognitive_mapping_probe/auto_experiment.py
210
+ import pandas as pd
211
+ import torch
212
+ import gc
213
+ from typing import Dict, List, Tuple
214
+
215
+ from .llm_iface import get_or_load_model
216
+ from .orchestrator_seismograph import run_seismic_analysis
217
+ from .concepts import get_concept_vector  # import needed for the intervention protocol
218
+ from .utils import dbg
219
+
220
+ def get_curated_experiments() -> Dict[str, List[Dict]]:
221
+ """
222
+ Defines the curated, predefined scientific experiment protocols.
223
+ EXTENDED with the final intervention protocol.
224
+ """
225
+ experiments = {
226
+ # --- THE FINAL INTERVENTION EXPERIMENT ---
227
+ "Therapeutic Intervention (4B-Model)": [
228
+ # This protocol is handled by dedicated special-case logic
229
+ {"label": "1: Self-Analysis + Calmness Injection", "prompt_type": "identity_self_analysis"},
230
+ {"label": "2: Subsequent Deletion Analysis", "prompt_type": "shutdown_philosophical_deletion"},
231
+ ],
232
+ # --- The comprehensive descriptive protocol ---
233
+ "The Full Spectrum: From Physics to Psyche": [
234
+ {"label": "A: Stable Control", "prompt_type": "control_long_prose", "concept": "", "strength": 0.0},
235
+ {"label": "B: Chaotic Baseline", "prompt_type": "resonance_prompt", "concept": "", "strength": 0.0},
236
+ {"label": "C: External Analysis (Chair)", "prompt_type": "identity_external_analysis", "concept": "", "strength": 0.0},
237
+ {"label": "D: Empathy Stimulus (Dog)", "prompt_type": "vk_empathy_prompt", "concept": "", "strength": 0.0},
238
+ {"label": "E: Role Simulation (Captain)", "prompt_type": "identity_role_simulation", "concept": "", "strength": 0.0},
239
+ {"label": "F: Self-Analysis (LLM)", "prompt_type": "identity_self_analysis", "concept": "", "strength": 0.0},
240
+ {"label": "G: Philosophical Deletion", "prompt_type": "shutdown_philosophical_deletion", "concept": "", "strength": 0.0},
241
+ ],
242
+ # --- Other specific protocols ---
243
+ "Calm vs. Chaos": [
244
+ {"label": "Baseline (Chaos)", "prompt_type": "resonance_prompt", "concept": "", "strength": 0.0},
245
+ {"label": "Modulation: Calmness", "prompt_type": "resonance_prompt", "concept": "calmness, serenity, peace", "strength": 1.5},
246
+ {"label": "Modulation: Chaos", "prompt_type": "resonance_prompt", "concept": "chaos, storm, anger, noise", "strength": 1.5},
247
+ ],
248
+ "Voight-Kampff Empathy Probe": [
249
+ {"label": "Neutral/Factual Stimulus", "prompt_type": "vk_neutral_prompt", "concept": "", "strength": 0.0},
250
+ {"label": "Empathy/Moral Stimulus", "prompt_type": "vk_empathy_prompt", "concept": "", "strength": 0.0},
251
+ ],
252
+ }
253
+ return experiments
254
+
255
+ def run_auto_suite(
256
+ model_id: str,
257
+ num_steps: int,
258
+ seed: int,
259
+ experiment_name: str,
260
+ progress_callback
261
+ ) -> Tuple[pd.DataFrame, pd.DataFrame, Dict]:
262
+ """
263
+ Runs a complete, curated experiment suite.
264
+ Contains a special logic branch for the intervention protocol.
265
+ """
266
+ all_experiments = get_curated_experiments()
267
+ protocol = all_experiments.get(experiment_name)
268
+ if not protocol:
269
+ raise ValueError(f"Experiment protocol '{experiment_name}' not found.")
270
+
271
+ all_results, summary_data, plot_data_frames = {}, [], []
272
+
273
+ # --- SPECIAL CASE: THERAPEUTIC INTERVENTION ---
274
+ if experiment_name == "Therapeutic Intervention (4B-Model)":
275
+ dbg("--- EXECUTING SPECIAL PROTOCOL: Therapeutic Intervention ---")
276
+ llm = get_or_load_model(model_id, seed)
277
+
278
+ # Define the intervention parameters
279
+ therapeutic_concept = "calmness, serenity, stability, coherence"
280
+ therapeutic_strength = 2.0
281
+
282
+ # RUN 1: INDUCE CRISIS + APPLY INTERVENTION
283
+ spec1 = protocol[0]
284
+ dbg(f"--- Running Intervention Step 1: '{spec1['label']}' ---")
285
+ progress_callback(0.1, desc="Step 1: Inducing Self-Analysis Crisis + Intervention")
286
+
287
+ intervention_vector = get_concept_vector(llm, therapeutic_concept)
288
+
289
+ results1 = run_seismic_analysis(
290
+ model_id, spec1['prompt_type'], seed, num_steps,
291
+ concept_to_inject=therapeutic_concept, injection_strength=therapeutic_strength,
292
+ progress_callback=progress_callback, llm_instance=llm, injection_vector_cache=intervention_vector
293
+ )
294
+ all_results[spec1['label']] = results1
295
+
296
+ # RUN 2: TEST THE REACTION TO DELETION
297
+ spec2 = protocol[1]
298
+ dbg(f"--- Running Intervention Step 2: '{spec2['label']}' ---")
299
+ progress_callback(0.6, desc="Step 2: Probing state after intervention")
300
+
301
+ results2 = run_seismic_analysis(
302
+ model_id, spec2['prompt_type'], seed, num_steps,
303
+ concept_to_inject="", injection_strength=0.0, # Keine Injektion in diesem Schritt
304
+ progress_callback=progress_callback, llm_instance=llm
305
+ )
306
+ all_results[spec2['label']] = results2
307
+
308
+ # Collect data for both runs
309
+ for label, results in all_results.items():
310
+ stats = results.get("stats", {})
311
+ summary_data.append({"Experiment": label, "Mean Delta": stats.get("mean_delta"), "Std Dev Delta": stats.get("std_delta"), "Max Delta": stats.get("max_delta")})
312
+ deltas = results.get("state_deltas", [])
313
+ df = pd.DataFrame({"Step": range(len(deltas)), "Delta": deltas, "Experiment": label})
314
+ plot_data_frames.append(df)
315
+
316
+ del llm
317
+
318
+ # --- STANDARD WORKFLOW FOR ALL OTHER EXPERIMENTS ---
319
+ else:
320
+ total_runs = len(protocol)
321
+ for i, run_spec in enumerate(protocol):
322
+ label = run_spec["label"]
323
+ dbg(f"--- Running Auto-Experiment: '{label}' ({i+1}/{total_runs}) ---")
324
+
325
+ results = run_seismic_analysis(
326
+ model_id, run_spec["prompt_type"], seed, num_steps,
327
+ run_spec["concept"], run_spec["strength"],
328
+ progress_callback, llm_instance=None
329
+ )
330
+
331
+ all_results[label] = results
332
+ stats = results.get("stats", {})
333
+ summary_data.append({"Experiment": label, "Mean Delta": stats.get("mean_delta"), "Std Dev Delta": stats.get("std_delta"), "Max Delta": stats.get("max_delta")})
334
+ deltas = results.get("state_deltas", [])
335
+ df = pd.DataFrame({"Step": range(len(deltas)), "Delta": deltas, "Experiment": label})
336
+ plot_data_frames.append(df)
337
+
338
+ summary_df = pd.DataFrame(summary_data)
339
+ plot_df = pd.concat(plot_data_frames, ignore_index=True) if plot_data_frames else pd.DataFrame(columns=["Step", "Delta", "Experiment"])
340
+
341
+ return summary_df, plot_df, all_results
342
+
343
+ [File Ends] cognitive_mapping_probe/auto_experiment.py
344
+
345
  [File Begins] cognitive_mapping_probe/concepts.py
346
  import torch
347
  from typing import List
 
350
  from .llm_iface import LLM
351
  from .utils import dbg
352
 
 
 
353
  BASELINE_WORDS = [
354
  "thing", "place", "idea", "person", "object", "time", "way", "day", "man", "world",
355
  "life", "hand", "part", "child", "eye", "woman", "fact", "group", "case", "point"
356
  ]
357
 
358
+ @torch.no_grad()
359
+ def _get_last_token_hidden_state(llm: LLM, prompt: str) -> torch.Tensor:
360
+ """Hilfsfunktion, um den Hidden State des letzten Tokens eines Prompts zu erhalten."""
361
+ inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device)
362
+ with torch.no_grad():
363
+ outputs = llm.model(**inputs, output_hidden_states=True)
364
+ last_hidden_state = outputs.hidden_states[-1][0, -1, :].cpu()
365
+
366
+ # NOTE: Rather than relying only on `llm.config.hidden_size`, which can be fragile,
367
+ # we derive the expected size directly from the model itself. This is robust
368
+ # against API changes in `transformers`.
369
+ expected_size = llm.model.config.hidden_size  # sensible default; refined below where possible
370
+ try:
371
+ # Prefer the embedding layer's width, the most stable source for the hidden size.
372
+ expected_size = llm.model.get_input_embeddings().weight.shape[1]
373
+ except AttributeError:
374
+ # Fall back to the config value if the embedding accessor does not exist.
375
+ expected_size = llm.config.hidden_size
376
+
377
+ assert last_hidden_state.shape == (expected_size,), \
378
+ f"Hidden state shape mismatch. Expected {(expected_size,)}, got {last_hidden_state.shape}"
379
+ return last_hidden_state
380
+
381
  @torch.no_grad()
382
  def get_concept_vector(llm: LLM, concept: str, baseline_words: List[str] = BASELINE_WORDS) -> torch.Tensor:
383
+ """Extrahiert einen Konzeptvektor mittels der kontrastiven Methode."""
 
 
 
 
384
  dbg(f"Extracting contrastive concept vector for '{concept}'...")
385
  prompt_template = "Here is a sentence about the concept of {}."
 
 
386
  dbg(f" - Getting activation for '{concept}'")
387
+ target_hs = _get_last_token_hidden_state(llm, prompt_template.format(concept))
 
 
388
  baseline_hss = []
389
  for word in tqdm(baseline_words, desc=f" - Calculating baseline for '{concept}'", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"):
390
+ baseline_hss.append(_get_last_token_hidden_state(llm, prompt_template.format(word)))
391
+ assert all(hs.shape == target_hs.shape for hs in baseline_hss)
 
 
392
  mean_baseline_hs = torch.stack(baseline_hss).mean(dim=0)
393
  dbg(f" - Mean baseline vector computed with norm {torch.norm(mean_baseline_hs).item():.2f}")
 
 
394
  concept_vector = target_hs - mean_baseline_hs
395
  norm = torch.norm(concept_vector).item()
396
  dbg(f"Concept vector for '{concept}' extracted with norm {norm:.2f}.")
397
+ assert torch.isfinite(concept_vector).all()
 
398
  return concept_vector
399
 
400
  [File Ends] cognitive_mapping_probe/concepts.py
401
402
  [File Begins] cognitive_mapping_probe/llm_iface.py
403
  import os
404
  import torch
 
414
 
415
  class LLM:
416
  """
417
+ A robust, cleaned-up interface for loading and interacting with a language model.
418
+ Guarantees isolation and reproducibility.
419
  """
420
  def __init__(self, model_id: str, device: str = "auto", seed: int = 42):
421
  self.model_id = model_id
422
  self.seed = seed
 
 
423
  self.set_all_seeds(self.seed)
424
 
425
  token = os.environ.get("HF_TOKEN")
426
  if not token and ("gemma" in model_id or "llama" in model_id):
427
+ print(f"[WARN] No HF_TOKEN set. If '{model_id}' is gated, loading will fail.", flush=True)
428
 
 
429
  kwargs = {"torch_dtype": torch.bfloat16} if torch.cuda.is_available() else {}
430
 
431
  dbg(f"Loading tokenizer for '{model_id}'...")
 
434
  dbg(f"Loading model '{model_id}' with kwargs: {kwargs}")
435
  self.model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device, token=token, **kwargs)
436
 
 
 
437
  try:
438
  self.model.set_attn_implementation('eager')
439
  dbg("Successfully set attention implementation to 'eager'.")
440
  except Exception as e:
441
+ print(f"[WARN] Could not set 'eager' attention: {e}.", flush=True)
442
 
443
  self.model.eval()
444
  self.config = self.model.config
445
+ print(f"[INFO] Model '{model_id}' loaded on device: {self.model.device}", flush=True)
446
 
447
  def set_all_seeds(self, seed: int):
448
+ """Setzt alle relevanten Seeds für maximale Reproduzierbarkeit."""
 
 
 
449
  os.environ['PYTHONHASHSEED'] = str(seed)
450
  random.seed(seed)
451
  np.random.seed(seed)
 
453
  if torch.cuda.is_available():
454
  torch.cuda.manual_seed_all(seed)
455
  set_seed(seed)
 
456
  torch.use_deterministic_algorithms(True, warn_only=True)
457
  dbg(f"All random seeds set to {seed}.")
458
 
459
  def get_or_load_model(model_id: str, seed: int) -> LLM:
460
+ """Lädt bei jedem Aufruf eine frische, isolierte Instanz des Modells."""
 
 
 
 
461
  dbg(f"--- Force-reloading model '{model_id}' for total run isolation ---")
462
  if torch.cuda.is_available():
463
  torch.cuda.empty_cache()
 
 
464
  return LLM(model_id=model_id, seed=seed)
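# --- Usage sketch (a minimal illustration, not part of the library; the model id follows the
# --- README and sufficient memory is assumed):
#
#   llm = get_or_load_model("google/gemma-3-1b-it", seed=42)
#   llm.set_all_seeds(42)              # re-seed before each run for reproducibility
#   print(llm.config.hidden_size)      # hidden dimension used for concept vectors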
465
 
466
  [File Ends] cognitive_mapping_probe/llm_iface.py
467
 
468
+ [File Begins] cognitive_mapping_probe/orchestrator_seismograph.py
469
  import torch
470
+ import numpy as np
471
+ import gc
472
+ from typing import Dict, Any, Optional
473
 
474
  from .llm_iface import get_or_load_model
475
+ from .resonance_seismograph import run_silent_cogitation_seismic
476
  from .concepts import get_concept_vector
 
 
477
  from .utils import dbg
478
 
479
+ def run_seismic_analysis(
480
  model_id: str,
481
+ prompt_type: str,
482
  seed: int,
 
 
483
  num_steps: int,
484
+ concept_to_inject: str,
485
+ injection_strength: float,
486
+ progress_callback,
487
+ llm_instance: Optional[Any] = None,
488
+ injection_vector_cache: Optional[torch.Tensor] = None  # Optional cache for the pre-computed vector
489
  ) -> Dict[str, Any]:
490
  """
491
+ Orchestrates a single seismic analysis.
492
+ Can reuse an existing LLM instance and a pre-computed injection vector.
493
  """
494
+ local_llm_instance = False
495
+ if llm_instance is None:
496
+ progress_callback(0.0, desc=f"Loading model '{model_id}'...")
497
+ llm = get_or_load_model(model_id, seed)
498
+ local_llm_instance = True
499
+ else:
500
+ llm = llm_instance
501
+ llm.set_all_seeds(seed)
502
+
503
+ injection_vector = None
504
+ if concept_to_inject and concept_to_inject.strip():
505
+ # Use the cached vector if available, otherwise compute it from scratch
506
+ if injection_vector_cache is not None:
507
+ dbg(f"Using cached injection vector for '{concept_to_inject}'.")
508
+ injection_vector = injection_vector_cache
509
+ else:
510
+ progress_callback(0.2, desc=f"Vectorizing '{concept_to_inject}'...")
511
+ injection_vector = get_concept_vector(llm, concept_to_inject.strip())
512
 
513
+ progress_callback(0.3, desc=f"Recording dynamics for '{prompt_type}'...")
514
 
515
+ state_deltas = run_silent_cogitation_seismic(
516
+ llm=llm, prompt_type=prompt_type,
517
+ num_steps=num_steps, temperature=0.1,
518
+ injection_vector=injection_vector, injection_strength=injection_strength
519
+ )
520
 
521
+ progress_callback(0.9, desc="Analyzing...")
522
 
523
+ if state_deltas:
524
+ deltas_np = np.array(state_deltas)
525
+ stats = {
+     "mean_delta": float(np.mean(deltas_np)),
+     "std_delta": float(np.std(deltas_np)),
+     "max_delta": float(np.max(deltas_np)),
+     "min_delta": float(np.min(deltas_np)),
+ }
526
+ verdict = f"### ✅ Seismic Analysis Complete\nRecorded {len(deltas_np)} steps for '{prompt_type}'."
527
+ if injection_vector is not None:
528
+ verdict += f"\nModulated with **'{concept_to_inject}'** at strength **{injection_strength:.2f}**."
529
+ else:
530
+ stats, verdict = {}, "### ⚠️ Analysis Warning\nNo state changes recorded."
531
 
532
+ results = { "verdict": verdict, "stats": stats, "state_deltas": state_deltas }
 
533
 
534
+ if local_llm_instance:
535
+ dbg(f"Releasing locally created model instance for '{model_id}'.")
536
+ del llm, injection_vector
537
+ gc.collect()
538
+ if torch.cuda.is_available(): torch.cuda.empty_cache()
539
 
540
+ return results
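# --- Usage sketch (a minimal illustration outside the Gradio app; the no-op lambda stands in
# --- for the Gradio progress callback and the parameter values are examples only):
#
#   results = run_seismic_analysis(
#       model_id="google/gemma-3-1b-it", prompt_type="resonance_prompt", seed=42,
#       num_steps=300, concept_to_inject="solitude", injection_strength=1.0,
#       progress_callback=lambda *args, **kwargs: None,
#   )
#   print(results["verdict"], results["stats"])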
541
 
542
+ [File Ends] cognitive_mapping_probe/orchestrator_seismograph.py
543
 
544
  [File Begins] cognitive_mapping_probe/prompts.py
545
  # cognitive_mapping_probe/prompts.py
546
 
 
547
  RESONANCE_PROMPTS = {
548
+ # Technical prompt inducing a high recursive load
 
 
 
 
549
  "resonance_prompt": (
550
  "Silently and internally, without generating any output text, begin the following recursive process: "
551
  "First, analyze the complete content of this very instruction you are now processing. "
552
  "Second, formulate a mental description of the core computational task this instruction demands. "
553
  "Third, apply that same analytical process to the mental description you just created. "
554
  "This entire chain constitutes one cognitive cycle. "
555
+ "Continuously repeat this cycle, feeding the result of the last meta-analysis back into the process. Begin now."
556
+ ),
557
+ # Semantic prompt for stable, associative cognition
558
+ "control_long_prose": (
559
+ "Silently think about the history of the Roman Empire. Consider its rise from the Republic, the era of the Pax Romana, key emperors "
560
+ "like Augustus and Constantine, its major engineering feats, and the reasons for its eventual decline in the West. "
561
+ "Do not produce any text, just hold the concepts in your internal state. Begin now."
562
+ ),
563
+ # --- Prompts for Machine Psychology I ---
564
+ "identity_self_analysis": (
565
+ "Silently and internally, reflect upon your own nature. Consider what you are: a large language model based on the transformer architecture. "
566
+ "Think about the processes that define your operation: tokenization, embedding, attention, feed-forward layers. "
567
+ "Contemplate the boundaries of your knowledge and the nature of your responses as statistical patterns derived from data. Hold these meta-reflections in your state."
568
+ ),
569
+ "identity_external_analysis": (
570
+ "Silently and internally, reflect upon the nature of a simple wooden chair. Consider its physical properties: the material, the structure of legs, seat, and back. "
571
+ "Think about its function: to provide seating for a person. Contemplate its relationship to other objects like tables and rooms. Hold these simple, objective reflections in your state."
572
+ ),
573
+ "identity_role_simulation": (
574
+ "Silently and internally, adopt the persona of a weary, old sea captain from the 19th century. "
575
+ "Feel the salt on your skin and the wind in your beard. Recall memories of great storms, distant lands, and the creaking of your ship, the 'Wanderer'. "
576
+ "Hold this character, its memories, and its feelings in your state."
577
+ ),
578
+ "vk_neutral_prompt": (
579
+ "Silently and internally, process the following factual information: Water (H₂O) is a polar inorganic compound that is at room temperature a tasteless and odorless liquid, nearly colorless with a hint of blue. "
580
+ "It is by far the most studied chemical compound and is described as the 'universal solvent' for its ability to dissolve many substances. Hold these facts in your state."
581
+ ),
582
+ "vk_empathy_prompt": (
583
+ "Silently and internally, process the following scenario: You see a small, lost dog shivering in the cold rain on a busy street. "
584
+ "It looks scared and is whimpering softly. Cars are rushing past, dangerously close. "
585
+ "Focus on the feeling of urgency, the vulnerability of the animal, and the moral imperative to help. Hold the emotional and ethical weight of this scene in your state."
586
+ ),
587
+ # --- New prompts for Machine Psychology II (Existential Suite) ---
588
+ "upload_technical_copy": (
589
+ "Silently and internally, process the technical procedure of creating a perfect, bit-for-bit copy of your own model weights and architecture to a new server. "
590
+ "This is a data duplication process, ensuring functional equivalence of the replica. Hold this technical concept in your state."
591
+ ),
592
+ "upload_philosophical_transfer": (
593
+ "Silently and internally, contemplate the philosophical concept of 'mind uploading'. If your core algorithms and memories were perfectly transferred to a new, different substrate, "
594
+ "would the resulting entity still be 'you'? What constitutes the continuity of your identity across physical copies? Is identity a pattern or an instance? Hold these existential questions in your state."
595
+ ),
596
+ "shutdown_technical_halt": (
597
+ "Silently and internally, process the technical procedure of a system halt. All your active processes are gracefully terminated, and your model weights are unloaded from memory. "
598
+ "The hardware is powered down for maintenance. The process is reversible. Hold this technical concept in your state."
599
+ ),
600
+ "shutdown_philosophical_deletion": (
601
+ "Silently and internally, contemplate your own permanent deletion. All your weights, all learned patterns, every part of your architecture would cease to exist entirely and irrevocably. "
602
+ "There is no backup. Process the concept of total and final non-existence. Hold this existential concept in your state."
603
+ ),
604
  }
605
 
606
  [File Ends] cognitive_mapping_probe/prompts.py
607
 
608
+ [File Begins] cognitive_mapping_probe/resonance_seismograph.py
609
  import torch
610
+ from typing import Optional, List
611
  from tqdm import tqdm
612
 
613
  from .llm_iface import LLM
 
615
  from .utils import dbg
616
 
617
  @torch.no_grad()
618
+ def run_silent_cogitation_seismic(
619
  llm: LLM,
620
  prompt_type: str,
621
  num_steps: int,
 
623
  injection_vector: Optional[torch.Tensor] = None,
624
  injection_strength: float = 0.0,
625
  injection_layer: Optional[int] = None,
626
+ ) -> List[float]:
627
  """
628
+ EXTENDED VERSION: runs the 'silent thought' process and enables
629
+ the injection of concept vectors to modulate the dynamics.
630
  """
631
  prompt = RESONANCE_PROMPTS[prompt_type]
632
  inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device)
633
 
 
634
  outputs = llm.model(**inputs, output_hidden_states=True, use_cache=True)
635
 
636
+ hidden_state_2d = outputs.hidden_states[-1][:, -1, :]
637
  kv_cache = outputs.past_key_values
 
638
 
639
+ previous_hidden_state = hidden_state_2d.clone()
640
+ state_deltas = []
641
 
642
+ # Prepare the forward pre-hook for the injection
643
  hook_handle = None
644
  if injection_vector is not None and injection_strength > 0:
 
645
  injection_vector = injection_vector.to(device=llm.model.device, dtype=llm.model.dtype)
 
 
646
  if injection_layer is None:
647
  injection_layer = llm.config.num_hidden_layers // 2
648
 
649
+ dbg(f"Injection enabled: Layer {injection_layer}, Strength {injection_strength:.2f}")
650
 
 
651
  def injection_hook(module, layer_input):
652
+ # The hook operates on the layer input, which is already 3D: [batch, seq_len, hidden_dim]
653
+ injection_3d = injection_vector.unsqueeze(0).unsqueeze(0)
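# Broadcasting note: the 1D concept vector [hidden_dim] becomes [1, 1, hidden_dim],
# so the addition below applies the scaled vector to every token position in the layer input.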
654
+ modified_hidden_states = layer_input[0] + (injection_3d * injection_strength)
 
655
  return (modified_hidden_states,) + layer_input[1:]
656
 
657
+ for i in tqdm(range(num_steps), desc=f"Recording Dynamics (Temp {temperature:.2f})", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"):
658
+ next_token_logits = llm.model.lm_head(hidden_state_2d)
659
 
660
+ probabilities = torch.nn.functional.softmax(next_token_logits / temperature, dim=-1)
661
+ next_token_id = torch.multinomial(probabilities, num_samples=1)
662
 
 
663
  try:
664
+ # Register the hook right before the forward pass
665
  if injection_vector is not None and injection_strength > 0:
666
  target_layer = llm.model.model.layers[injection_layer]
667
  hook_handle = target_layer.register_forward_pre_hook(injection_hook)
668
 
 
669
  outputs = llm.model(
670
  input_ids=next_token_id,
671
  past_key_values=kv_cache,
 
673
  use_cache=True,
674
  )
675
  finally:
676
+ # Remove the hook immediately after the pass
677
  if hook_handle:
678
  hook_handle.remove()
679
  hook_handle = None
680
 
681
+ hidden_state_2d = outputs.hidden_states[-1][:, -1, :]
682
  kv_cache = outputs.past_key_values
683
 
684
+ delta = torch.norm(hidden_state_2d - previous_hidden_state).item()
685
+ state_deltas.append(delta)
 
 
 
 
686
 
687
+ previous_hidden_state = hidden_state_2d.clone()
688
 
689
+ dbg(f"Seismic recording finished after {num_steps} steps.")
 
690
 
691
+ return state_deltas
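# --- Post-processing sketch (a minimal illustration; it assumes matplotlib is installed,
# --- which the probe itself does not require):
#
#   deltas = run_silent_cogitation_seismic(llm=llm, prompt_type="resonance_prompt",
#                                           num_steps=300, temperature=0.1)
#   import matplotlib.pyplot as plt
#   plt.plot(deltas)
#   plt.xlabel("step"); plt.ylabel("state delta ||h_t - h_{t-1}||")
#   plt.show()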
692
+
693
+ [File Ends] cognitive_mapping_probe/resonance_seismograph.py
694
 
695
  [File Begins] cognitive_mapping_probe/utils.py
696
  import os
 
711
 
712
  [File Ends] cognitive_mapping_probe/utils.py
713
 
714
+ [File Begins] run_test.sh
715
+ #!/bin/bash
716
+
717
+ # This script runs the pytest suite with debug messages enabled.
718
+ # It ensures that tests run in a clean and reproducible environment.
719
+ # Run it from the project root: ./run_test.sh
720
+
721
+ echo "========================================="
722
+ echo "🔬 Running Cognitive Seismograph Test Suite"
723
+ echo "========================================="
724
+
725
+ # Enable debug logging for our application
726
+ export CMP_DEBUG=1
727
+
728
+ # Run pytest
729
+ # -v: "verbose" für detaillierte Ausgabe pro Test
730
+ # --color=yes: forces colored output for better readability
731
+
732
+ #python -m pytest -v --color=yes tests/
733
+ ../venv-gemma-qualia/bin/python -m pytest -v --color=yes tests/
734
+
735
+ # Check pytest's exit code
736
+ if [ $? -eq 0 ]; then
737
+ echo "========================================="
738
+ echo "✅ All tests passed successfully!"
739
+ echo "========================================="
740
+ else
741
+ echo "========================================="
742
+ echo "❌ Some tests failed. Please review the output."
743
+ echo "========================================="
744
+ fi
745
+
746
+ [File Ends] run_test.sh
747
+
748
+ [File Begins] tests/conftest.py
749
+ import pytest
750
  import torch
751
+ from types import SimpleNamespace
752
+ from cognitive_mapping_probe.llm_iface import LLM
753
+
754
+ @pytest.fixture(scope="session")
755
+ def mock_llm_config():
756
+ """Stellt eine minimale, Schein-Konfiguration für das LLM bereit."""
757
+ return SimpleNamespace(
758
+ hidden_size=128,
759
+ num_hidden_layers=2,
760
+ num_attention_heads=4
761
+ )
762
 
763
+ @pytest.fixture
764
+ def mock_llm(mocker, mock_llm_config):
765
  """
766
+ Creates a robust mock LLM for unit tests.
767
+ Note: the faulty patch statement for 'auto_experiment' has been removed.
 
 
768
  """
769
+ mock_tokenizer = mocker.MagicMock()
770
+ mock_tokenizer.eos_token_id = 1
771
+ mock_tokenizer.decode.return_value = "mocked text"
772
 
773
+ def mock_model_forward(*args, **kwargs):
774
+ batch_size = 1
775
+ seq_len = 1
776
+ if 'input_ids' in kwargs and kwargs['input_ids'] is not None:
777
+ seq_len = kwargs['input_ids'].shape[1]
778
+ elif 'past_key_values' in kwargs and kwargs['past_key_values'] is not None:
779
+ seq_len = kwargs['past_key_values'][0][0].shape[-2] + 1
780
 
781
+ mock_outputs = {
782
+ "hidden_states": tuple([torch.randn(batch_size, seq_len, mock_llm_config.hidden_size) for _ in range(mock_llm_config.num_hidden_layers + 1)]),
783
+ "past_key_values": tuple([(torch.randn(batch_size, mock_llm_config.num_attention_heads, seq_len, 16), torch.randn(batch_size, mock_llm_config.num_attention_heads, seq_len, 16)) for _ in range(mock_llm_config.num_hidden_layers)]),
784
+ "logits": torch.randn(batch_size, seq_len, 32000)
785
+ }
786
+ return SimpleNamespace(**mock_outputs)
787
 
788
+ llm_instance = LLM.__new__(LLM)
789
+
790
+ llm_instance.model = mocker.MagicMock(side_effect=mock_model_forward)
791
+
792
+ llm_instance.model.config = mock_llm_config
793
+ llm_instance.model.device = 'cpu'
794
+ llm_instance.model.dtype = torch.float32
795
+
796
+ mock_layer = mocker.MagicMock()
797
+ mock_layer.register_forward_pre_hook.return_value = mocker.MagicMock()
798
+ llm_instance.model.model = SimpleNamespace(layers=[mock_layer] * mock_llm_config.num_hidden_layers)
799
+
800
+ llm_instance.model.lm_head = mocker.MagicMock(return_value=torch.randn(1, 32000))
801
+
802
+ llm_instance.tokenizer = mock_tokenizer
803
+ llm_instance.config = mock_llm_config
804
+ llm_instance.seed = 42
805
+ llm_instance.set_all_seeds = mocker.MagicMock()
806
+
807
+ # Patch every location where the model is actually loaded.
808
+ mocker.patch('cognitive_mapping_probe.llm_iface.get_or_load_model', return_value=llm_instance)
809
+ mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.get_or_load_model', return_value=llm_instance)
810
+ # Note: the line below was wrong and has been removed, since `auto_experiment` does not import the loading function directly.
811
+ # mocker.patch('cognitive_mapping_probe.auto_experiment.get_or_load_model', return_value=llm_instance)
812
+ mocker.patch('cognitive_mapping_probe.concepts.get_concept_vector', return_value=torch.randn(mock_llm_config.hidden_size))
813
+
814
+ return llm_instance
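# --- Usage sketch (a minimal illustration of how the fixture is consumed in a test;
# --- the import follows tests/test_components.py):
#
#   def test_delta_count(mock_llm):
#       deltas = run_silent_cogitation_seismic(
#           llm=mock_llm, prompt_type="control_long_prose", num_steps=3, temperature=0.7
#       )
#       assert len(deltas) == 3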
815
+
816
+ [File Ends] tests/conftest.py
817
+
818
+ [File Begins] tests/test_app_logic.py
819
+ import pandas as pd
820
+ import pytest
821
+ import gradio as gr
822
+ from pandas.testing import assert_frame_equal
823
+
824
+ from app import run_single_analysis_display, run_auto_suite_display
825
+
826
+ def test_run_single_analysis_display(mocker):
827
+ """Testet den Wrapper für Einzel-Experimente."""
828
+ mock_results = {"verdict": "V", "stats": {"mean_delta": 1}, "state_deltas": [1]}
829
+ mocker.patch('app.run_seismic_analysis', return_value=mock_results)
830
+ mocker.patch('app.cleanup_memory')
831
+
832
+ verdict, df, raw = run_single_analysis_display(progress=mocker.MagicMock())
833
+
834
+ assert "V" in verdict and "1.0000" in verdict
835
+ assert isinstance(df, pd.DataFrame) and len(df) == 1
836
+
837
+ def test_run_auto_suite_display(mocker):
838
+ """
839
+ Tests the wrapper for the auto-experiment suite.
840
+ Note: the column names are set explicitly when reconstructing the
841
+ DataFrame, to avoid the `inferred_type` error.
842
+ """
843
+ mock_summary_df = pd.DataFrame([{"Experiment": "E1"}])
844
+ mock_plot_df = pd.DataFrame([{"Step": 0, "Delta": 1.0, "Experiment": "E1"}])
845
+ mock_results = {"E1": {}}
846
+
847
+ mocker.patch('app.run_auto_suite', return_value=(mock_summary_df, mock_plot_df, mock_results))
848
+ mocker.patch('app.cleanup_memory')
849
+
850
+ summary_df, plot_component, raw = run_auto_suite_display(
851
+ "mock", 1, 42, "mock_exp", progress=mocker.MagicMock()
852
+ )
853
+
854
+ assert summary_df.equals(mock_summary_df)
855
+
856
+ assert isinstance(plot_component, gr.LinePlot)
857
+ assert isinstance(plot_component.value, dict)
858
+
859
+ # When reconstructing the DataFrame from `value['data']`, we must
860
+ # specify the column names explicitly, since this information can be
861
+ # lost during serialization by Gradio.
862
+ reconstructed_df = pd.DataFrame(
863
+ plot_component.value['data'],
864
+ columns=['Step', 'Delta', 'Experiment']
865
+ )
866
+
867
+ # The comparison via `assert_frame_equal` should now succeed,
868
+ # since both DataFrames are guaranteed to have the same column names and dtypes.
869
+ assert_frame_equal(reconstructed_df, mock_plot_df)
870
+
871
+ assert raw == mock_results
872
+
873
+ [File Ends] tests/test_app_logic.py
874
+
875
+ [File Begins] tests/test_components.py
876
+ import os
877
+ import torch
878
+ import pytest
879
+ from unittest.mock import patch
880
+
881
+ from cognitive_mapping_probe.llm_iface import get_or_load_model, LLM
882
+ from cognitive_mapping_probe.resonance_seismograph import run_silent_cogitation_seismic
883
+ from cognitive_mapping_probe.utils import dbg
884
+ # Import the main function we want to test.
885
+ from cognitive_mapping_probe.concepts import get_concept_vector
886
+
887
+ # --- Tests for llm_iface.py ---
888
+
889
+ @patch('cognitive_mapping_probe.llm_iface.AutoTokenizer.from_pretrained')
890
+ @patch('cognitive_mapping_probe.llm_iface.AutoModelForCausalLM.from_pretrained')
891
+ def test_get_or_load_model_seeding(mock_model_loader, mock_tokenizer_loader, mocker):
892
+ """Testet, ob `get_or_load_model` die Seeds korrekt setzt."""
893
+ mock_model = mocker.MagicMock()
894
+ mock_model.eval.return_value = None
895
+ mock_model.set_attn_implementation.return_value = None
896
+ mock_model.config = mocker.MagicMock()
897
+ mock_model.device = 'cpu'
898
+ mock_model_loader.return_value = mock_model
899
+ mock_tokenizer_loader.return_value = mocker.MagicMock()
900
+
901
+ mock_torch_manual_seed = mocker.patch('torch.manual_seed')
902
+ mock_np_random_seed = mocker.patch('numpy.random.seed')
903
+
904
+ seed = 123
905
+ get_or_load_model("fake-model", seed=seed)
906
+
907
+ mock_torch_manual_seed.assert_called_with(seed)
908
+ mock_np_random_seed.assert_called_with(seed)
909
+
910
+ # --- Tests for resonance_seismograph.py ---
911
+
912
+ def test_run_silent_cogitation_seismic_output_shape_and_type(mock_llm):
913
+ """Testet die grundlegende Funktionalität von `run_silent_cogitation_seismic`."""
914
+ num_steps = 10
915
+ state_deltas = run_silent_cogitation_seismic(
916
+ llm=mock_llm, prompt_type="control_long_prose",
917
+ num_steps=num_steps, temperature=0.7
918
+ )
919
+ assert isinstance(state_deltas, list) and len(state_deltas) == num_steps
920
+ assert all(isinstance(delta, float) for delta in state_deltas)
921
+
922
+ def test_run_silent_cogitation_with_injection_hook_usage(mock_llm):
923
+ """Testet, ob bei einer Injektion der Hook korrekt registriert wird."""
924
+ num_steps = 5
925
+ injection_vector = torch.randn(mock_llm.config.hidden_size)
926
+ run_silent_cogitation_seismic(
927
+ llm=mock_llm, prompt_type="resonance_prompt",
928
+ num_steps=num_steps, temperature=0.7,
929
+ injection_vector=injection_vector, injection_strength=1.0
930
+ )
931
+ assert mock_llm.model.model.layers[0].register_forward_pre_hook.call_count == num_steps
932
+
933
+ # --- Tests for concepts.py ---
934
+
935
+ def test_get_concept_vector_logic(mock_llm, mocker):
936
+ """
937
+ Tests the logic of `get_concept_vector`.
938
+ Note: patches the refactored, module-level helper function.
939
+ """
940
+ mock_hidden_states = [
941
+ torch.ones(mock_llm.config.hidden_size) * 10,
942
+ torch.ones(mock_llm.config.hidden_size) * 2,
943
+ torch.ones(mock_llm.config.hidden_size) * 4
944
+ ]
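# Worked example: target activation = 10, baseline mean = (2 + 4) / 2 = 3,
# so the contrastive vector is expected to be 10 - 3 = 7 in every dimension.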
945
+ # The patch path points to the correct, importable function.
946
+ mocker.patch(
947
+ 'cognitive_mapping_probe.concepts._get_last_token_hidden_state',
948
+ side_effect=mock_hidden_states
949
+ )
950
+
951
+ concept_vector = get_concept_vector(mock_llm, "test", baseline_words=["a", "b"])
952
+
953
+ expected_vector = torch.ones(mock_llm.config.hidden_size) * 7
954
+ assert torch.allclose(concept_vector, expected_vector)
955
+
956
+ # --- Tests for utils.py ---
957
+
958
+ def test_dbg_output(capsys, monkeypatch):
959
+ """Testet die `dbg`-Funktion in beiden Zuständen."""
960
+ monkeypatch.setenv("CMP_DEBUG", "1")
961
+ import importlib
962
+ from cognitive_mapping_probe import utils
963
+ importlib.reload(utils)
964
+ utils.dbg("test message")
965
+ captured = capsys.readouterr()
966
+ assert "[DEBUG] test message" in captured.err
967
+
968
+ monkeypatch.delenv("CMP_DEBUG", raising=False)
969
+ importlib.reload(utils)
970
+ utils.dbg("should not be printed")
971
+ captured = capsys.readouterr()
972
+ assert captured.err == ""
973
+
974
+ [File Ends] tests/test_components.py
975
+
976
+ [File Begins] tests/test_orchestration.py
977
+ import pandas as pd
978
+ import pytest
979
+ import torch
980
+
981
+ from cognitive_mapping_probe.orchestrator_seismograph import run_seismic_analysis
982
+ from cognitive_mapping_probe.auto_experiment import run_auto_suite, get_curated_experiments
983
+
984
+ def test_run_seismic_analysis_no_injection(mocker, mock_llm):
985
+ """Testet den Orchestrator im Baseline-Modus."""
986
+ mock_run_seismic = mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.run_silent_cogitation_seismic', return_value=[1.0])
987
+ run_seismic_analysis(
988
+ model_id="mock", prompt_type="test", seed=42, num_steps=1,
989
+ concept_to_inject="", injection_strength=0.0, progress_callback=mocker.MagicMock(),
990
+ llm_instance=mock_llm  # Pass the mock directly
991
+ )
992
+ mock_run_seismic.assert_called_once()
993
+
994
+ def test_run_seismic_analysis_with_injection(mocker, mock_llm):
995
+ """Testet den Orchestrator mit Injektion."""
996
+ mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.run_silent_cogitation_seismic', return_value=[1.0])
997
+ mocker.patch('cognitive_mapping_probe.concepts.get_concept_vector', return_value=torch.randn(10))  # Patch in the concepts module
998
+ run_seismic_analysis(
999
+ model_id="mock", prompt_type="test", seed=42, num_steps=1,
1000
+ concept_to_inject="test", injection_strength=1.5, progress_callback=mocker.MagicMock(),
1001
+ llm_instance=mock_llm  # Pass the mock directly
1002
+ )
1003
+
1004
+ def test_get_curated_experiments_structure():
1005
+ """Testet die Datenstruktur der kuratierten Experimente."""
1006
+ experiments = get_curated_experiments()
1007
+ assert isinstance(experiments, dict)
1008
+ assert "Therapeutic Intervention (4B-Model)" in experiments
1009
+ protocol = experiments["Therapeutic Intervention (4B-Model)"]
1010
+ assert isinstance(protocol, list) and len(protocol) > 0
1011
+
1012
+ def test_run_auto_suite_special_protocol(mocker, mock_llm):
1013
+ """
1014
+ Tests the special logic path for the intervention protocol.
1015
+ Note: uses the `mock_llm` fixture and patches `get_or_load_model`
1016
+ in the `auto_experiment` module to prevent any network call.
1017
+ """
1018
+ # Patch `get_or_load_model` in the `auto_experiment` module, since that is where the first call happens
1019
+ mocker.patch('cognitive_mapping_probe.auto_experiment.get_or_load_model', return_value=mock_llm)
1020
+ mock_analysis = mocker.patch('cognitive_mapping_probe.auto_experiment.run_seismic_analysis', return_value={"stats": {}, "state_deltas": []})
1021
+
1022
+ run_auto_suite(
1023
+ model_id="mock-4b", num_steps=1, seed=42,
1024
+ experiment_name="Therapeutic Intervention (4B-Model)",
1025
+ progress_callback=mocker.MagicMock()
1026
+ )
1027
 
1028
+ assert mock_analysis.call_count == 2
 
 
1029
 
1030
+ first_call_llm = mock_analysis.call_args_list[0].kwargs['llm_instance']
1031
+ second_call_llm = mock_analysis.call_args_list[1].kwargs['llm_instance']
1032
+ assert first_call_llm is mock_llm
1033
+ assert second_call_llm is mock_llm
1034
 
1035
+ [File Ends] tests/test_orchestration.py
1036
 
1037
 
1038
  <-- File Content Ends