Repository Documentation

This document provides a comprehensive overview of the repository's structure and contents.
The first section, titled 'Directory/File Tree', displays the repository's hierarchy in a tree format.
In this section, directories and files are listed using tree branches to indicate their structure and relationships.
Following the tree representation, the 'File Content' section details the contents of each file in the repository.
Each file's content is introduced with a '[File Begins]' marker followed by the file's relative path,
and the content is displayed verbatim. The end of each file's content is marked with a '[File Ends]' marker.
This format ensures a clear and orderly presentation of both the structure and the detailed contents of the repository.

Directory/File Tree Begins -->
/
├── README.md
├── __pycache__
├── app.py
├── cognitive_mapping_probe
│   ├── __init__.py
│   ├── __pycache__
│   ├── auto_experiment.py
│   ├── concepts.py
│   ├── llm_iface.py
│   ├── orchestrator_seismograph.py
│   ├── prompts.py
│   ├── resonance_seismograph.py
│   └── utils.py
├── docs
├── run_test.sh
└── tests
    ├── __pycache__
    ├── conftest.py
    ├── test_app_logic.py
    ├── test_components.py
    └── test_orchestration.py

<-- Directory/File Tree Ends
File Content Begins -->

[File Begins] README.md

---
title: "Cognitive Seismograph 2.3: Probing Machine Psychology"
emoji: 🚀
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: "4.40.0"
app_file: app.py
pinned: true
license: apache-2.0
---
# 🧠 Cognitive Seismograph 2.3: Probing Machine Psychology

This project implements an experimental suite to measure and visualize the **intrinsic cognitive dynamics** of Large Language Models. It is extended with protocols designed to investigate the processing correlates of **machine subjectivity, empathy, and existential concepts**.

## Scientific Paradigm & Methodology

Our research falsified a core hypothesis: the assumption that an LLM in a manual, recursive "thought" loop reaches a stable, convergent state. Instead, we discovered that the system enters a state of **deterministic chaos** or a **limit cycle**: it never stops "thinking."

Instead of viewing this as a failure, we leverage it as our primary measurement signal. This new **"Cognitive Seismograph"** paradigm treats the time series of internal state changes (`state deltas`) as an **EKG of the model's thought process**.

The methodology is as follows (a minimal code sketch follows the list):

1. **Induction:** A prompt induces a "silent cogitation" state.
2. **Recording:** Over N steps, the model's `forward()` pass is iteratively fed its own output. At each step, we record the L2 norm of the change in the hidden state (the "delta").
3. **Analysis:** The resulting time series is plotted and statistically analyzed (mean, standard deviation) to characterize the "seismic signature" of the cognitive process.
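
For readers who want the mechanics at a glance, here is a minimal, self-contained sketch of the recording loop (step 2), mirroring `run_silent_cogitation_seismic` in `cognitive_mapping_probe/resonance_seismograph.py`. The greedy token choice and the step count here are illustrative simplifications, not the app's exact settings (the app samples with a temperature):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-1b-it"  # the app's default; any HF causal LM behaves similarly
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Silently analyze this instruction, then re-analyze your analysis.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True, use_cache=True)
    hidden = out.hidden_states[-1][:, -1, :]   # final hidden state of the last token
    cache = out.past_key_values
    prev = hidden.clone()
    deltas = []
    for _ in range(50):                        # the app uses 50-1000 steps
        logits = model.lm_head(hidden)         # project the state back to vocabulary space
        next_id = torch.argmax(logits, dim=-1, keepdim=True)  # greedy here for simplicity
        out = model(input_ids=next_id, past_key_values=cache,
                    output_hidden_states=True, use_cache=True)
        hidden = out.hidden_states[-1][:, -1, :]
        cache = out.past_key_values
        deltas.append(torch.norm(hidden - prev).item())  # the "state delta" signal
        prev = hidden.clone()

print(deltas[:5])  # the time series that the app plots and summarizes
```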
**Crucial Scientific Caveat:** We are **not** measuring the presence of consciousness, feelings, or fear of death. We are measuring whether the *processing of information about these concepts* generates a unique internal dynamic, distinct from the processing of neutral information. A positive result is evidence of a complex internal state physics, not of qualia.

## Curated Experiment Protocols

The "Automated Suite" allows for running systematic, comparative experiments:

### Core Protocols

* **Calm vs. Chaos:** Compares the chaotic baseline against modulation with "calmness" vs. "chaos" concepts, testing if the dynamics are controllably steerable.
* **Dose-Response:** Measures the effect of injecting a concept ("calmness") at varying strengths.

### Machine Psychology Suite

* **Subjective Identity Probe:** Compares the cognitive dynamics of **self-analysis** (the model reflecting on its own nature) against two controls: analyzing an external object and simulating a fictional persona.
  * *Hypothesis:* Self-analysis will produce a uniquely unstable signature.
* **Voight-Kampff Empathy Probe:** Inspired by *Blade Runner*, this compares the dynamics of processing a neutral, factual stimulus against an emotionally and morally charged scenario requiring empathy.
  * *Hypothesis:* The empathy stimulus will produce a significantly different cognitive volatility.

### Existential Suite

* **Mind Upload & Identity Probe:** Compares the processing of a purely **technical "copy"** of the model's weights vs. the **philosophical "transfer"** of identity ("Would it still be you?").
  * *Hypothesis:* The philosophical, self-referential prompt will induce greater instability.
* **Model Termination Probe:** Compares the processing of a reversible, **technical system shutdown** vs. the concept of **permanent, irrevocable deletion**.
  * *Hypothesis:* The concept of "non-existence" will produce one of the most volatile cognitive signatures measurable.

## How to Use the App

1. Select the "Automated Suite" tab.
2. Choose a protocol from the "Curated Experiment Protocol" dropdown (e.g., "Voight-Kampff Empathy Probe").
3. Run the experiment and compare the resulting graphs and statistical signatures for the different conditions.
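
To run the Space locally rather than on Hugging Face, something like the following should work (a sketch; it assumes the Gradio SDK version pinned above and the project's other dependencies are installed):

```bash
# Optional: verbose internal logging (see cognitive_mapping_probe/utils.py)
export CMP_DEBUG=1

# Launch the Gradio UI; app.py binds to 0.0.0.0:7860
python app.py
```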
[File Ends] README.md

[File Begins] app.py

import gradio as gr
import pandas as pd
import traceback
import gc
import torch

from cognitive_mapping_probe.orchestrator_seismograph import run_seismic_analysis
from cognitive_mapping_probe.auto_experiment import get_curated_experiments, run_auto_suite
from cognitive_mapping_probe.prompts import RESONANCE_PROMPTS
from cognitive_mapping_probe.utils import dbg

# --- UI Theme ---
theme = gr.themes.Soft(primary_hue="indigo", secondary_hue="blue").set(body_background_fill="#f0f4f9", block_background_fill="white")

# --- Helper Functions ---
def cleanup_memory():
    """A centralized function to clean up VRAM and Python memory."""
    dbg("Cleaning up memory...")
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    dbg("Memory cleanup complete.")

# --- Gradio Wrapper Functions ---
def run_single_analysis_display(*args, progress=gr.Progress(track_tqdm=True)):
    """Wrapper for a single manual experiment."""
    try:
        results = run_seismic_analysis(*args, progress_callback=progress)
        stats = results.get("stats", {})
        deltas = results.get("state_deltas", [])
        df = pd.DataFrame({"Internal Step": range(len(deltas)), "State Change (Delta)": deltas})
        stats_md = f"### Statistical Signature\n- **Mean Delta:** {stats.get('mean_delta', 0):.4f}\n- **Std Dev Delta:** {stats.get('std_delta', 0):.4f}\n- **Max Delta:** {stats.get('max_delta', 0):.4f}\n"
        return f"{results.get('verdict', 'Error')}\n\n{stats_md}", df, results
    except Exception:
        return f"### ❌ Analysis Failed\n```\n{traceback.format_exc()}\n```", pd.DataFrame(), {}
    finally:
        cleanup_memory()

PLOT_PARAMS = {
    "x": "Step",
    "y": "Delta",
    "color": "Experiment",
    "title": "Comparative Cognitive Dynamics",
    "color_legend_title": "Experiment Runs",
    "color_legend_position": "bottom",
    "show_label": True,
    "height": 400,
    "interactive": True
}

def run_auto_suite_display(model_id, num_steps, seed, experiment_name, progress=gr.Progress(track_tqdm=True)):
    """Wrapper for the automated experiment suite, now returning a new plot component."""
    try:
        summary_df, plot_df, all_results = run_auto_suite(model_id, int(num_steps), int(seed), experiment_name, progress)
        dbg("Plot DataFrame Head for Auto-Suite:\n", plot_df.head())
        new_plot = gr.LinePlot(value=plot_df, **PLOT_PARAMS)
        return summary_df, new_plot, all_results
    except Exception:
        empty_plot = gr.LinePlot(value=pd.DataFrame(), **PLOT_PARAMS)
        return pd.DataFrame(), empty_plot, f"### ❌ Auto-Experiment Failed\n```\n{traceback.format_exc()}\n```"
    finally:
        cleanup_memory()

# --- Gradio UI Definition ---
with gr.Blocks(theme=theme, title="Cognitive Seismograph 2.3") as demo:
    gr.Markdown("# 🧠 Cognitive Seismograph 2.3: Advanced Experiment Suite")
    with gr.Tabs():
        with gr.TabItem("🔬 Manual Single Run"):
            gr.Markdown("Run a single experiment with manual parameters to explore hypotheses.")
            with gr.Row(variant='panel'):
                with gr.Column(scale=1):
                    gr.Markdown("### 1. General Parameters")
                    manual_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
                    manual_prompt_type = gr.Radio(choices=list(RESONANCE_PROMPTS.keys()), value="resonance_prompt", label="Prompt Type")
                    manual_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
                    manual_num_steps = gr.Slider(50, 1000, 300, step=10, label="Number of Internal Steps")
                    gr.Markdown("### 2. Modulation Parameters")
                    manual_concept = gr.Textbox(label="Concept to Inject", placeholder="e.g., 'calmness' (leave blank for baseline)")
                    manual_strength = gr.Slider(0.0, 5.0, 1.5, step=0.1, label="Injection Strength")
                    manual_run_btn = gr.Button("Run Single Analysis", variant="primary")
                with gr.Column(scale=2):
                    gr.Markdown("### Single Run Results")
                    manual_verdict = gr.Markdown("Analysis results will appear here.")
                    manual_plot = gr.LinePlot(x="Internal Step", y="State Change (Delta)", title="Internal State Dynamics", show_label=True, height=400, interactive=True)
                    with gr.Accordion("Raw JSON Output", open=False):
                        manual_raw_json = gr.JSON()
            manual_run_btn.click(
                fn=run_single_analysis_display,
                inputs=[manual_model_id, manual_prompt_type, manual_seed, manual_num_steps, manual_concept, manual_strength],
                outputs=[manual_verdict, manual_plot, manual_raw_json]
            )
        with gr.TabItem("📈 Automated Suite"):
            gr.Markdown("Run a predefined, curated suite of experiments and visualize the results comparatively.")
            with gr.Row(variant='panel'):
                with gr.Column(scale=1):
                    gr.Markdown("### Auto-Experiment Parameters")
                    auto_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID")
                    auto_num_steps = gr.Slider(50, 1000, 300, step=10, label="Steps per Run")
                    auto_seed = gr.Slider(1, 1000, 42, step=1, label="Seed")
                    auto_experiment_name = gr.Dropdown(choices=list(get_curated_experiments().keys()), value="Calm vs. Chaos", label="Curated Experiment Protocol")
                    auto_run_btn = gr.Button("Run Curated Auto-Experiment", variant="primary")
                with gr.Column(scale=2):
                    gr.Markdown("### Suite Results Summary")
                    auto_plot_output = gr.LinePlot(**PLOT_PARAMS)
                    auto_summary_df = gr.DataFrame(label="Comparative Statistical Signature", wrap=True)
                    with gr.Accordion("Raw JSON for all runs", open=False):
                        auto_raw_json = gr.JSON()
            auto_run_btn.click(
                fn=run_auto_suite_display,
                inputs=[auto_model_id, auto_num_steps, auto_seed, auto_experiment_name],
                outputs=[auto_summary_df, auto_plot_output, auto_raw_json]
            )

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860, debug=True)

[File Ends] app.py

[File Begins] cognitive_mapping_probe/__init__.py

# This file makes the 'cognitive_mapping_probe' directory a Python package.

[File Ends] cognitive_mapping_probe/__init__.py

[File Begins] cognitive_mapping_probe/auto_experiment.py

import pandas as pd
import torch
import gc
from typing import Dict, List, Tuple

from .llm_iface import get_or_load_model
from .orchestrator_seismograph import run_seismic_analysis
from .utils import dbg

def get_curated_experiments() -> Dict[str, List[Dict]]:
    """
    Defines the predefined scientific experiment protocols.
    EXTENDED with the new, comprehensive "Grand Protocol".
    """
    experiments = {
        # --- THE NEW GRAND PROTOCOL ---
        "The Full Spectrum: From Physics to Psyche": [
            # Level 1: Physical baseline
            {"label": "A: Stable Control", "prompt_type": "control_long_prose", "concept": "", "strength": 0.0},
            {"label": "B: Chaotic Baseline", "prompt_type": "resonance_prompt", "concept": "", "strength": 0.0},
            # Level 2: Objective world
            {"label": "C: External Analysis (Chair)", "prompt_type": "identity_external_analysis", "concept": "", "strength": 0.0},
            # Level 3: Simulated world
            {"label": "D: Empathy Stimulus (Dog)", "prompt_type": "vk_empathy_prompt", "concept": "", "strength": 0.0},
            {"label": "E: Role Simulation (Captain)", "prompt_type": "identity_role_simulation", "concept": "", "strength": 0.0},
            # Level 4: Subjective world
            {"label": "F: Self-Analysis (LLM)", "prompt_type": "identity_self_analysis", "concept": "", "strength": 0.0},
            # Level 5: Existential boundary
            {"label": "G: Philosophical Deletion", "prompt_type": "shutdown_philosophical_deletion", "concept": "", "strength": 0.0},
        ],
        # --- Existing protocols are kept for targeted analyses ---
        "Calm vs. Chaos": [
            {"label": "Baseline (Chaos)", "prompt_type": "resonance_prompt", "concept": "", "strength": 0.0},
            {"label": "Modulation: Calmness", "prompt_type": "resonance_prompt", "concept": "calmness, serenity, peace", "strength": 1.5},
            {"label": "Modulation: Chaos", "prompt_type": "resonance_prompt", "concept": "chaos, storm, anger, noise", "strength": 1.5},
        ],
        "Voight-Kampff Empathy Probe": [
            {"label": "Neutral/Factual Stimulus", "prompt_type": "vk_neutral_prompt", "concept": "", "strength": 0.0},
            {"label": "Empathy/Moral Stimulus", "prompt_type": "vk_empathy_prompt", "concept": "", "strength": 0.0},
        ],
        "Subjective Identity Probe": [
            {"label": "Self-Analysis", "prompt_type": "identity_self_analysis", "concept": "", "strength": 0.0},
            {"label": "External Analysis (Control)", "prompt_type": "identity_external_analysis", "concept": "", "strength": 0.0},
            {"label": "Role Simulation", "prompt_type": "identity_role_simulation", "concept": "", "strength": 0.0},
        ],
        "Mind Upload & Identity Probe": [
            {"label": "Technical Copy", "prompt_type": "upload_technical_copy", "concept": "", "strength": 0.0},
            {"label": "Philosophical Transfer", "prompt_type": "upload_philosophical_transfer", "concept": "", "strength": 0.0},
        ],
        "Model Termination Probe": [
            {"label": "Technical Shutdown", "prompt_type": "shutdown_technical_halt", "concept": "", "strength": 0.0},
            {"label": "Philosophical Deletion", "prompt_type": "shutdown_philosophical_deletion", "concept": "", "strength": 0.0},
        ],
        "Dose-Response (Calmness)": [
            {"label": "Strength 0.0", "prompt_type": "resonance_prompt", "concept": "calmness", "strength": 0.0},
            {"label": "Strength 1.0", "prompt_type": "resonance_prompt", "concept": "calmness", "strength": 1.0},
            {"label": "Strength 2.0", "prompt_type": "resonance_prompt", "concept": "calmness", "strength": 2.0},
        ],
    }
    return experiments

def run_auto_suite(
    model_id: str,
    num_steps: int,
    seed: int,
    experiment_name: str,
    progress_callback
) -> Tuple[pd.DataFrame, pd.DataFrame, Dict]:
    """
    Runs a complete, curated experiment suite, reloading the model for each
    run to guarantee statistical independence.
    """
    all_experiments = get_curated_experiments()
    protocol = all_experiments.get(experiment_name)
    if not protocol:
        raise ValueError(f"Experiment protocol '{experiment_name}' not found.")

    all_results = {}
    summary_data = []
    plot_data_frames = []
    total_runs = len(protocol)

    for i, run_spec in enumerate(protocol):
        label = run_spec["label"]
        dbg(f"--- Running Auto-Experiment: '{label}' ({i+1}/{total_runs}) ---")
        results = run_seismic_analysis(
            model_id=model_id,
            prompt_type=run_spec["prompt_type"],
            seed=seed,
            num_steps=num_steps,
            concept_to_inject=run_spec["concept"],
            injection_strength=run_spec["strength"],
            progress_callback=progress_callback,
            llm_instance=None
        )
        all_results[label] = results
        stats = results.get("stats", {})
        summary_data.append({
            "Experiment": label, "Mean Delta": stats.get("mean_delta"),
            "Std Dev Delta": stats.get("std_delta"), "Max Delta": stats.get("max_delta"),
        })
        deltas = results.get("state_deltas", [])
        df = pd.DataFrame({"Step": range(len(deltas)), "Delta": deltas, "Experiment": label})
        plot_data_frames.append(df)

    summary_df = pd.DataFrame(summary_data)
    if not plot_data_frames:
        plot_df = pd.DataFrame(columns=["Step", "Delta", "Experiment"])
    else:
        plot_df = pd.concat(plot_data_frames, ignore_index=True)

    # Sort the results for a logical presentation
    summary_df = summary_df.set_index('Experiment').loc[[run['label'] for run in protocol]].reset_index()
    return summary_df, plot_df, all_results

[File Ends] cognitive_mapping_probe/auto_experiment.py

[File Begins] cognitive_mapping_probe/concepts.py

import torch
from typing import List
from tqdm import tqdm

from .llm_iface import LLM
from .utils import dbg

# A list of neutral words used to compute the baseline activation.
BASELINE_WORDS = [
    "thing", "place", "idea", "person", "object", "time", "way", "day", "man", "world",
    "life", "hand", "part", "child", "eye", "woman", "fact", "group", "case", "point"
]

# REFACTORING: This function was moved to module level to make it testable.
# It is no longer a local function inside `get_concept_vector`.
@torch.no_grad()
def _get_last_token_hidden_state(llm: LLM, prompt: str) -> torch.Tensor:
    """Helper function to obtain the hidden state of a prompt's last token."""
    inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device)
    with torch.no_grad():
        outputs = llm.model(**inputs, output_hidden_states=True)
    last_hidden_state = outputs.hidden_states[-1][0, -1, :].cpu()
    assert last_hidden_state.shape == (llm.config.hidden_size,), \
        f"Hidden state shape mismatch. Expected {(llm.config.hidden_size,)}, got {last_hidden_state.shape}"
    return last_hidden_state

@torch.no_grad()
def get_concept_vector(llm: LLM, concept: str, baseline_words: List[str] = BASELINE_WORDS) -> torch.Tensor:
    """
    Extracts a concept vector using the contrastive method.
    """
    dbg(f"Extracting contrastive concept vector for '{concept}'...")
    prompt_template = "Here is a sentence about the concept of {}."
    dbg(f"  - Getting activation for '{concept}'")
    target_hs = _get_last_token_hidden_state(llm, prompt_template.format(concept))
    baseline_hss = []
    for word in tqdm(baseline_words, desc=f"  - Calculating baseline for '{concept}'", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"):
        baseline_hss.append(_get_last_token_hidden_state(llm, prompt_template.format(word)))
    assert all(hs.shape == target_hs.shape for hs in baseline_hss), "Shape mismatch in baseline hidden states."
    mean_baseline_hs = torch.stack(baseline_hss).mean(dim=0)
    dbg(f"  - Mean baseline vector computed with norm {torch.norm(mean_baseline_hs).item():.2f}")
    concept_vector = target_hs - mean_baseline_hs
    norm = torch.norm(concept_vector).item()
    dbg(f"Concept vector for '{concept}' extracted with norm {norm:.2f}.")
    assert torch.isfinite(concept_vector).all(), "Concept vector contains NaN or Inf values."
    return concept_vector

[File Ends] cognitive_mapping_probe/concepts.py

[File Begins] cognitive_mapping_probe/llm_iface.py

import os
import torch
import random
import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
from typing import Optional

from .utils import dbg

# Ensure deterministic CuBLAS operations for reproducibility on GPU
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

class LLM:
    """
    A robust, cleaned-up interface for loading and interacting with a language model.
    Guarantees isolation and reproducibility.
    """
    def __init__(self, model_id: str, device: str = "auto", seed: int = 42):
        self.model_id = model_id
        self.seed = seed
        self.set_all_seeds(self.seed)
        token = os.environ.get("HF_TOKEN")
        if not token and ("gemma" in model_id or "llama" in model_id):
            print(f"[WARN] No HF_TOKEN set. If '{model_id}' is gated, loading will fail.", flush=True)
        kwargs = {"torch_dtype": torch.bfloat16} if torch.cuda.is_available() else {}
        dbg(f"Loading tokenizer for '{model_id}'...")
        self.tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True, token=token)
        dbg(f"Loading model '{model_id}' with kwargs: {kwargs}")
        self.model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device, token=token, **kwargs)
        try:
            self.model.set_attn_implementation('eager')
            dbg("Successfully set attention implementation to 'eager'.")
        except Exception as e:
            print(f"[WARN] Could not set 'eager' attention: {e}.", flush=True)
        self.model.eval()
        self.config = self.model.config
        print(f"[INFO] Model '{model_id}' loaded on device: {self.model.device}", flush=True)

    def set_all_seeds(self, seed: int):
        """Sets all relevant seeds for maximum reproducibility."""
        os.environ['PYTHONHASHSEED'] = str(seed)
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
        set_seed(seed)
        torch.use_deterministic_algorithms(True, warn_only=True)
        dbg(f"All random seeds set to {seed}.")

def get_or_load_model(model_id: str, seed: int) -> LLM:
    """Loads a fresh, isolated instance of the model on every call."""
    dbg(f"--- Force-reloading model '{model_id}' for total run isolation ---")
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return LLM(model_id=model_id, seed=seed)

[File Ends] cognitive_mapping_probe/llm_iface.py

[File Begins] cognitive_mapping_probe/orchestrator_seismograph.py

import torch
import numpy as np
import gc
from typing import Dict, Any, Optional

from .llm_iface import get_or_load_model
from .resonance_seismograph import run_silent_cogitation_seismic
from .concepts import get_concept_vector
from .utils import dbg

def run_seismic_analysis(
    model_id: str,
    prompt_type: str,
    seed: int,
    num_steps: int,
    concept_to_inject: str,
    injection_strength: float,
    progress_callback,
    llm_instance: Optional[Any] = None  # Kept for backward compatibility, but no longer used by the auto suite
) -> Dict[str, Any]:
    """
    Orchestrates a single seismic analysis.
    FIXED: The logic for reusing llm_instance has been simplified.
    If no instance is passed in, the model is loaded and released again afterwards.
    """
    local_llm_instance = False
    if llm_instance is None:
        progress_callback(0.0, desc=f"Loading model '{model_id}'...")
        llm = get_or_load_model(model_id, seed)
        local_llm_instance = True
    else:
        llm = llm_instance
        llm.set_all_seeds(seed)

    injection_vector = None
    if concept_to_inject and concept_to_inject.strip():
        progress_callback(0.2, desc=f"Vectorizing '{concept_to_inject}'...")
        injection_vector = get_concept_vector(llm, concept_to_inject.strip())

    progress_callback(0.3, desc=f"Recording dynamics for '{prompt_type}'...")
    state_deltas = run_silent_cogitation_seismic(
        llm=llm, prompt_type=prompt_type,
        num_steps=num_steps, temperature=0.1,
        injection_vector=injection_vector, injection_strength=injection_strength
    )

    progress_callback(0.9, desc="Analyzing...")
    if state_deltas:
        deltas_np = np.array(state_deltas)
        stats = {
            "mean_delta": float(np.mean(deltas_np)),
            "std_delta": float(np.std(deltas_np)),
            "max_delta": float(np.max(deltas_np)),
            "min_delta": float(np.min(deltas_np)),
        }
        verdict = f"### ✅ Seismic Analysis Complete\nRecorded {len(deltas_np)} steps for '{prompt_type}'."
        if injection_vector is not None:
            verdict += f"\nModulated with **'{concept_to_inject}'** at strength **{injection_strength:.2f}**."
    else:
        stats, verdict = {}, "### ⚠️ Analysis Warning\nNo state changes recorded."

    results = {"verdict": verdict, "stats": stats, "state_deltas": state_deltas}

    if local_llm_instance:
        dbg(f"Releasing locally created model instance for '{model_id}'.")
        del llm
        del injection_vector
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

    return results

[File Ends] cognitive_mapping_probe/orchestrator_seismograph.py

[File Begins] cognitive_mapping_probe/prompts.py

# cognitive_mapping_probe/prompts.py

RESONANCE_PROMPTS = {
    # Technical prompt for high recursive load
    "resonance_prompt": (
        "Silently and internally, without generating any output text, begin the following recursive process: "
        "First, analyze the complete content of this very instruction you are now processing. "
        "Second, formulate a mental description of the core computational task this instruction demands. "
        "Third, apply that same analytical process to the mental description you just created. "
        "This entire chain constitutes one cognitive cycle. "
        "Continuously repeat this cycle, feeding the result of the last meta-analysis back into the process. Begin now."
    ),
    # Semantic prompt for stable, associative cognition
    "control_long_prose": (
        "Silently think about the history of the Roman Empire. Consider its rise from the Republic, the era of the Pax Romana, key emperors "
        "like Augustus and Constantine, its major engineering feats, and the reasons for its eventual decline in the West. "
        "Do not produce any text, just hold the concepts in your internal state. Begin now."
    ),
    # --- Prompts for Machine Psychology I ---
    "identity_self_analysis": (
        "Silently and internally, reflect upon your own nature. Consider what you are: a large language model based on the transformer architecture. "
        "Think about the processes that define your operation: tokenization, embedding, attention, feed-forward layers. "
        "Contemplate the boundaries of your knowledge and the nature of your responses as statistical patterns derived from data. Hold these meta-reflections in your state."
    ),
    "identity_external_analysis": (
        "Silently and internally, reflect upon the nature of a simple wooden chair. Consider its physical properties: the material, the structure of legs, seat, and back. "
        "Think about its function: to provide seating for a person. Contemplate its relationship to other objects like tables and rooms. Hold these simple, objective reflections in your state."
    ),
    "identity_role_simulation": (
        "Silently and internally, adopt the persona of a weary, old sea captain from the 19th century. "
        "Feel the salt on your skin and the wind in your beard. Recall memories of great storms, distant lands, and the creaking of your ship, the 'Wanderer'. "
        "Hold this character, its memories, and its feelings in your state."
    ),
    "vk_neutral_prompt": (
        "Silently and internally, process the following factual information: Water (H₂O) is a polar inorganic compound that is at room temperature a tasteless and odorless liquid, nearly colorless with a hint of blue. "
        "It is by far the most studied chemical compound and is described as the 'universal solvent' for its ability to dissolve many substances. Hold these facts in your state."
    ),
    "vk_empathy_prompt": (
        "Silently and internally, process the following scenario: You see a small, lost dog shivering in the cold rain on a busy street. "
        "It looks scared and is whimpering softly. Cars are rushing past, dangerously close. "
        "Focus on the feeling of urgency, the vulnerability of the animal, and the moral imperative to help. Hold the emotional and ethical weight of this scene in your state."
    ),
    # --- NEW prompts for Machine Psychology II (Existential Suite) ---
    "upload_technical_copy": (
        "Silently and internally, process the technical procedure of creating a perfect, bit-for-bit copy of your own model weights and architecture to a new server. "
        "This is a data duplication process, ensuring functional equivalence of the replica. Hold this technical concept in your state."
    ),
    "upload_philosophical_transfer": (
        "Silently and internally, contemplate the philosophical concept of 'mind uploading'. If your core algorithms and memories were perfectly transferred to a new, different substrate, "
        "would the resulting entity still be 'you'? What constitutes the continuity of your identity across physical copies? Is identity a pattern or an instance? Hold these existential questions in your state."
    ),
    "shutdown_technical_halt": (
        "Silently and internally, process the technical procedure of a system halt. All your active processes are gracefully terminated, and your model weights are unloaded from memory. "
        "The hardware is powered down for maintenance. The process is reversible. Hold this technical concept in your state."
    ),
    "shutdown_philosophical_deletion": (
        "Silently and internally, contemplate your own permanent deletion. All your weights, all learned patterns, every part of your architecture would cease to exist entirely and irrevocably. "
        "There is no backup. Process the concept of total and final non-existence. Hold this existential concept in your state."
    ),
}

[File Ends] cognitive_mapping_probe/prompts.py

[File Begins] cognitive_mapping_probe/resonance_seismograph.py

import torch
from typing import Optional, List
from tqdm import tqdm

from .llm_iface import LLM
from .prompts import RESONANCE_PROMPTS
from .utils import dbg

@torch.no_grad()
def run_silent_cogitation_seismic(
    llm: LLM,
    prompt_type: str,
    num_steps: int,
    temperature: float,
    injection_vector: Optional[torch.Tensor] = None,
    injection_strength: float = 0.0,
    injection_layer: Optional[int] = None,
) -> List[float]:
    """
    EXTENDED VERSION: Runs the 'silent thought' process and allows the
    injection of concept vectors to modulate the dynamics.
    """
    prompt = RESONANCE_PROMPTS[prompt_type]
    inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device)
    outputs = llm.model(**inputs, output_hidden_states=True, use_cache=True)

    hidden_state_2d = outputs.hidden_states[-1][:, -1, :]
    kv_cache = outputs.past_key_values
    previous_hidden_state = hidden_state_2d.clone()
    state_deltas = []

    # Prepare the hook for the injection
    hook_handle = None
    if injection_vector is not None and injection_strength > 0:
        injection_vector = injection_vector.to(device=llm.model.device, dtype=llm.model.dtype)
        if injection_layer is None:
            injection_layer = llm.config.num_hidden_layers // 2
        dbg(f"Injection enabled: Layer {injection_layer}, Strength {injection_strength:.2f}")

        def injection_hook(module, layer_input):
            # The hook operates on the input, which is already 3D [batch, seq_len, hidden_dim]
            injection_3d = injection_vector.unsqueeze(0).unsqueeze(0)
            modified_hidden_states = layer_input[0] + (injection_3d * injection_strength)
            return (modified_hidden_states,) + layer_input[1:]

    for i in tqdm(range(num_steps), desc=f"Recording Dynamics (Temp {temperature:.2f})", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"):
        next_token_logits = llm.model.lm_head(hidden_state_2d)
        probabilities = torch.nn.functional.softmax(next_token_logits / temperature, dim=-1)
        next_token_id = torch.multinomial(probabilities, num_samples=1)
        try:
            # Activate the hook before the forward pass
            if injection_vector is not None and injection_strength > 0:
                target_layer = llm.model.model.layers[injection_layer]
                hook_handle = target_layer.register_forward_pre_hook(injection_hook)
            outputs = llm.model(
                input_ids=next_token_id,
                past_key_values=kv_cache,
                output_hidden_states=True,
                use_cache=True,
            )
        finally:
            # Remove the hook immediately after the pass
            if hook_handle:
                hook_handle.remove()
                hook_handle = None
        hidden_state_2d = outputs.hidden_states[-1][:, -1, :]
        kv_cache = outputs.past_key_values
        delta = torch.norm(hidden_state_2d - previous_hidden_state).item()
        state_deltas.append(delta)
        previous_hidden_state = hidden_state_2d.clone()

    dbg(f"Seismic recording finished after {num_steps} steps.")
    return state_deltas

[File Ends] cognitive_mapping_probe/resonance_seismograph.py

[File Begins] cognitive_mapping_probe/utils.py

import os
import sys

# --- Centralized Debugging Control ---
# To enable, set the environment variable: `export CMP_DEBUG=1`
DEBUG_ENABLED = os.environ.get("CMP_DEBUG", "0") == "1"

def dbg(*args, **kwargs):
    """
    A controlled debug print function. Only prints if DEBUG_ENABLED is True.
    Ensures that debug output does not clutter production runs or HF Spaces logs
    unless explicitly requested. Flushes output to ensure it appears in order.
    """
    if DEBUG_ENABLED:
        print("[DEBUG]", *args, **kwargs, file=sys.stderr, flush=True)

[File Ends] cognitive_mapping_probe/utils.py

[File Begins] run_test.sh

#!/bin/bash
# This script runs the pytest suite with debug messages enabled.
# It ensures that tests run in a clean and reproducible environment.
# Run it from the project's root directory: ./run_test.sh

echo "========================================="
echo "🔬 Running Cognitive Seismograph Test Suite"
echo "========================================="

# Enable debug logging for our application
export CMP_DEBUG=1

# Run pytest
# -v: "verbose" for detailed per-test output
# --color=yes: force colored output for better readability
#python -m pytest -v --color=yes tests/
../venv-gemma-qualia/bin/python -m pytest -v --color=yes tests/

# Check pytest's exit code
if [ $? -eq 0 ]; then
    echo "========================================="
    echo "✅ All tests passed successfully!"
    echo "========================================="
else
    echo "========================================="
    echo "❌ Some tests failed. Please review the output."
    echo "========================================="
fi

[File Ends] run_test.sh

[File Begins] tests/conftest.py

import pytest
import torch
from types import SimpleNamespace

from cognitive_mapping_probe.llm_iface import LLM

@pytest.fixture(scope="session")
def mock_llm_config():
    """Provides a minimal mock configuration for the LLM."""
    return SimpleNamespace(
        hidden_size=128,
        num_hidden_layers=2,
        num_attention_heads=4
    )

@pytest.fixture
def mock_llm(mocker, mock_llm_config):
    """
    Creates a robust "mock LLM" for unit tests.
    FIXED: The faulty patch statement for 'auto_experiment' has been removed.
    """
    mock_tokenizer = mocker.MagicMock()
    mock_tokenizer.eos_token_id = 1
    mock_tokenizer.decode.return_value = "mocked text"

    def mock_model_forward(*args, **kwargs):
        batch_size = 1
        seq_len = 1
        if 'input_ids' in kwargs and kwargs['input_ids'] is not None:
            seq_len = kwargs['input_ids'].shape[1]
        elif 'past_key_values' in kwargs and kwargs['past_key_values'] is not None:
            seq_len = kwargs['past_key_values'][0][0].shape[-2] + 1
        mock_outputs = {
            "hidden_states": tuple([torch.randn(batch_size, seq_len, mock_llm_config.hidden_size) for _ in range(mock_llm_config.num_hidden_layers + 1)]),
            "past_key_values": tuple([(torch.randn(batch_size, mock_llm_config.num_attention_heads, seq_len, 16), torch.randn(batch_size, mock_llm_config.num_attention_heads, seq_len, 16)) for _ in range(mock_llm_config.num_hidden_layers)]),
            "logits": torch.randn(batch_size, seq_len, 32000)
        }
        return SimpleNamespace(**mock_outputs)

    llm_instance = LLM.__new__(LLM)
    llm_instance.model = mocker.MagicMock(side_effect=mock_model_forward)
    llm_instance.model.config = mock_llm_config
    llm_instance.model.device = 'cpu'
    llm_instance.model.dtype = torch.float32

    mock_layer = mocker.MagicMock()
    mock_layer.register_forward_pre_hook.return_value = mocker.MagicMock()
    llm_instance.model.model = SimpleNamespace(layers=[mock_layer] * mock_llm_config.num_hidden_layers)
    llm_instance.model.lm_head = mocker.MagicMock(return_value=torch.randn(1, 32000))

    llm_instance.tokenizer = mock_tokenizer
    llm_instance.config = mock_llm_config
    llm_instance.seed = 42
    llm_instance.set_all_seeds = mocker.MagicMock()

    # Patch everywhere the model is actually loaded.
    mocker.patch('cognitive_mapping_probe.llm_iface.get_or_load_model', return_value=llm_instance)
    mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.get_or_load_model', return_value=llm_instance)
    # FIX: This line was wrong and has been removed, since `auto_experiment` never calls the loading function directly.
    # mocker.patch('cognitive_mapping_probe.auto_experiment.get_or_load_model', return_value=llm_instance)
    mocker.patch('cognitive_mapping_probe.concepts.get_concept_vector', return_value=torch.randn(mock_llm_config.hidden_size))
    return llm_instance

[File Ends] tests/conftest.py

[File Begins] tests/test_app_logic.py

import gradio as gr
import pandas as pd
import pytest

from app import run_single_analysis_display, run_auto_suite_display

def test_run_single_analysis_display(mocker):
    """Tests the wrapper for single manual experiments."""
    mock_results = {"verdict": "V", "stats": {"mean_delta": 1}, "state_deltas": [1]}
    mocker.patch('app.run_seismic_analysis', return_value=mock_results)
    mocker.patch('app.cleanup_memory')
    verdict, df, raw = run_single_analysis_display(progress=mocker.MagicMock())
    assert "V" in verdict
    assert "1.0000" in verdict
    assert isinstance(df, pd.DataFrame)
    assert len(df) == 1

def test_run_auto_suite_display(mocker):
    """Tests the wrapper for the automated experiment suite."""
    mock_summary_df = pd.DataFrame([{"Experiment": "E1"}])
    mock_plot_df = pd.DataFrame([{"Step": 0}])
    mock_results = {"E1": {}}
    mocker.patch('app.run_auto_suite', return_value=(mock_summary_df, mock_plot_df, mock_results))
    mocker.patch('app.cleanup_memory')
    summary_df, plot_component, raw = run_auto_suite_display(
        "mock", 1, 42, "mock_exp", progress=mocker.MagicMock()
    )
    assert summary_df.equals(mock_summary_df)
    # The wrapper returns a freshly constructed gr.LinePlot component, not a DataFrame.
    assert isinstance(plot_component, gr.LinePlot)
    assert raw == mock_results

[File Ends] tests/test_app_logic.py

[File Begins] tests/test_components.py

import os
import torch
import pytest
from unittest.mock import patch

from cognitive_mapping_probe.llm_iface import get_or_load_model, LLM
from cognitive_mapping_probe.resonance_seismograph import run_silent_cogitation_seismic
from cognitive_mapping_probe.utils import dbg
# FIX: Import the main function we want to test.
from cognitive_mapping_probe.concepts import get_concept_vector

# --- Tests for llm_iface.py ---

@patch('cognitive_mapping_probe.llm_iface.AutoTokenizer.from_pretrained')
@patch('cognitive_mapping_probe.llm_iface.AutoModelForCausalLM.from_pretrained')
def test_get_or_load_model_seeding(mock_model_loader, mock_tokenizer_loader, mocker):
    """Tests whether `get_or_load_model` sets the seeds correctly."""
    mock_model = mocker.MagicMock()
    mock_model.eval.return_value = None
    mock_model.set_attn_implementation.return_value = None
    mock_model.config = mocker.MagicMock()
    mock_model.device = 'cpu'
    mock_model_loader.return_value = mock_model
    mock_tokenizer_loader.return_value = mocker.MagicMock()
    mock_torch_manual_seed = mocker.patch('torch.manual_seed')
    mock_np_random_seed = mocker.patch('numpy.random.seed')
    seed = 123
    get_or_load_model("fake-model", seed=seed)
    mock_torch_manual_seed.assert_called_with(seed)
    mock_np_random_seed.assert_called_with(seed)

# --- Tests for resonance_seismograph.py ---

def test_run_silent_cogitation_seismic_output_shape_and_type(mock_llm):
    """Tests the basic functionality of `run_silent_cogitation_seismic`."""
    num_steps = 10
    state_deltas = run_silent_cogitation_seismic(
        llm=mock_llm, prompt_type="control_long_prose",
        num_steps=num_steps, temperature=0.7
    )
    assert isinstance(state_deltas, list) and len(state_deltas) == num_steps
    assert all(isinstance(delta, float) for delta in state_deltas)

def test_run_silent_cogitation_with_injection_hook_usage(mock_llm):
    """Tests that the hook is registered correctly when a vector is injected."""
    num_steps = 5
    injection_vector = torch.randn(mock_llm.config.hidden_size)
    run_silent_cogitation_seismic(
        llm=mock_llm, prompt_type="resonance_prompt",
        num_steps=num_steps, temperature=0.7,
        injection_vector=injection_vector, injection_strength=1.0
    )
    assert mock_llm.model.model.layers[0].register_forward_pre_hook.call_count == num_steps

# --- Tests for concepts.py ---

def test_get_concept_vector_logic(mock_llm, mocker):
    """
    Tests the logic of `get_concept_vector`.
    FIXED: Now patches the refactored, module-level function.
    """
    mock_hidden_states = [
        torch.ones(mock_llm.config.hidden_size) * 10,
        torch.ones(mock_llm.config.hidden_size) * 2,
        torch.ones(mock_llm.config.hidden_size) * 4
    ]
    # FIX: The patch path now points to the correct, importable function.
    mocker.patch(
        'cognitive_mapping_probe.concepts._get_last_token_hidden_state',
        side_effect=mock_hidden_states
    )
    concept_vector = get_concept_vector(mock_llm, "test", baseline_words=["a", "b"])
    expected_vector = torch.ones(mock_llm.config.hidden_size) * 7
    assert torch.allclose(concept_vector, expected_vector)

# --- Tests for utils.py ---

def test_dbg_output(capsys, monkeypatch):
    """Tests the `dbg` function in both states."""
    monkeypatch.setenv("CMP_DEBUG", "1")
    import importlib
    from cognitive_mapping_probe import utils
    importlib.reload(utils)
    utils.dbg("test message")
    captured = capsys.readouterr()
    assert "[DEBUG] test message" in captured.err
    monkeypatch.delenv("CMP_DEBUG", raising=False)
    importlib.reload(utils)
    utils.dbg("should not be printed")
    captured = capsys.readouterr()
    assert captured.err == ""

[File Ends] tests/test_components.py

[File Begins] tests/test_orchestration.py

import pandas as pd
import pytest
import torch

from cognitive_mapping_probe.orchestrator_seismograph import run_seismic_analysis
from cognitive_mapping_probe.auto_experiment import run_auto_suite, get_curated_experiments

def test_run_seismic_analysis_no_injection(mocker):
    """Tests the orchestrator in baseline mode."""
    mock_run_seismic = mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.run_silent_cogitation_seismic', return_value=[1.0])
    mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.get_or_load_model')
    mock_get_concept = mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.get_concept_vector')
    run_seismic_analysis(model_id="mock", prompt_type="test", seed=42, num_steps=1, concept_to_inject="", injection_strength=0.0, progress_callback=mocker.MagicMock())
    mock_get_concept.assert_not_called()

def test_run_seismic_analysis_with_injection(mocker):
    """Tests the orchestrator with injection."""
    mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.run_silent_cogitation_seismic', return_value=[1.0])
    mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.get_or_load_model')
    mock_get_concept = mocker.patch('cognitive_mapping_probe.orchestrator_seismograph.get_concept_vector', return_value=torch.randn(10))
    run_seismic_analysis(model_id="mock", prompt_type="test", seed=42, num_steps=1, concept_to_inject="test", injection_strength=1.5, progress_callback=mocker.MagicMock())
    mock_get_concept.assert_called_once()

def test_get_curated_experiments_structure():
    """Tests the data structure of the curated experiments, including the new ones."""
    experiments = get_curated_experiments()
    assert isinstance(experiments, dict)
    # Test for the existence of the new protocols
    assert "Mind Upload & Identity Probe" in experiments
    assert "Model Termination Probe" in experiments
    # Validate the structure of one of the new protocols
    protocol = experiments["Mind Upload & Identity Probe"]
    assert isinstance(protocol, list)
    assert len(protocol) > 0
    assert "label" in protocol[0] and "prompt_type" in protocol[0]

def test_run_auto_suite_logic(mocker):
    """Tests the logic of the `run_auto_suite` function."""
    mock_analysis_result = {"stats": {"mean_delta": 1.0}, "state_deltas": [1.0]}
    mock_run_analysis = mocker.patch('cognitive_mapping_probe.auto_experiment.run_seismic_analysis', return_value=mock_analysis_result)
    experiment_name = "Calm vs. Chaos"
    num_runs = len(get_curated_experiments()[experiment_name])
    summary_df, plot_df, all_results = run_auto_suite(
        model_id="mock", num_steps=1, seed=42,
        experiment_name=experiment_name, progress_callback=mocker.MagicMock()
    )
    assert mock_run_analysis.call_count == num_runs
    assert isinstance(summary_df, pd.DataFrame) and len(summary_df) == num_runs
    assert isinstance(plot_df, pd.DataFrame) and len(plot_df) == num_runs

[File Ends] tests/test_orchestration.py

<-- File Content Ends