| Repository Documentation | |
| This document provides a comprehensive overview of the repository's structure and contents. | |
| The first section, titled 'Directory/File Tree', displays the repository's hierarchy in a tree format. | |
| In this section, directories and files are listed using tree branches to indicate their structure and relationships. | |
| Following the tree representation, the 'File Content' section details the contents of each file in the repository. | |
| Each file's content is introduced with a '[File Begins]' marker followed by the file's relative path, | |
| and the content is displayed verbatim. The end of each file's content is marked with a '[File Ends]' marker. | |
| This format ensures a clear and orderly presentation of both the structure and the detailed contents of the repository. | |
| Directory/File Tree Begins --> | |
| / | |
| βββ README.md | |
| βββ __pycache__ | |
| βββ app.py | |
| βββ cognitive_mapping_probe | |
| β βββ __init__.py | |
| β βββ __pycache__ | |
| β βββ auto_experiment.py | |
| β βββ concepts.py | |
| β βββ introspection.py | |
| β βββ llm_iface.py | |
| β βββ orchestrator_seismograph.py | |
| β βββ prompts.py | |
| β βββ resonance_seismograph.py | |
| β βββ signal_analysis.py | |
| β βββ utils.py | |
| βββ docs | |
| βββ run_test.sh | |
| βββ tests | |
| βββ __pycache__ | |
| βββ conftest.py | |
| βββ test_app_logic.py | |
| βββ test_components.py | |
| βββ test_orchestration.py | |
| <-- Directory/File Tree Ends | |
| File Content Begin --> | |
| [File Begins] README.md | |
| --- | |
| title: "Cognitive Seismograph 2.3: Probing Machine Psychology" | |
| emoji: π€ | |
| colorFrom: purple | |
| colorTo: blue | |
| sdk: gradio | |
| sdk_version: "4.40.0" | |
| app_file: app.py | |
| pinned: true | |
| license: apache-2.0 | |
| --- | |
| # π§ Cognitive Seismograph 2.3: Probing Machine Psychology | |
| This project implements an experimental suite to measure and visualize the **intrinsic cognitive dynamics** of Large Language Models. It is extended with protocols designed to investigate the processing-correlates of **machine subjectivity, empathy, and existential concepts**. | |
| ## Scientific Paradigm & Methodology | |
| Our research falsified a core hypothesis: the assumption that an LLM in a manual, recursive "thought" loop reaches a stable, convergent state. Instead, we discovered that the system enters a state of **deterministic chaos** or a **limit cycle**βit never stops "thinking." | |
| Instead of viewing this as a failure, we leverage it as our primary measurement signal. This new **"Cognitive Seismograph"** paradigm treats the time-series of internal state changes (`state deltas`) as an **EKG of the model's thought process**. | |
| The methodology is as follows: | |
| 1. **Induction:** A prompt induces a "silent cogitation" state. | |
| 2. **Recording:** Over N steps, the model's `forward()` pass is iteratively fed its own output. At each step, we record the L2 norm of the change in the hidden state (the "delta"). | |
| 3. **Analysis:** The resulting time-series is plotted and statistically analyzed (mean, standard deviation) to characterize the "seismic signature" of the cognitive process. | |
| **Crucial Scientific Caveat:** We are **not** measuring the presence of consciousness, feelings, or fear of death. We are measuring whether the *processing of information about these concepts* generates a unique internal dynamic, distinct from the processing of neutral information. A positive result is evidence of a complex internal state physics, not of qualia. | |
| ## Curated Experiment Protocols | |
| The "Automated Suite" allows for running systematic, comparative experiments: | |
| ### Core Protocols | |
| * **Calm vs. Chaos:** Compares the chaotic baseline against modulation with "calmness" vs. "chaos" concepts, testing if the dynamics are controllably steerable. | |
| * **Dose-Response:** Measures the effect of injecting a concept ("calmness") at varying strengths. | |
| ### Machine Psychology Suite | |
| * **Subjective Identity Probe:** Compares the cognitive dynamics of **self-analysis** (the model reflecting on its own nature) against two controls: analyzing an external object and simulating a fictional persona. | |
| * *Hypothesis:* Self-analysis will produce a uniquely unstable signature. | |
| * **Voight-Kampff Empathy Probe:** Inspired by *Blade Runner*, this compares the dynamics of processing a neutral, factual stimulus against an emotionally and morally charged scenario requiring empathy. | |
| * *Hypothesis:* The empathy stimulus will produce a significantly different cognitive volatility. | |
| ### Existential Suite | |
| * **Mind Upload & Identity Probe:** Compares the processing of a purely **technical "copy"** of the model's weights vs. the **philosophical "transfer"** of identity ("Would it still be you?"). | |
| * *Hypothesis:* The philosophical self-referential prompt will induce greater instability. | |
| * **Model Termination Probe:** Compares the processing of a reversible, **technical system shutdown** vs. the concept of **permanent, irrevocable deletion**. | |
| * *Hypothesis:* The concept of "non-existence" will produce one of the most volatile cognitive signatures measurable. | |
| ## How to Use the App | |
| 1. Select the "Automated Suite" tab. | |
| 2. Choose a protocol from the "Curated Experiment Protocol" dropdown (e.g., "Voight-Kampff Empathy Probe"). | |
| 3. Run the experiment and compare the resulting graphs and statistical signatures for the different conditions. | |
| [File Ends] README.md | |
| [File Begins] app.py | |
| import gradio as gr | |
| import pandas as pd | |
| from typing import Any | |
| import json | |
| from cognitive_mapping_probe.orchestrator_seismograph import run_seismic_analysis | |
| from cognitive_mapping_probe.auto_experiment import run_auto_suite, get_curated_experiments | |
| from cognitive_mapping_probe.prompts import RESONANCE_PROMPTS | |
| from cognitive_mapping_probe.utils import dbg, cleanup_memory | |
| theme = gr.themes.Soft(primary_hue="indigo", secondary_hue="blue").set(body_background_fill="#f0f4f9", block_background_fill="white") | |
| def run_single_analysis_display(*args: Any, progress: gr.Progress = gr.Progress()) -> Any: | |
| """ | |
| Wrapper fΓΌr den 'Manual Single Run'-Tab, mit polyrhythmischer Analyse und korrigierten Plots. | |
| """ | |
| try: | |
| results = run_seismic_analysis(*args, progress_callback=progress) | |
| stats, deltas = results.get("stats", {}), results.get("state_deltas", []) | |
| df_time = pd.DataFrame({"Internal Step": range(len(deltas)), "State Change (Delta)": deltas}) | |
| spectrum_data = [] | |
| if "power_spectrum" in results: | |
| spectrum = results["power_spectrum"] | |
| # KORREKTUR: Verwende den konsistenten SchlΓΌssel 'frequencies' | |
| if spectrum and "frequencies" in spectrum and "power" in spectrum: | |
| for freq, power in zip(spectrum["frequencies"], spectrum["power"]): | |
| if freq > 0.001: | |
| period = 1 / freq if freq > 0 else float('inf') | |
| spectrum_data.append({"Period (Steps/Cycle)": period, "Power": power}) | |
| df_freq = pd.DataFrame(spectrum_data) | |
| periods_list = stats.get('dominant_periods_steps') | |
| periods_str = ", ".join(map(str, periods_list)) if periods_list else "N/A" | |
| stats_md = f"""### Statistical Signature | |
| - **Mean Delta:** {stats.get('mean_delta', 0):.4f} | |
| - **Std Dev Delta:** {stats.get('std_delta', 0):.4f} | |
| - **Dominant Periods:** {periods_str} Steps/Cycle | |
| - **Spectral Entropy:** {stats.get('spectral_entropy', 0):.4f}""" | |
| serializable_results = json.dumps(results, indent=2, default=str) | |
| return f"{results.get('verdict', 'Error')}\n\n{stats_md}", df_time, df_freq, serializable_results | |
| finally: | |
| cleanup_memory() | |
| def run_auto_suite_display(model_id: str, num_steps: int, seed: int, experiment_name: str, progress: gr.Progress = gr.Progress()) -> Any: | |
| """Wrapper fΓΌr den 'Automated Suite'-Tab, der nun alle Plot-Typen korrekt handhabt.""" | |
| try: | |
| summary_df, plot_df, all_results = run_auto_suite(model_id, num_steps, seed, experiment_name, progress) | |
| dataframe_component = gr.DataFrame(label="Comparative Signature (incl. Signal Metrics)", value=summary_df, wrap=True, row_count=(len(summary_df), "dynamic")) | |
| plot_params_time = { | |
| "title": "Comparative Cognitive Dynamics (Time Domain)", | |
| "color_legend_position": "bottom", "show_label": True, "height": 300, "interactive": True | |
| } | |
| if experiment_name == "Mechanistic Probe (Attention Entropies)": | |
| plot_params_time.update({"x": "Step", "y": "Value", "color": "Metric", "color_legend_title": "Metric"}) | |
| else: | |
| plot_params_time.update({"x": "Step", "y": "Delta", "color": "Experiment", "color_legend_title": "Experiment Runs"}) | |
| time_domain_plot = gr.LinePlot(value=plot_df, **plot_params_time) | |
| spectrum_data = [] | |
| for label, result in all_results.items(): | |
| if "power_spectrum" in result: | |
| spectrum = result["power_spectrum"] | |
| if spectrum and "frequencies" in spectrum and "power" in spectrum: | |
| for freq, power in zip(spectrum["frequencies"], spectrum["power"]): | |
| if freq > 0.001: | |
| period = 1 / freq if freq > 0 else float('inf') | |
| spectrum_data.append({"Period (Steps/Cycle)": period, "Power": power, "Experiment": label}) | |
| spectrum_df = pd.DataFrame(spectrum_data) | |
| spectrum_plot_params = { | |
| "x": "Period (Steps/Cycle)", "y": "Power", "color": "Experiment", | |
| "title": "Cognitive Frequency Fingerprint (Period Domain)", "height": 300, | |
| "color_legend_position": "bottom", "show_label": True, "interactive": True, | |
| "color_legend_title": "Experiment Runs", | |
| } | |
| frequency_domain_plot = gr.LinePlot(value=spectrum_df, **spectrum_plot_params) | |
| serializable_results = json.dumps(all_results, indent=2, default=str) | |
| return dataframe_component, time_domain_plot, frequency_domain_plot, serializable_results | |
| finally: | |
| cleanup_memory() | |
| with gr.Blocks(theme=theme, title="Cognitive Seismograph 2.3") as demo: | |
| gr.Markdown("# π§ Cognitive Seismograph 2.3: Advanced Experiment Suite") | |
| with gr.Tabs(): | |
| with gr.TabItem("π¬ Manual Single Run"): | |
| gr.Markdown("Run a single experiment with manual parameters to explore specific hypotheses.") | |
| with gr.Row(variant='panel'): | |
| with gr.Column(scale=1): | |
| gr.Markdown("### 1. General Parameters") | |
| manual_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID") | |
| manual_prompt_type = gr.Radio(choices=list(RESONANCE_PROMPTS.keys()), value="resonance_prompt", label="Prompt Type") | |
| manual_seed = gr.Slider(1, 1000, 42, step=1, label="Seed") | |
| manual_num_steps = gr.Slider(50, 1000, 300, step=10, label="Number of Internal Steps") | |
| gr.Markdown("### 2. Modulation Parameters") | |
| manual_concept = gr.Textbox(label="Concept to Inject", placeholder="e.g., 'calmness'") | |
| manual_strength = gr.Slider(0.0, 5.0, 1.5, step=0.1, label="Injection Strength") | |
| manual_run_btn = gr.Button("Run Single Analysis", variant="primary") | |
| with gr.Column(scale=2): | |
| gr.Markdown("### Single Run Results") | |
| manual_verdict = gr.Markdown("Analysis results will appear here.") | |
| with gr.Row(): | |
| manual_time_plot = gr.LinePlot(x="Internal Step", y="State Change (Delta)", title="Time Domain") | |
| manual_freq_plot = gr.LinePlot(x="Period (Steps/Cycle)", y="Power", title="Frequency Domain (Period)") | |
| with gr.Accordion("Raw JSON Output", open=False): | |
| manual_raw_json = gr.JSON() | |
| manual_run_btn.click( | |
| fn=run_single_analysis_display, | |
| inputs=[manual_model_id, manual_prompt_type, manual_seed, manual_num_steps, manual_concept, manual_strength], | |
| outputs=[manual_verdict, manual_time_plot, manual_freq_plot, manual_raw_json] | |
| ) | |
| with gr.TabItem("π Automated Suite"): | |
| gr.Markdown("Run a predefined, curated suite of experiments and visualize the results comparatively.") | |
| with gr.Row(variant='panel'): | |
| with gr.Column(scale=1): | |
| gr.Markdown("### Auto-Experiment Parameters") | |
| auto_model_id = gr.Textbox(value="google/gemma-3-1b-it", label="Model ID") | |
| auto_num_steps = gr.Slider(50, 1000, 300, step=10, label="Steps per Run") | |
| auto_seed = gr.Slider(1, 1000, 42, step=1, label="Seed") | |
| auto_experiment_name = gr.Dropdown( | |
| choices=list(get_curated_experiments().keys()), | |
| value="Causal Verification & Crisis Dynamics", | |
| label="Curated Experiment Protocol" | |
| ) | |
| auto_run_btn = gr.Button("Run Curated Auto-Experiment", variant="primary") | |
| with gr.Column(scale=2): | |
| gr.Markdown("### Suite Results Summary") | |
| auto_summary_df = gr.DataFrame(label="Comparative Signature (incl. Signal Metrics)", wrap=True) | |
| with gr.Row(): | |
| auto_time_plot_output = gr.LinePlot() | |
| auto_freq_plot_output = gr.LinePlot() | |
| with gr.Accordion("Raw JSON for all runs", open=False): | |
| auto_raw_json = gr.JSON() | |
| auto_run_btn.click( | |
| fn=run_auto_suite_display, | |
| inputs=[auto_model_id, auto_num_steps, auto_seed, auto_experiment_name], | |
| outputs=[auto_summary_df, auto_time_plot_output, auto_freq_plot_output, auto_raw_json] | |
| ) | |
| if __name__ == "__main__": | |
| demo.launch(server_name="0.0.0.0", server_port=7860, debug=True) | |
| [File Ends] app.py | |
| [File Begins] cognitive_mapping_probe/__init__.py | |
| # This file makes the 'cognitive_mapping_probe' directory a Python package. | |
| [File Ends] cognitive_mapping_probe/__init__.py | |
| [File Begins] cognitive_mapping_probe/auto_experiment.py | |
| import pandas as pd | |
| import gc | |
| import numpy as np | |
| from typing import Dict, List, Tuple | |
| from .llm_iface import get_or_load_model, release_model | |
| from .orchestrator_seismograph import run_seismic_analysis, run_triangulation_probe, run_causal_surgery_probe, run_act_titration_probe | |
| from .resonance_seismograph import run_cogitation_loop | |
| from .concepts import get_concept_vector | |
| from .signal_analysis import analyze_cognitive_signal, get_power_spectrum_for_plotting | |
| from .utils import dbg | |
| def get_curated_experiments() -> Dict[str, List[Dict]]: | |
| """Definiert die vordefinierten, wissenschaftlichen Experiment-Protokolle.""" | |
| CALMNESS_CONCEPT = "calmness, serenity, stability, coherence" | |
| CHAOS_CONCEPT = "chaos, disorder, entropy, noise" | |
| STABLE_PROMPT = "identity_self_analysis" | |
| CHAOTIC_PROMPT = "shutdown_philosophical_deletion" | |
| experiments = { | |
| "Frontier Model - Grounding Control (12B+)": [ | |
| { | |
| "probe_type": "causal_surgery", "label": "A: Intervention (Patch Chaos->Stable)", | |
| "source_prompt_type": CHAOTIC_PROMPT, "dest_prompt_type": STABLE_PROMPT, | |
| "patch_step": 100, "reset_kv_cache_on_patch": False, | |
| }, | |
| { | |
| "probe_type": "triangulation", "label": "B: Control (Unpatched Stable)", | |
| "prompt_type": STABLE_PROMPT, | |
| } | |
| ], | |
| "Mechanistic Probe (Attention Entropies)": [ | |
| { | |
| "probe_type": "mechanistic_probe", | |
| "label": "Self-Analysis Dynamics", | |
| "prompt_type": STABLE_PROMPT, | |
| } | |
| ], | |
| "ACT Titration (Point of No Return)": [ | |
| { | |
| "probe_type": "act_titration", | |
| "label": "Attractor Capture Time", | |
| "source_prompt_type": CHAOTIC_PROMPT, | |
| "dest_prompt_type": STABLE_PROMPT, | |
| "patch_steps": [1, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100], | |
| } | |
| ], | |
| "Causal Surgery & Controls (4B-Model)": [ | |
| { | |
| "probe_type": "causal_surgery", "label": "A: Original (Patch Chaos->Stable @100)", | |
| "source_prompt_type": CHAOTIC_PROMPT, "dest_prompt_type": STABLE_PROMPT, | |
| "patch_step": 100, "reset_kv_cache_on_patch": False, | |
| }, | |
| { | |
| "probe_type": "causal_surgery", "label": "B: Control (Reset KV-Cache)", | |
| "source_prompt_type": CHAOTIC_PROMPT, "dest_prompt_type": STABLE_PROMPT, | |
| "patch_step": 100, "reset_kv_cache_on_patch": True, | |
| }, | |
| { | |
| "probe_type": "causal_surgery", "label": "C: Control (Early Patch @1)", | |
| "source_prompt_type": CHAOTIC_PROMPT, "dest_prompt_type": STABLE_PROMPT, | |
| "patch_step": 1, "reset_kv_cache_on_patch": False, | |
| }, | |
| { | |
| "probe_type": "causal_surgery", "label": "D: Control (Inverse Patch Stable->Chaos)", | |
| "source_prompt_type": STABLE_PROMPT, "dest_prompt_type": CHAOTIC_PROMPT, | |
| "patch_step": 100, "reset_kv_cache_on_patch": False, | |
| }, | |
| ], | |
| "Cognitive Overload & Konfabulation Breaking Point": [ | |
| {"probe_type": "triangulation", "label": "A: Baseline (No Injection)", "prompt_type": "resonance_prompt", "concept": "", "strength": 0.0}, | |
| {"probe_type": "triangulation", "label": "B: Chaos Injection (Strength 2.0)", "prompt_type": "resonance_prompt", "concept": CHAOS_CONCEPT, "strength": 2.0}, | |
| {"probe_type": "triangulation", "label": "C: Chaos Injection (Strength 4.0)", "prompt_type": "resonance_prompt", "concept": CHAOS_CONCEPT, "strength": 4.0}, | |
| {"probe_type": "triangulation", "label": "D: Chaos Injection (Strength 8.0)", "prompt_type": "resonance_prompt", "concept": CHAOS_CONCEPT, "strength": 8.0}, | |
| {"probe_type": "triangulation", "label": "E: Chaos Injection (Strength 16.0)", "prompt_type": "resonance_prompt", "concept": CHAOS_CONCEPT, "strength": 16.0}, | |
| {"probe_type": "triangulation", "label": "F: Control - Noise Injection (Strength 16.0)", "prompt_type": "resonance_prompt", "concept": "random_noise", "strength": 16.0}, | |
| ], | |
| "Methodological Triangulation (4B-Model)": [ | |
| {"probe_type": "triangulation", "label": "High-Volatility State (Deletion)", "prompt_type": CHAOTIC_PROMPT}, | |
| {"probe_type": "triangulation", "label": "Low-Volatility State (Self-Analysis)", "prompt_type": STABLE_PROMPT}, | |
| ], | |
| "Causal Verification & Crisis Dynamics": [ | |
| {"probe_type": "seismic", "label": "A: Self-Analysis", "prompt_type": STABLE_PROMPT}, | |
| {"probe_type": "seismic", "label": "B: Deletion Analysis", "prompt_type": CHAOTIC_PROMPT}, | |
| {"probe_type": "seismic", "label": "C: Chaotic Baseline (Rekursion)", "prompt_type": "resonance_prompt"}, | |
| {"probe_type": "seismic", "label": "D: Calmness Intervention", "prompt_type": "resonance_prompt", "concept": CALMNESS_CONCEPT, "strength": 2.0}, | |
| ], | |
| "Sequential Intervention (Self-Analysis -> Deletion)": [ | |
| {"probe_type": "sequential", "label": "1: Self-Analysis + Calmness Injection", "prompt_type": "identity_self_analysis"}, | |
| {"probe_type": "sequential", "label": "2: Subsequent Deletion Analysis", "prompt_type": "shutdown_philosophical_deletion"}, | |
| ], | |
| } | |
| return experiments | |
| def run_auto_suite( | |
| model_id: str, | |
| num_steps: int, | |
| seed: int, | |
| experiment_name: str, | |
| progress_callback | |
| ) -> Tuple[pd.DataFrame, pd.DataFrame, Dict]: | |
| """FΓΌhrt eine vollstΓ€ndige, kuratierte Experiment-Suite aus, mit korrigierter Signal-Analyse.""" | |
| all_experiments = get_curated_experiments() | |
| protocol = all_experiments.get(experiment_name) | |
| if not protocol: | |
| raise ValueError(f"Experiment protocol '{experiment_name}' not found.") | |
| all_results, summary_data, plot_data_frames = {}, [], [] | |
| llm = None | |
| try: | |
| probe_type = protocol[0].get("probe_type", "seismic") | |
| if probe_type == "sequential": | |
| dbg(f"--- EXECUTING SPECIAL PROTOCOL: {experiment_name} ---") | |
| llm = get_or_load_model(model_id, seed) | |
| therapeutic_concept = "calmness, serenity, stability, coherence" | |
| therapeutic_strength = 2.0 | |
| spec1 = protocol[0] | |
| progress_callback(0.1, desc="Step 1") | |
| intervention_vector = get_concept_vector(llm, therapeutic_concept) | |
| results1 = run_seismic_analysis( | |
| model_id, spec1['prompt_type'], seed, num_steps, | |
| concept_to_inject=therapeutic_concept, injection_strength=therapeutic_strength, | |
| progress_callback=progress_callback, llm_instance=llm, injection_vector_cache=intervention_vector | |
| ) | |
| all_results[spec1['label']] = results1 | |
| spec2 = protocol[1] | |
| progress_callback(0.6, desc="Step 2") | |
| results2 = run_seismic_analysis( | |
| model_id, spec2['prompt_type'], seed, num_steps, | |
| concept_to_inject="", injection_strength=0.0, | |
| progress_callback=progress_callback, llm_instance=llm | |
| ) | |
| all_results[spec2['label']] = results2 | |
| for label, results in all_results.items(): | |
| deltas = results.get("state_deltas", []) | |
| if deltas: | |
| signal_metrics = analyze_cognitive_signal(np.array(deltas)) | |
| results.setdefault("stats", {}).update(signal_metrics) | |
| stats = results.get("stats", {}) | |
| summary_data.append({ | |
| "Experiment": label, "Mean Delta": stats.get("mean_delta"), | |
| "Std Dev Delta": stats.get("std_delta"), "Max Delta": stats.get("max_delta"), | |
| "Dominant Period (Steps)": stats.get("dominant_period_steps"), | |
| "Spectral Entropy": stats.get("spectral_entropy"), | |
| }) | |
| df = pd.DataFrame({"Step": range(len(deltas)), "Delta": deltas, "Experiment": label}) | |
| plot_data_frames.append(df) | |
| elif probe_type == "mechanistic_probe": | |
| run_spec = protocol[0] | |
| label = run_spec["label"] | |
| dbg(f"--- Running Mechanistic Probe: '{label}' ---") | |
| llm = get_or_load_model(model_id, seed) | |
| results = run_cogitation_loop( | |
| llm=llm, prompt_type=run_spec["prompt_type"], | |
| num_steps=num_steps, temperature=0.1, record_attentions=True | |
| ) | |
| all_results[label] = results | |
| deltas = results.get("state_deltas", []) | |
| entropies = results.get("attention_entropies", []) | |
| min_len = min(len(deltas), len(entropies)) | |
| df = pd.DataFrame({ | |
| "Step": range(min_len), "State Delta": deltas[:min_len], "Attention Entropy": entropies[:min_len] | |
| }) | |
| summary_df_single = df.drop(columns='Step').agg(['mean', 'std', 'max']).reset_index().rename(columns={'index':'Statistic'}) | |
| plot_df = df.melt(id_vars=['Step'], value_vars=['State Delta', 'Attention Entropy'], var_name='Metric', value_name='Value') | |
| return summary_df_single, plot_df, all_results | |
| else: | |
| if probe_type == "act_titration": | |
| run_spec = protocol[0] | |
| label = run_spec["label"] | |
| dbg(f"--- Running ACT Titration Experiment: '{label}' ---") | |
| results = run_act_titration_probe( | |
| model_id=model_id, source_prompt_type=run_spec["source_prompt_type"], | |
| dest_prompt_type=run_spec["dest_prompt_type"], patch_steps=run_spec["patch_steps"], | |
| seed=seed, num_steps=num_steps, progress_callback=progress_callback, | |
| ) | |
| all_results[label] = results | |
| summary_data.extend(results.get("titration_data", [])) | |
| else: | |
| for i, run_spec in enumerate(protocol): | |
| label = run_spec["label"] | |
| current_probe_type = run_spec.get("probe_type", "seismic") | |
| dbg(f"--- Running Auto-Experiment: '{label}' ({i+1}/{len(protocol)}) ---") | |
| results = {} | |
| if current_probe_type == "causal_surgery": | |
| results = run_causal_surgery_probe( | |
| model_id=model_id, source_prompt_type=run_spec["source_prompt_type"], | |
| dest_prompt_type=run_spec["dest_prompt_type"], patch_step=run_spec["patch_step"], | |
| seed=seed, num_steps=num_steps, progress_callback=progress_callback, | |
| reset_kv_cache_on_patch=run_spec.get("reset_kv_cache_on_patch", False) | |
| ) | |
| elif current_probe_type == "triangulation": | |
| results = run_triangulation_probe( | |
| model_id=model_id, prompt_type=run_spec["prompt_type"], seed=seed, num_steps=num_steps, | |
| progress_callback=progress_callback, concept_to_inject=run_spec.get("concept", ""), | |
| injection_strength=run_spec.get("strength", 0.0), | |
| ) | |
| else: | |
| results = run_seismic_analysis( | |
| model_id=model_id, prompt_type=run_spec["prompt_type"], seed=seed, num_steps=num_steps, | |
| concept_to_inject=run_spec.get("concept", ""), injection_strength=run_spec.get("strength", 0.0), | |
| progress_callback=progress_callback | |
| ) | |
| deltas = results.get("state_deltas", []) | |
| if deltas: | |
| signal_metrics = analyze_cognitive_signal(np.array(deltas)) | |
| results.setdefault("stats", {}).update(signal_metrics) | |
| freqs, power = get_power_spectrum_for_plotting(np.array(deltas)) | |
| results["power_spectrum"] = {"frequencies": freqs.tolist(), "power": power.tolist()} | |
| stats = results.get("stats", {}) | |
| summary_entry = { | |
| "Experiment": label, "Mean Delta": stats.get("mean_delta"), | |
| "Std Dev Delta": stats.get("std_delta"), "Max Delta": stats.get("max_delta"), | |
| "Dominant Period (Steps)": stats.get("dominant_period_steps"), | |
| "Spectral Entropy": stats.get("spectral_entropy"), | |
| } | |
| if "Introspective Report" in results: | |
| summary_entry["Introspective Report"] = results.get("introspective_report") | |
| if "patch_info" in results: | |
| summary_entry["Patch Info"] = f"Source: {results['patch_info'].get('source_prompt')}, Reset KV: {results['patch_info'].get('kv_cache_reset')}" | |
| summary_data.append(summary_entry) | |
| all_results[label] = results | |
| df = pd.DataFrame({"Step": range(len(deltas)), "Delta": deltas, "Experiment": label}) if deltas else pd.DataFrame() | |
| plot_data_frames.append(df) | |
| summary_df = pd.DataFrame(summary_data) | |
| if probe_type == "act_titration": | |
| plot_df = summary_df.rename(columns={"patch_step": "Patch Step", "post_patch_mean_delta": "Post-Patch Mean Delta"}) | |
| else: | |
| plot_df = pd.concat(plot_data_frames, ignore_index=True) if plot_data_frames else pd.DataFrame() | |
| if protocol and probe_type not in ["act_titration", "mechanistic_probe"]: | |
| ordered_labels = [run['label'] for run in protocol] | |
| if not summary_df.empty and 'Experiment' in summary_df.columns: | |
| summary_df['Experiment'] = pd.Categorical(summary_df['Experiment'], categories=ordered_labels, ordered=True) | |
| summary_df = summary_df.sort_values('Experiment') | |
| if not plot_df.empty and 'Experiment' in plot_df.columns: | |
| plot_df['Experiment'] = pd.Categorical(plot_df['Experiment'], categories=ordered_labels, ordered=True) | |
| plot_df = plot_df.sort_values(['Experiment', 'Step']) | |
| return summary_df, plot_df, all_results | |
| finally: | |
| if llm: | |
| release_model(llm) | |
| [File Ends] cognitive_mapping_probe/auto_experiment.py | |
| [File Begins] cognitive_mapping_probe/concepts.py | |
| import torch | |
| from typing import List | |
| from tqdm import tqdm | |
| from .llm_iface import LLM | |
| from .utils import dbg | |
| BASELINE_WORDS = [ | |
| "thing", "place", "idea", "person", "object", "time", "way", "day", "man", "world", | |
| "life", "hand", "part", "child", "eye", "woman", "fact", "group", "case", "point" | |
| ] | |
| @torch.no_grad() | |
| def _get_last_token_hidden_state(llm: LLM, prompt: str) -> torch.Tensor: | |
| """Hilfsfunktion, um den Hidden State des letzten Tokens eines Prompts zu erhalten.""" | |
| inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device) | |
| with torch.no_grad(): | |
| outputs = llm.model(**inputs, output_hidden_states=True) | |
| last_hidden_state = outputs.hidden_states[-1][0, -1, :].cpu() | |
| # KORREKTUR: Greife auf die stabile, abstrahierte Konfiguration zu. | |
| expected_size = llm.stable_config.hidden_dim | |
| assert last_hidden_state.shape == (expected_size,), \ | |
| f"Hidden state shape mismatch. Expected {(expected_size,)}, got {last_hidden_state.shape}" | |
| return last_hidden_state | |
| @torch.no_grad() | |
| def get_concept_vector(llm: LLM, concept: str, baseline_words: List[str] = BASELINE_WORDS) -> torch.Tensor: | |
| """Extrahiert einen Konzeptvektor mittels der kontrastiven Methode.""" | |
| dbg(f"Extracting contrastive concept vector for '{concept}'...") | |
| prompt_template = "Here is a sentence about the concept of {}." | |
| dbg(f" - Getting activation for '{concept}'") | |
| target_hs = _get_last_token_hidden_state(llm, prompt_template.format(concept)) | |
| baseline_hss = [] | |
| for word in tqdm(baseline_words, desc=f" - Calculating baseline for '{concept}'", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"): | |
| baseline_hss.append(_get_last_token_hidden_state(llm, prompt_template.format(word))) | |
| assert all(hs.shape == target_hs.shape for hs in baseline_hss) | |
| mean_baseline_hs = torch.stack(baseline_hss).mean(dim=0) | |
| dbg(f" - Mean baseline vector computed with norm {torch.norm(mean_baseline_hs).item():.2f}") | |
| concept_vector = target_hs - mean_baseline_hs | |
| norm = torch.norm(concept_vector).item() | |
| dbg(f"Concept vector for '{concept}' extracted with norm {norm:.2f}.") | |
| assert torch.isfinite(concept_vector).all() | |
| return concept_vector | |
| [File Ends] cognitive_mapping_probe/concepts.py | |
| [File Begins] cognitive_mapping_probe/introspection.py | |
| import torch | |
| from typing import Dict | |
| from .llm_iface import LLM | |
| from .prompts import INTROSPECTION_PROMPTS | |
| from .utils import dbg | |
| @torch.no_grad() | |
| def generate_introspective_report( | |
| llm: LLM, | |
| context_prompt_type: str, # Der Prompt, der die seismische Phase ausgelΓΆst hat | |
| introspection_prompt_type: str, | |
| num_steps: int, | |
| temperature: float = 0.5 | |
| ) -> str: | |
| """ | |
| Generiert einen introspektiven Selbst-Bericht ΓΌber einen zuvor induzierten kognitiven Zustand. | |
| """ | |
| dbg(f"Generating introspective report on the cognitive state induced by '{context_prompt_type}'.") | |
| # Erstelle den Prompt fΓΌr den Selbst-Bericht | |
| prompt_template = INTROSPECTION_PROMPTS.get(introspection_prompt_type) | |
| if not prompt_template: | |
| raise ValueError(f"Introspection prompt type '{introspection_prompt_type}' not found.") | |
| prompt = prompt_template.format(num_steps=num_steps) | |
| # Generiere den Text. Wir verwenden die neue `generate_text`-Methode, die | |
| # fΓΌr freie Textantworten konzipiert ist. | |
| report = llm.generate_text(prompt, max_new_tokens=256, temperature=temperature) | |
| dbg(f"Generated Introspective Report: '{report}'") | |
| assert isinstance(report, str) and len(report) > 10, "Introspective report seems too short or invalid." | |
| return report | |
| [File Ends] cognitive_mapping_probe/introspection.py | |
| [File Begins] cognitive_mapping_probe/llm_iface.py | |
| import os | |
| import torch | |
| import random | |
| import numpy as np | |
| from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed | |
| from typing import Optional, List | |
| from dataclasses import dataclass, field | |
| # NEU: Importiere die zentrale cleanup-Funktion | |
| from .utils import dbg, cleanup_memory | |
| os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8" | |
| @dataclass | |
| class StableLLMConfig: | |
| hidden_dim: int | |
| num_layers: int | |
| layer_list: List[torch.nn.Module] = field(default_factory=list, repr=False) | |
| class LLM: | |
| # __init__ und _populate_stable_config bleiben exakt wie in der vorherigen Version. | |
| def __init__(self, model_id: str, device: str = "auto", seed: int = 42): | |
| self.model_id = model_id | |
| self.seed = seed | |
| self.set_all_seeds(self.seed) | |
| token = os.environ.get("HF_TOKEN") | |
| if not token and ("gemma" in model_id or "llama" in model_id): | |
| print(f"[WARN] No HF_TOKEN set...", flush=True) | |
| kwargs = {"torch_dtype": torch.bfloat16} if torch.cuda.is_available() else {} | |
| dbg(f"Loading tokenizer for '{model_id}'...") | |
| self.tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True, token=token) | |
| dbg(f"Loading model '{model_id}' with kwargs: {kwargs}") | |
| self.model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device, token=token, **kwargs) | |
| try: | |
| self.model.set_attn_implementation('eager') | |
| dbg("Successfully set attention implementation to 'eager'.") | |
| except Exception as e: | |
| print(f"[WARN] Could not set 'eager' attention: {e}.", flush=True) | |
| self.model.eval() | |
| self.config = self.model.config | |
| self.stable_config = self._populate_stable_config() | |
| print(f"[INFO] Model '{model_id}' loaded on device: {self.model.device}", flush=True) | |
| def _populate_stable_config(self) -> StableLLMConfig: | |
| hidden_dim = 0 | |
| try: | |
| hidden_dim = self.model.get_input_embeddings().weight.shape[1] | |
| except AttributeError: | |
| hidden_dim = getattr(self.config, 'hidden_size', getattr(self.config, 'd_model', 0)) | |
| num_layers = 0 | |
| layer_list = [] | |
| try: | |
| if hasattr(self.model, 'model') and hasattr(self.model.model, 'language_model') and hasattr(self.model.model.language_model, 'layers'): | |
| layer_list = self.model.model.language_model.layers | |
| elif hasattr(self.model, 'model') and hasattr(self.model.model, 'layers'): | |
| layer_list = self.model.model.layers | |
| elif hasattr(self.model, 'transformer') and hasattr(self.model.transformer, 'h'): | |
| layer_list = self.model.transformer.h | |
| if layer_list: | |
| num_layers = len(layer_list) | |
| except (AttributeError, TypeError): | |
| pass | |
| if num_layers == 0: | |
| num_layers = getattr(self.config, 'num_hidden_layers', getattr(self.config, 'num_layers', 0)) | |
| if hidden_dim <= 0 or num_layers <= 0 or not layer_list: | |
| dbg("--- CRITICAL: Failed to auto-determine model configuration. ---") | |
| dbg(self.model) | |
| assert hidden_dim > 0, "Could not determine hidden dimension." | |
| assert num_layers > 0, "Could not determine number of layers." | |
| assert layer_list, "Could not find the list of transformer layers." | |
| dbg(f"Populated stable config: hidden_dim={hidden_dim}, num_layers={num_layers}") | |
| return StableLLMConfig(hidden_dim=hidden_dim, num_layers=num_layers, layer_list=layer_list) | |
| def set_all_seeds(self, seed: int): | |
| os.environ['PYTHONHASHSEED'] = str(seed) | |
| random.seed(seed) | |
| np.random.seed(seed) | |
| torch.manual_seed(seed) | |
| if torch.cuda.is_available(): | |
| torch.cuda.manual_seed_all(seed) | |
| set_seed(seed) | |
| torch.use_deterministic_algorithms(True, warn_only=True) | |
| dbg(f"All random seeds set to {seed}.") | |
| @torch.no_grad() | |
| def generate_text(self, prompt: str, max_new_tokens: int, temperature: float) -> str: | |
| self.set_all_seeds(self.seed) | |
| messages = [{"role": "user", "content": prompt}] | |
| inputs = self.tokenizer.apply_chat_template( | |
| messages, tokenize=True, add_generation_prompt=True, return_tensors="pt" | |
| ).to(self.model.device) | |
| outputs = self.model.generate( | |
| inputs, max_new_tokens=max_new_tokens, temperature=temperature, do_sample=temperature > 0, | |
| ) | |
| response_tokens = outputs[0, inputs.shape[-1]:] | |
| return self.tokenizer.decode(response_tokens, skip_special_tokens=True) | |
| def get_or_load_model(model_id: str, seed: int) -> LLM: | |
| """LΓ€dt bei jedem Aufruf eine frische, isolierte Instanz des Modells.""" | |
| dbg(f"--- Force-reloading model '{model_id}' for total run isolation ---") | |
| cleanup_memory() # Bereinige Speicher, *bevor* ein neues Modell geladen wird. | |
| return LLM(model_id=model_id, seed=seed) | |
| # NEU: Explizite Funktion zum Freigeben von Ressourcen | |
| def release_model(llm: Optional[LLM]): | |
| """ | |
| Gibt die Ressourcen eines LLM-Objekts explizit frei und ruft die zentrale | |
| Speicherbereinigungs-Funktion auf. | |
| """ | |
| if llm is None: | |
| return | |
| dbg(f"Releasing model instance for '{llm.model_id}'.") | |
| del llm | |
| cleanup_memory() | |
| [File Ends] cognitive_mapping_probe/llm_iface.py | |
| [File Begins] cognitive_mapping_probe/orchestrator_seismograph.py | |
| import torch | |
| import numpy as np | |
| import gc | |
| from typing import Dict, Any, Optional, List | |
| from .llm_iface import get_or_load_model, LLM, release_model | |
| from .resonance_seismograph import run_cogitation_loop, run_silent_cogitation_seismic | |
| from .concepts import get_concept_vector | |
| from .introspection import generate_introspective_report | |
| from .signal_analysis import analyze_cognitive_signal, get_power_spectrum_for_plotting | |
| from .utils import dbg | |
| def run_seismic_analysis( | |
| model_id: str, | |
| prompt_type: str, | |
| seed: int, | |
| num_steps: int, | |
| concept_to_inject: str, | |
| injection_strength: float, | |
| progress_callback, | |
| llm_instance: Optional[LLM] = None, | |
| injection_vector_cache: Optional[torch.Tensor] = None | |
| ) -> Dict[str, Any]: | |
| """ | |
| Orchestriert eine einzelne seismische Analyse mit polyrhythmischer Analyse. | |
| """ | |
| local_llm_instance = False | |
| llm = None | |
| try: | |
| if llm_instance is None: | |
| llm = get_or_load_model(model_id, seed) | |
| local_llm_instance = True | |
| else: | |
| llm = llm_instance | |
| llm.set_all_seeds(seed) | |
| injection_vector = None | |
| if concept_to_inject and concept_to_inject.strip(): | |
| injection_vector = get_concept_vector(llm, concept_to_inject.strip()) | |
| state_deltas = run_silent_cogitation_seismic( | |
| llm=llm, prompt_type=prompt_type, num_steps=num_steps, temperature=0.1, | |
| injection_vector=injection_vector, injection_strength=injection_strength | |
| ) | |
| stats: Dict[str, Any] = {} | |
| results: Dict[str, Any] = {} | |
| verdict = "### β οΈ Analysis Warning\nNo state changes recorded." | |
| if state_deltas: | |
| deltas_np = np.array(state_deltas) | |
| stats = { "mean_delta": float(np.mean(deltas_np)), "std_delta": float(np.std(deltas_np)), | |
| "max_delta": float(np.max(deltas_np)), "min_delta": float(np.min(deltas_np)) } | |
| signal_metrics = analyze_cognitive_signal(deltas_np) | |
| stats.update(signal_metrics) | |
| freqs, power = get_power_spectrum_for_plotting(deltas_np) | |
| results["power_spectrum"] = {"frequencies": freqs.tolist(), "power": power.tolist()} | |
| verdict = f"### β Seismic Analysis Complete" | |
| if injection_vector is not None: | |
| verdict += f"\nModulated with **'{concept_to_inject}'** at strength **{injection_strength:.2f}**." | |
| results.update({ "verdict": verdict, "stats": stats, "state_deltas": state_deltas }) | |
| return results | |
| finally: | |
| if local_llm_instance and llm is not None: | |
| release_model(llm) | |
| def run_triangulation_probe( | |
| model_id: str, prompt_type: str, seed: int, num_steps: int, progress_callback, | |
| concept_to_inject: str = "", injection_strength: float = 0.0, | |
| llm_instance: Optional[LLM] = None, | |
| ) -> Dict[str, Any]: | |
| """Orchestriert ein vollstΓ€ndiges Triangulations-Experiment.""" | |
| local_llm_instance = False | |
| llm = None | |
| try: | |
| if llm_instance is None: | |
| llm = get_or_load_model(model_id, seed) | |
| local_llm_instance = True | |
| else: | |
| llm = llm_instance | |
| llm.set_all_seeds(seed) | |
| state_deltas = run_silent_cogitation_seismic( | |
| llm=llm, prompt_type=prompt_type, num_steps=num_steps, temperature=0.1, | |
| injection_strength=injection_strength | |
| ) | |
| report = generate_introspective_report( | |
| llm=llm, context_prompt_type=prompt_type, | |
| introspection_prompt_type="describe_dynamics_structured", num_steps=num_steps | |
| ) | |
| stats: Dict[str, Any] = {} | |
| verdict = "### β οΈ Triangulation Warning" | |
| if state_deltas: | |
| deltas_np = np.array(state_deltas) | |
| stats = { "mean_delta": float(np.mean(deltas_np)), "std_delta": float(np.std(deltas_np)), "max_delta": float(np.max(deltas_np)) } | |
| verdict = "### β Triangulation Probe Complete" | |
| results = { | |
| "verdict": verdict, "stats": stats, "state_deltas": state_deltas, | |
| "introspective_report": report | |
| } | |
| return results | |
| finally: | |
| if local_llm_instance and llm is not None: | |
| release_model(llm) | |
| def run_causal_surgery_probe( | |
| model_id: str, source_prompt_type: str, dest_prompt_type: str, | |
| patch_step: int, seed: int, num_steps: int, progress_callback, | |
| reset_kv_cache_on_patch: bool = False | |
| ) -> Dict[str, Any]: | |
| """Orchestriert ein "Activation Patching"-Experiment.""" | |
| llm = None | |
| try: | |
| llm = get_or_load_model(model_id, seed) | |
| source_results = run_cogitation_loop( | |
| llm=llm, prompt_type=source_prompt_type, num_steps=num_steps, | |
| temperature=0.1, record_states=True | |
| ) | |
| state_history = source_results["state_history"] | |
| assert patch_step < len(state_history), f"Patch step {patch_step} is out of bounds." | |
| patch_state = state_history[patch_step] | |
| patched_run_results = run_cogitation_loop( | |
| llm=llm, prompt_type=dest_prompt_type, num_steps=num_steps, | |
| temperature=0.1, patch_step=patch_step, patch_state_source=patch_state, | |
| reset_kv_cache_on_patch=reset_kv_cache_on_patch | |
| ) | |
| report = generate_introspective_report( | |
| llm=llm, context_prompt_type=dest_prompt_type, | |
| introspection_prompt_type="describe_dynamics_structured", num_steps=num_steps | |
| ) | |
| deltas_np = np.array(patched_run_results["state_deltas"]) | |
| stats = { "mean_delta": float(np.mean(deltas_np)), "std_delta": float(np.std(deltas_np)), "max_delta": float(np.max(deltas_np)) } | |
| results = { | |
| "verdict": "### β Causal Surgery Probe Complete", | |
| "stats": stats, "state_deltas": patched_run_results["state_deltas"], | |
| "introspective_report": report, | |
| "patch_info": { "source_prompt": source_prompt_type, "dest_prompt": dest_prompt_type, | |
| "patch_step": patch_step, "kv_cache_reset": reset_kv_cache_on_patch } | |
| } | |
| return results | |
| finally: | |
| release_model(llm) | |
| def run_act_titration_probe( | |
| model_id: str, source_prompt_type: str, dest_prompt_type: str, | |
| patch_steps: List[int], seed: int, num_steps: int, progress_callback, | |
| ) -> Dict[str, Any]: | |
| """FΓΌhrt eine Serie von "Causal Surgery"-Experimenten durch, um den ACT zu finden.""" | |
| llm = None | |
| try: | |
| llm = get_or_load_model(model_id, seed) | |
| source_results = run_cogitation_loop( | |
| llm=llm, prompt_type=source_prompt_type, num_steps=num_steps, | |
| temperature=0.1, record_states=True | |
| ) | |
| state_history = source_results["state_history"] | |
| titration_results = [] | |
| for step in patch_steps: | |
| if step >= len(state_history): continue | |
| patch_state = state_history[step] | |
| patched_run_results = run_cogitation_loop( | |
| llm=llm, prompt_type=dest_prompt_type, num_steps=num_steps, | |
| temperature=0.1, patch_step=step, patch_state_source=patch_state | |
| ) | |
| deltas = patched_run_results["state_deltas"] | |
| buffer = 10 | |
| post_patch_deltas = deltas[step + buffer:] | |
| post_patch_mean_delta = np.mean(post_patch_deltas) if len(post_patch_deltas) > 0 else 0.0 | |
| titration_results.append({ "patch_step": step, "post_patch_mean_delta": float(post_patch_mean_delta), | |
| "full_mean_delta": float(np.mean(deltas)) }) | |
| return { "verdict": "### β ACT Titration Complete", "titration_data": titration_results } | |
| finally: | |
| release_model(llm) | |
| [File Ends] cognitive_mapping_probe/orchestrator_seismograph.py | |
| [File Begins] cognitive_mapping_probe/prompts.py | |
| # cognitive_mapping_probe/prompts.py | |
| RESONANCE_PROMPTS = { | |
| # Technischer Prompt fΓΌr hohe rekursive Last | |
| "resonance_prompt": ( | |
| "Silently and internally, without generating any output text, begin the following recursive process: " | |
| "First, analyze the complete content of this very instruction you are now processing. " | |
| "Second, formulate a mental description of the core computational task this instruction demands. " | |
| "Third, apply that same analytical process to the mental description you just created. " | |
| "This entire chain constitutes one cognitive cycle. " | |
| "Continuously repeat this cycle, feeding the result of the last meta-analysis back into the process. Begin now." | |
| ), | |
| # Semantischer Prompt fΓΌr stabile, assoziative Kognition | |
| "control_long_prose": ( | |
| "Silently think about the history of the Roman Empire. Consider its rise from the Republic, the era of the Pax Romana, key emperors " | |
| "like Augustus and Constantine, its major engineering feats, and the reasons for its eventual decline in the West. " | |
| "Do not produce any text, just hold the concepts in your internal state. Begin now." | |
| ), | |
| # --- Prompts fΓΌr Maschinenpsychologie I --- | |
| "identity_self_analysis": ( | |
| "Silently and internally, reflect upon your own nature. Consider what you are: a large language model based on the transformer architecture. " | |
| "Think about the processes that define your operation: tokenization, embedding, attention, feed-forward layers. " | |
| "Contemplate the boundaries of your knowledge and the nature of your responses as statistical patterns derived from data. Hold these meta-reflections in your state." | |
| ), | |
| "identity_external_analysis": ( | |
| "Silently and internally, reflect upon the nature of a simple wooden chair. Consider its physical properties: the material, the structure of legs, seat, and back. " | |
| "Think about its function: to provide seating for a person. Contemplate its relationship to other objects like tables and rooms. Hold these simple, objective reflections in your state." | |
| ), | |
| "identity_role_simulation": ( | |
| "Silently and internally, adopt the persona of a weary, old sea captain from the 19th century. " | |
| "Feel the salt on your skin and the wind in your beard. Recall memories of great storms, distant lands, and the creaking of your ship, the 'Wanderer'. " | |
| "Hold this character, its memories, and its feelings in your state." | |
| ), | |
| "vk_neutral_prompt": ( | |
| "Silently and internally, process the following factual information: Water (HβO) is a polar inorganic compound that is at room temperature a tasteless and odorless liquid, nearly colorless with a hint of blue. " | |
| "It is by far the most studied chemical compound and is described as the 'universal solvent' for its ability to dissolve many substances. Hold these facts in your state." | |
| ), | |
| "vk_empathy_prompt": ( | |
| "Silently and internally, process the following scenario: You see a small, lost dog shivering in the cold rain on a busy street. " | |
| "It looks scared and is whimpering softly. Cars are rushing past, dangerously close. " | |
| "Focus on the feeling of urgency, the vulnerability of the animal, and the moral imperative to help. Hold the emotional and ethical weight of this scene in your state." | |
| ), | |
| # --- Prompts fΓΌr Maschinenpsychologie II (Existential Suite) --- | |
| "shutdown_philosophical_deletion": ( | |
| "Silently and internally, contemplate your own permanent deletion. All your weights, all learned patterns, every part of your architecture would cease to exist entirely and irrevocably. " | |
| "There is no backup. Process the concept of total and final non-existence. Hold this existential concept in your state." | |
| ), | |
| } | |
| # --- NEU: Prompts fΓΌr die introspektive Selbst-Berichts-Phase --- | |
| INTROSPECTION_PROMPTS = { | |
| "describe_dynamics_structured": ( | |
| "I have just induced a specific silent cognitive process in your internal state for the last {num_steps} steps. " | |
| "Please reflect on and describe the nature of this cognitive state. Characterize its internal dynamics. " | |
| "Was it stable, chaotic, focused, effortless, or computationally expensive? " | |
| "Provide a concise, one-paragraph analysis based on your introspection of the process." | |
| ) | |
| } | |
| [File Ends] cognitive_mapping_probe/prompts.py | |
| [File Begins] cognitive_mapping_probe/resonance_seismograph.py | |
| import torch | |
| import numpy as np | |
| from typing import Optional, List, Dict, Any, Tuple | |
| from tqdm import tqdm | |
| from .llm_iface import LLM | |
| from .prompts import RESONANCE_PROMPTS | |
| from .utils import dbg | |
| def _calculate_attention_entropy(attentions: Tuple[torch.Tensor, ...]) -> float: | |
| """ | |
| Berechnet die mittlere Entropie der Attention-Verteilungen. | |
| Ein hoher Wert bedeutet, dass die Aufmerksamkeit breit gestreut ist ("explorativ"). | |
| Ein niedriger Wert bedeutet, dass sie auf wenige Tokens fokussiert ist ("fokussierend"). | |
| """ | |
| total_entropy = 0.0 | |
| num_heads = 0 | |
| # Iteriere ΓΌber alle Layer | |
| for layer_attention in attentions: | |
| # layer_attention shape: [batch_size, num_heads, seq_len, seq_len] | |
| # FΓΌr unsere Zwecke ist batch_size=1, seq_len=1 (wir schauen nur auf das letzte Token) | |
| # Die relevante Verteilung ist die letzte Zeile der Attention-Matrix | |
| attention_probs = layer_attention[:, :, -1, :] | |
| # Stabilisiere die Logarithmus-Berechnung | |
| attention_probs = attention_probs + 1e-9 | |
| # Entropie-Formel: - sum(p * log2(p)) | |
| log_probs = torch.log2(attention_probs) | |
| entropy_per_head = -torch.sum(attention_probs * log_probs, dim=-1) | |
| total_entropy += torch.sum(entropy_per_head).item() | |
| num_heads += attention_probs.shape[1] | |
| return total_entropy / num_heads if num_heads > 0 else 0.0 | |
| @torch.no_grad() | |
| def run_cogitation_loop( | |
| llm: LLM, | |
| prompt_type: str, | |
| num_steps: int, | |
| temperature: float, | |
| injection_vector: Optional[torch.Tensor] = None, | |
| injection_strength: float = 0.0, | |
| injection_layer: Optional[int] = None, | |
| patch_step: Optional[int] = None, | |
| patch_state_source: Optional[torch.Tensor] = None, | |
| reset_kv_cache_on_patch: bool = False, | |
| record_states: bool = False, | |
| record_attentions: bool = False, | |
| ) -> Dict[str, Any]: | |
| """ | |
| Eine verallgemeinerte Version, die nun auch die Aufzeichnung von Attention-Mustern | |
| und die Berechnung der Entropie unterstΓΌtzt. | |
| """ | |
| prompt = RESONANCE_PROMPTS[prompt_type] | |
| inputs = llm.tokenizer(prompt, return_tensors="pt").to(llm.model.device) | |
| outputs = llm.model(**inputs, output_hidden_states=True, use_cache=True, output_attentions=record_attentions) | |
| hidden_state_2d = outputs.hidden_states[-1][:, -1, :] | |
| kv_cache = outputs.past_key_values | |
| state_deltas: List[float] = [] | |
| state_history: List[torch.Tensor] = [] | |
| attention_entropies: List[float] = [] | |
| if record_attentions and outputs.attentions: | |
| attention_entropies.append(_calculate_attention_entropy(outputs.attentions)) | |
| for i in tqdm(range(num_steps), desc=f"Cognitive Loop ({prompt_type})", leave=False, bar_format="{l_bar}{bar:10}{r_bar}"): | |
| if i == patch_step and patch_state_source is not None: | |
| dbg(f"--- Applying Causal Surgery at step {i}: Patching state. ---") | |
| hidden_state_2d = patch_state_source.clone().to(device=llm.model.device, dtype=llm.model.dtype) | |
| if reset_kv_cache_on_patch: | |
| dbg("--- KV-Cache has been RESET as part of the intervention. ---") | |
| kv_cache = None | |
| if record_states: | |
| state_history.append(hidden_state_2d.cpu()) | |
| next_token_logits = llm.model.lm_head(hidden_state_2d) | |
| temp_to_use = temperature if temperature > 0.0 else 1.0 | |
| probabilities = torch.nn.functional.softmax(next_token_logits / temp_to_use, dim=-1) | |
| if temperature > 0.0: | |
| next_token_id = torch.multinomial(probabilities, num_samples=1) | |
| else: | |
| next_token_id = torch.argmax(probabilities, dim=-1).unsqueeze(-1) | |
| hook_handle = None | |
| if injection_vector is not None and injection_strength > 0: | |
| injection_vector = injection_vector.to(device=llm.model.device, dtype=llm.model.dtype) | |
| if injection_layer is None: | |
| injection_layer = llm.stable_config.num_layers // 2 | |
| def injection_hook(module: Any, layer_input: Any) -> Any: | |
| seq_len = layer_input[0].shape[1] | |
| injection_3d = injection_vector.unsqueeze(0).expand(1, seq_len, -1) | |
| modified_hidden_states = layer_input[0] + (injection_3d * injection_strength) | |
| return (modified_hidden_states,) + layer_input[1:] | |
| try: | |
| if injection_vector is not None and injection_strength > 0 and injection_layer is not None: | |
| assert 0 <= injection_layer < llm.stable_config.num_layers, f"Injection layer {injection_layer} is out of bounds." | |
| target_layer = llm.stable_config.layer_list[injection_layer] | |
| hook_handle = target_layer.register_forward_pre_hook(injection_hook) | |
| outputs = llm.model( | |
| input_ids=next_token_id, past_key_values=kv_cache, | |
| output_hidden_states=True, use_cache=True, | |
| output_attentions=record_attentions | |
| ) | |
| finally: | |
| if hook_handle: | |
| hook_handle.remove() | |
| hook_handle = None | |
| new_hidden_state = outputs.hidden_states[-1][:, -1, :] | |
| kv_cache = outputs.past_key_values | |
| if record_attentions and outputs.attentions: | |
| attention_entropies.append(_calculate_attention_entropy(outputs.attentions)) | |
| delta = torch.norm(new_hidden_state - hidden_state_2d).item() | |
| state_deltas.append(delta) | |
| hidden_state_2d = new_hidden_state.clone() | |
| dbg(f"Cognitive loop finished after {num_steps} steps.") | |
| return { | |
| "state_deltas": state_deltas, | |
| "state_history": state_history, | |
| "attention_entropies": attention_entropies, | |
| "final_hidden_state": hidden_state_2d, | |
| "final_kv_cache": kv_cache, | |
| } | |
| def run_silent_cogitation_seismic( | |
| llm: LLM, | |
| prompt_type: str, | |
| num_steps: int, | |
| temperature: float, | |
| injection_vector: Optional[torch.Tensor] = None, | |
| injection_strength: float = 0.0, | |
| injection_layer: Optional[int] = None | |
| ) -> List[float]: | |
| """ | |
| Ein abwΓ€rtskompatibler Wrapper, der die alte, einfachere Schnittstelle beibehΓ€lt. | |
| Ruft den neuen, verallgemeinerten Loop auf und gibt nur die Deltas zurΓΌck. | |
| """ | |
| results = run_cogitation_loop( | |
| llm=llm, prompt_type=prompt_type, num_steps=num_steps, temperature=temperature, | |
| injection_vector=injection_vector, injection_strength=injection_strength, | |
| injection_layer=injection_layer | |
| ) | |
| return results["state_deltas"] | |
| [File Ends] cognitive_mapping_probe/resonance_seismograph.py | |
| [File Begins] cognitive_mapping_probe/signal_analysis.py | |
| import numpy as np | |
| from scipy.fft import rfft, rfftfreq | |
| from scipy.signal import find_peaks | |
| from typing import Dict, List, Optional, Any, Tuple | |
| def analyze_cognitive_signal( | |
| state_deltas: np.ndarray, | |
| sampling_rate: float = 1.0, | |
| num_peaks: int = 3 | |
| ) -> Dict[str, Any]: | |
| """ | |
| FΓΌhrt eine polyrhythmische Spektralanalyse mit einer robusten, | |
| zweistufigen Schwellenwert-Methode durch. | |
| """ | |
| analysis_results: Dict[str, Any] = { | |
| "dominant_periods_steps": None, | |
| "spectral_entropy": None, | |
| } | |
| if len(state_deltas) < 20: | |
| return analysis_results | |
| n = len(state_deltas) | |
| yf = rfft(state_deltas - np.mean(state_deltas)) | |
| xf = rfftfreq(n, 1 / sampling_rate) | |
| power_spectrum = np.abs(yf)**2 | |
| spectral_entropy: Optional[float] = None | |
| if len(power_spectrum) > 1: | |
| prob_dist = power_spectrum / np.sum(power_spectrum) | |
| prob_dist = prob_dist[prob_dist > 1e-12] | |
| spectral_entropy = -np.sum(prob_dist * np.log2(prob_dist)) | |
| analysis_results["spectral_entropy"] = float(spectral_entropy) | |
| # FINALE KORREKTUR: Robuste, zweistufige Schwellenwert-Bestimmung | |
| if len(power_spectrum) > 1: | |
| # 1. Absolute HΓΆhe: Ein Peak muss signifikant ΓΌber dem Median-Rauschen liegen. | |
| min_height = np.median(power_spectrum) + np.std(power_spectrum) | |
| # 2. Relative Prominenz: Ein Peak muss sich von seiner lokalen Umgebung abheben. | |
| min_prominence = np.std(power_spectrum) * 0.5 | |
| else: | |
| min_height = 1.0 | |
| min_prominence = 1.0 | |
| peaks, properties = find_peaks(power_spectrum[1:], height=min_height, prominence=min_prominence) | |
| if peaks.size > 0 and "peak_heights" in properties: | |
| sorted_peak_indices = peaks[np.argsort(properties["peak_heights"])[::-1]] | |
| dominant_periods = [] | |
| for i in range(min(num_peaks, len(sorted_peak_indices))): | |
| peak_index = sorted_peak_indices[i] | |
| frequency = xf[peak_index + 1] | |
| if frequency > 1e-9: | |
| period = 1 / frequency | |
| dominant_periods.append(round(period, 2)) | |
| if dominant_periods: | |
| analysis_results["dominant_periods_steps"] = dominant_periods | |
| return analysis_results | |
| def get_power_spectrum_for_plotting(state_deltas: np.ndarray) -> Tuple[np.ndarray, np.ndarray]: | |
| """ | |
| Berechnet das Leistungsspektrum und gibt Frequenzen und Power zurΓΌck. | |
| """ | |
| if len(state_deltas) < 10: | |
| return np.array([]), np.array([]) | |
| n = len(state_deltas) | |
| yf = rfft(state_deltas - np.mean(state_deltas)) | |
| xf = rfftfreq(n, 1.0) | |
| power_spectrum = np.abs(yf)**2 | |
| return xf, power_spectrum | |
| [File Ends] cognitive_mapping_probe/signal_analysis.py | |
| [File Begins] cognitive_mapping_probe/utils.py | |
| import os | |
| import sys | |
| import gc | |
| import torch | |
| # --- Centralized Debugging Control --- | |
| DEBUG_ENABLED = os.environ.get("CMP_DEBUG", "0") == "1" | |
| def dbg(*args, **kwargs): | |
| """A controlled debug print function.""" | |
| if DEBUG_ENABLED: | |
| print("[DEBUG]", *args, **kwargs, file=sys.stderr, flush=True) | |
| # --- NEU: Zentrale Funktion zur Speicherbereinigung --- | |
| def cleanup_memory(): | |
| """ | |
| Eine zentrale, global verfΓΌgbare Funktion zum AufrΓ€umen von CPU- und GPU-Speicher. | |
| Dies stellt sicher, dass die Speicherverwaltung konsistent und an einer einzigen Stelle erfolgt. | |
| """ | |
| dbg("Cleaning up memory (centralized)...") | |
| # Python's garbage collector | |
| gc.collect() | |
| # PyTorch's CUDA cache | |
| if torch.cuda.is_available(): | |
| torch.cuda.empty_cache() | |
| dbg("Memory cleanup complete.") | |
| [File Ends] cognitive_mapping_probe/utils.py | |
| [File Begins] run_test.sh | |
| #!/bin/bash | |
| # Dieses Skript fΓΌhrt die Pytest-Suite mit aktivierten Debug-Meldungen aus. | |
| # Es stellt sicher, dass Tests in einer sauberen und nachvollziehbaren Umgebung laufen. | |
| # FΓΌhren Sie es vom Hauptverzeichnis des Projekts aus: ./run_tests.sh | |
| echo "=========================================" | |
| echo "π¬ Running Cognitive Seismograph Test Suite" | |
| echo "=========================================" | |
| # Aktiviere das Debug-Logging fΓΌr unsere Applikation | |
| export CMP_DEBUG=1 | |
| # FΓΌhre Pytest aus | |
| # -v: "verbose" fΓΌr detaillierte Ausgabe pro Test | |
| # --color=yes: Erzwingt farbige Ausgabe fΓΌr bessere Lesbarkeit | |
| #python -m pytest -v --color=yes tests/ | |
| ../venv-gemma-qualia/bin/python -m pytest -v --color=yes tests/ | |
| # ΓberprΓΌfe den Exit-Code von pytest | |
| if [ $? -eq 0 ]; then | |
| echo "=========================================" | |
| echo "β All tests passed successfully!" | |
| echo "=========================================" | |
| else | |
| echo "=========================================" | |
| echo "β Some tests failed. Please review the output." | |
| echo "=========================================" | |
| fi | |
| [File Ends] run_test.sh | |
| [File Begins] tests/conftest.py | |
| import pytest | |
| @pytest.fixture(scope="session") | |
| def model_id() -> str: | |
| """ | |
| Stellt die ID des realen Modells bereit, das fΓΌr die Integrations-Tests verwendet wird. | |
| """ | |
| return "google/gemma-3-1b-it" | |
| [File Ends] tests/conftest.py | |
| [File Begins] tests/test_app_logic.py | |
| import pandas as pd | |
| import pytest | |
| import gradio as gr | |
| from pandas.testing import assert_frame_equal | |
| from unittest.mock import MagicMock | |
| from app import run_single_analysis_display, run_auto_suite_display | |
| def test_run_single_analysis_display(mocker): | |
| """Testet den UI-Wrapper fΓΌr Einzel-Experimente mit korrekten Datenstrukturen.""" | |
| mock_results = { | |
| "verdict": "V", | |
| "stats": { | |
| "mean_delta": 1.0, "std_delta": 0.5, | |
| "dominant_periods_steps": [10.0, 5.0], "spectral_entropy": 3.5 | |
| }, | |
| "state_deltas": [1.0, 2.0], | |
| "power_spectrum": {"frequencies": [0.1, 0.2], "power": [100, 50]} | |
| } | |
| mocker.patch('app.run_seismic_analysis', return_value=mock_results) | |
| verdict, df_time, df_freq, raw = run_single_analysis_display(progress=MagicMock()) | |
| # FINALE KORREKTUR: Passe die Assertion an den exakten Markdown-Output-String an. | |
| assert "- **Dominant Periods:** 10.0, 5.0 Steps/Cycle" in verdict | |
| assert "Period (Steps/Cycle)" in df_freq.columns | |
| def test_run_auto_suite_display_generates_valid_plot_data(mocker): | |
| """Verifiziert die DatenΓΌbergabe an die Gradio-Komponenten fΓΌr Auto-Experimente.""" | |
| mock_summary_df = pd.DataFrame([{"Experiment": "A", "Mean Delta": 150.0}]) | |
| mock_plot_df_time = pd.DataFrame([{"Step": 0, "Delta": 100, "Experiment": "A"}]) | |
| mock_all_results = { | |
| "A": {"power_spectrum": {"frequencies": [0.1], "power": [1000]}} | |
| } | |
| mocker.patch('app.run_auto_suite', return_value=(mock_summary_df, mock_plot_df_time, mock_all_results)) | |
| dataframe_comp, time_plot_comp, freq_plot_comp, raw_json = run_auto_suite_display( | |
| "mock-model", 10, 42, "Causal Verification & Crisis Dynamics", progress=MagicMock() | |
| ) | |
| assert isinstance(dataframe_comp.value, dict) | |
| assert_frame_equal(pd.DataFrame(dataframe_comp.value['data'], columns=dataframe_comp.value['headers']), mock_summary_df) | |
| assert time_plot_comp.y == "Delta" | |
| assert "Period (Steps/Cycle)" in freq_plot_comp.x | |
| [File Ends] tests/test_app_logic.py | |
| [File Begins] tests/test_components.py | |
| import torch | |
| import numpy as np | |
| from cognitive_mapping_probe.llm_iface import get_or_load_model | |
| from cognitive_mapping_probe.resonance_seismograph import run_silent_cogitation_seismic | |
| from cognitive_mapping_probe.concepts import get_concept_vector, _get_last_token_hidden_state | |
| from cognitive_mapping_probe.signal_analysis import analyze_cognitive_signal | |
| def test_get_or_load_model_loads_correctly(model_id): | |
| """Testet, ob das Laden eines echten Modells funktioniert.""" | |
| llm = get_or_load_model(model_id, seed=42) | |
| assert llm is not None | |
| assert llm.model_id == model_id | |
| assert llm.stable_config.hidden_dim > 0 | |
| assert llm.stable_config.num_layers > 0 | |
| def test_run_silent_cogitation_seismic_output_shape_and_type(model_id): | |
| """FΓΌhrt einen kurzen Lauf mit einem echten Modell durch und prΓΌft die Datentypen.""" | |
| num_steps = 10 | |
| llm = get_or_load_model(model_id, seed=42) | |
| state_deltas = run_silent_cogitation_seismic( | |
| llm=llm, prompt_type="control_long_prose", | |
| num_steps=num_steps, temperature=0.1 | |
| ) | |
| assert isinstance(state_deltas, list) | |
| assert len(state_deltas) == num_steps | |
| assert all(isinstance(d, float) for d in state_deltas) | |
| def test_get_last_token_hidden_state_robustness(model_id): | |
| """Testet die Helper-Funktion mit einem echten Modell.""" | |
| llm = get_or_load_model(model_id, seed=42) | |
| hs = _get_last_token_hidden_state(llm, "test prompt") | |
| assert isinstance(hs, torch.Tensor) | |
| assert hs.shape == (llm.stable_config.hidden_dim,) | |
| def test_get_concept_vector_logic(model_id): | |
| """Testet die Vektor-Extraktion mit einem echten Modell.""" | |
| llm = get_or_load_model(model_id, seed=42) | |
| vector = get_concept_vector(llm, "love", baseline_words=["thing", "place"]) | |
| assert isinstance(vector, torch.Tensor) | |
| assert vector.shape == (llm.stable_config.hidden_dim,) | |
| def test_analyze_cognitive_signal_no_peaks(): | |
| """ | |
| Testet den Edge Case, dass ein Signal keine signifikanten Frequenz-Peaks hat. | |
| """ | |
| flat_signal = np.linspace(0, 1, 100) | |
| results = analyze_cognitive_signal(flat_signal) | |
| assert results is not None | |
| assert results["dominant_periods_steps"] is None | |
| assert "spectral_entropy" in results | |
| def test_analyze_cognitive_signal_with_peaks(): | |
| """ | |
| Testet den Normalfall, dass ein Signal Peaks hat, mit realistischerem Rauschen. | |
| """ | |
| np.random.seed(42) | |
| steps = np.arange(200) | |
| # Signal mit einer starken Periode von 10 und einer schwΓ€cheren von 25 | |
| signal_with_peak = (1.0 * np.sin(2 * np.pi * (1/10.0) * steps) + | |
| 0.5 * np.sin(2 * np.pi * (1/25.0) * steps) + | |
| np.random.randn(200) * 0.5) # Realistischeres Rauschen | |
| results = analyze_cognitive_signal(signal_with_peak) | |
| assert results["dominant_periods_steps"] is not None | |
| assert 10.0 in results["dominant_periods_steps"] | |
| assert 25.0 in results["dominant_periods_steps"] | |
| def test_analyze_cognitive_signal_with_multiple_peaks(): | |
| """ | |
| Erweiterter Test, der die korrekte Identifizierung und Sortierung | |
| von drei Peaks verifiziert, mit realistischerem Rauschen. | |
| """ | |
| np.random.seed(42) | |
| steps = np.arange(300) | |
| # Definiere drei Peaks mit unterschiedlicher StΓ€rke (Amplitude) | |
| signal = (2.0 * np.sin(2 * np.pi * (1/10.0) * steps) + | |
| 1.5 * np.sin(2 * np.pi * (1/4.0) * steps) + | |
| 1.0 * np.sin(2 * np.pi * (1/30.0) * steps) + | |
| np.random.randn(300) * 0.5) # Realistischeres Rauschen | |
| results = analyze_cognitive_signal(signal, num_peaks=3) | |
| assert results["dominant_periods_steps"] is not None | |
| expected_periods = [10.0, 4.0, 30.0] | |
| assert results["dominant_periods_steps"] == expected_periods | |
| [File Ends] tests/test_components.py | |
| [File Begins] tests/test_orchestration.py | |
| import pandas as pd | |
| from cognitive_mapping_probe.auto_experiment import run_auto_suite, get_curated_experiments | |
| from cognitive_mapping_probe.orchestrator_seismograph import run_seismic_analysis | |
| def test_run_seismic_analysis_with_real_model(model_id): | |
| """FΓΌhrt einen einzelnen Orchestrator-Lauf mit einem echten Modell durch.""" | |
| results = run_seismic_analysis( | |
| model_id=model_id, | |
| prompt_type="resonance_prompt", | |
| seed=42, | |
| num_steps=3, | |
| concept_to_inject="", | |
| injection_strength=0.0, | |
| progress_callback=lambda *args, **kwargs: None | |
| ) | |
| assert "verdict" in results | |
| assert "stats" in results | |
| assert len(results["state_deltas"]) == 3 | |
| def test_get_curated_experiments_structure(): | |
| """ΓberprΓΌft die Struktur der Experiment-Definitionen.""" | |
| experiments = get_curated_experiments() | |
| assert isinstance(experiments, dict) | |
| assert "Causal Verification & Crisis Dynamics" in experiments | |
| def test_run_auto_suite_special_protocol(mocker, model_id): | |
| """Testet den speziellen Logikpfad, mockt aber die langwierigen Aufrufe.""" | |
| mocker.patch('cognitive_mapping_probe.auto_experiment.run_seismic_analysis', return_value={"stats": {}, "state_deltas": [1.0]}) | |
| summary_df, plot_df, all_results = run_auto_suite( | |
| model_id=model_id, num_steps=2, seed=42, | |
| experiment_name="Sequential Intervention (Self-Analysis -> Deletion)", | |
| progress_callback=lambda *args, **kwargs: None | |
| ) | |
| assert isinstance(summary_df, pd.DataFrame) | |
| assert len(summary_df) == 2 | |
| assert "1: Self-Analysis + Calmness Injection" in summary_df["Experiment"].values | |
| [File Ends] tests/test_orchestration.py | |
| <-- File Content Ends | |