devjas1 committed on
Commit df6f1ab · 1 Parent(s): 2906ad6

(DEPLOY/CHORE): space-deploy branch cleaned up

- Slimmed to avoid unnecessary rebuild triggers
- Rebuilt automatically by Spaces on push

MANIFEST.git DELETED
@@ -1,26 +0,0 @@
1
- 100644 cfc04f24571aecd66e900d29bd94a311bb2e1111 0 .gitignore
2
- 100644 261eeb9e9f8b2b4b0d119366dda99c6fd7d35c64 0 LICENSE
3
- 100644 1bab8241aa1dd7325c43ed937510646c9de50759 0 README.md
4
- 100644 2e5679a509eed8c11a522df1d5e7fc89f2a95da6 0 app/ui_app.py
5
- 100644 c18dd8d83ceed1806b50b0aaa46beb7e335fff13 0 backend/.gitignore
6
- 100644 a23d89493ce1a6557368b8424e00d2b0a564deeb 0 backend/inference_utils.py
7
- 100644 b1eb4466d4ef220de29438d4b32de8fea1950687 0 backend/main.py
8
- 100644 df16ea87b7dfe3601ed0aa15fa8a563549b71502 0 dashboard/app.py
9
- 100644 08c771b09d2833a50c101294e7e56f24068f1fed 0 docs/BACKEND_MIGRATION_LOG.md
10
- 100644 ededbc2c9d95767d04c0d6a63d6e9b2cd77432d9 0 docs/ENVIRONMENT_GUIDE.md
11
- 100644 9838325b68b604409bb463a449fd850b57674342 0 docs/HPC_REMOTE_SETUP.md
12
- 100644 02e4813140e43533cabe01bc2a816f3740260f76 0 docs/LICENSE
13
- 100644 56b984636170bd2178c77c4933f753e2afb8a65f 0 docs/PROJECT_TIMELINE.md
14
- 100644 f6376d986813734264e70c6cbf45fbf2c40f82c1 0 docs/REPRODUCIBILITY.md
15
- 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 models/__init__.py
16
- 100644 704266bdfe413b4b0a77879a1c66878ce6eab0dd 0 models/figure2_cnn.py
17
- 100644 9104b59d82ccd5d36ad4ec47f57e3b5ca0fc80aa 0 models/resnet_cnn.py
18
- 100644 ce43e850e5f7890d9e324e8c41e3b8f9fdc3a832 0 outputs/resnet_model.pth
19
- 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 scripts/__init__.py
20
- 100644 c1f269413e6c89dbef52221db02b5490e7e8d95d 0 scripts/discover_raman_files.py
21
- 100644 6cc73d3ce4e95109ec51583c9831e4f72b4c8a82 0 scripts/list_spectra.py
22
- 100644 136ca3d29e3406a9bed641e19c54da966068e95e 0 scripts/plot_spectrum.py
23
- 100644 c59c21dfd5359e6a3a34199088ff15199c0649f5 0 scripts/preprocess_dataset.py
24
- 100644 77267ae19fcd0fb0e11c664d451d7b7395cd3f30 0 scripts/run_inference.py
25
- 100644 a33fc333522a302cfad5e4139202f3e9cf416921 0 scripts/train_model.py
26
- 100644 5aa6bedb24c303e2fed8554e40a2bbf965616200 0 validate_pipeline.sh
 
docs/BACKEND_MIGRATION_LOG.md DELETED
@@ -1,60 +0,0 @@
1
- # BACKEND_MIGRATION_LOG.md
2
-
3
- ## 📌 Overview
4
-
5
- This document tracks the migration of the inference logic from a monolithic Streamlit app to a modular, testable FastAPI backend for the Polymer AI Aging Prediction System.
6
-
7
- ---
8
-
9
- ## ✅ Completed Work
10
-
11
- ### 1. Initial Setup
12
-
13
- - Installed `fastapi`, `uvicorn`, and set up basic FastAPI app in `main.py`.
14
-
15
- ### 2. Modular Inference Utilities
16
-
17
- - Moved `load_model()` and `run_inference()` into `backend/inference_utils.py`.
18
- - Separated model configuration for Figure2CNN and ResNet1D.
19
- - Applied proper preprocessing (resampling, normalization) inside `run_inference()`.
20
-
21
- ### 3. API Endpoint
22
-
23
- - `/infer` route accepts JSON payloads with `model_name` and `spectrum`.
24
- - Returns: full prediction dictionary with class index, logits, and label map.
25
-
26
- ### 4. Validation + Testing
27
-
28
- - Tested manually in Python REPL.
29
- - Tested via `curl`:
30
-
31
- ```bash
32
- curl -X POST -H "Content-Type: application/json" -d @backend/test_payload.json http://127.0.0.1:8000/infer  # endpoint URL assumes the uvicorn default host/port
33
- ```
34
-
35
- ---
36
-
37
- ## 🛠 Fixes & Breakpoints Resolved
38
-
39
- - ✅ Fixed incorrect model path ("models/" → "outputs/")
40
- - ✅ Corrected unpacking bug in `main.py` → now returns full result dict
41
- - ✅ Replaced invalid `tolist()` call on string-typed logits
42
- - ✅ Manually verified output from CLI and curl
43
-
44
- ---
45
-
46
- ## 🧪 Next Focus: Robustness Testing
47
-
48
- - Invalid `model_name` handling
49
- - Short/empty spectrum validation
50
- - ResNet model loading test
51
- - JSON schema validation for input
52
- - Unit tests via `pytest` or integration test runner
53
-
54
- ---
55
-
56
- ## 🔄 Future Enhancements
57
-
58
- - Modular model registry (for adding more model classes easily)
59
- - Add OpenAPI schema and example payloads for documentation
60
- - Enable batch inference or upload support
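For reference, the JSON payload shape accepted by `/infer` (per the API endpoint section above) can be sketched as a small test file; the field names come from this log, while the example values are assumptions:

```python
import json

# Hypothetical contents for backend/test_payload.json; the field names
# (model_name, spectrum) come from the endpoint description above, the
# values here are illustrative only.
payload = {
    "model_name": "figure2",
    "spectrum": [0.0, 0.12, 0.37, 0.21, 0.05],  # toy spectrum; real ones are far longer
}

# Round-trip exactly what `curl -d @backend/test_payload.json` would send.
body = json.dumps(payload)
decoded = json.loads(body)
print(body)
```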
 
docs/ENVIRONMENT_GUIDE.md DELETED
@@ -1,119 +0,0 @@
1
- # 🔧 Environment Management Guide
2
-
3
- ## AI-Driven Polymer Aging Prediction and Classification System
4
-
5
- **Maintainer:** Jaser Hasan
6
- **Snapshot:** `@artifact-isolation-complete`
7
- **Last Updated:** 2025-06-26
8
- **Environments:** Conda (local) + venv on `/scratch` (HPC)
9
-
10
- ---
11
-
12
- ## 🧠 Overview
13
-
14
- This guide describes how to set up and activate the Python environments required to run the Raman pipeline on both:
15
-
16
- - **Local Systems:** (Mac/Windows/Linux)
17
- - **CWRU Pioneer HPC:** (GPU nodes, venv based)
18
-
19
- This guide documents the environment structure and the divergence between the **local Conda environment (`polymer_env`)** and the **HPC Python virtual environment (`polymer_venv`)**.
20
-
21
- ---
22
-
23
- ## 📁 Environment Overview
24
-
25
- | Platform | Environment | Manager | Path | Notes |
26
- |----------|-------------|---------|------|-------|
27
- | Local (dev) | `polymer_env` | **Conda** | `~/miniconda3/envs/polymer_env` | Primary for day-to-day development |
28
- | HPC (Pioneer) | `polymer_venv` | **venv** (Python stdlib) | `/scratch/users/<case_id>/polymer_project/polymer_venv` | Created under `/scratch` to avoid `/home` quota limits |
29
-
30
- ---
31
-
32
- ## 💻 Local Installation (Conda)
33
-
34
- ```bash
35
-
36
- git clone https://github.com/dev-jaser/ai-ml-polymer-aging-prediction.git
37
- cd polymer_project
38
- conda env create -f environment.yml
39
- conda activate polymer_env
40
- python -c "import torch, sys; print('PyTorch:', torch.__version__, 'Python:', sys.version)"
41
- ```
42
-
43
- > **Tip:** Keep Conda updated (`conda update conda`) to reduce solver errors.
44
-
45
- ---
46
-
47
- ## 🚀 CWRU Pioneer HPC Setup (venv + pip)
48
-
49
- > Conda is intentionally **not** used on Pioneer due to prior codec and disk-quota issues.
50
-
51
- ### 1. Load Python Module
52
-
53
- ```bash
54
-
55
- module purge
56
- module load Python/3.12.3-GCCcore-13.2.0
57
- ```
58
-
59
- ### 2. Create Working Directory in `/scratch`
60
-
61
- ```bash
62
-
63
- mkdir -p /scratch/users/<case_id>/polymer_project_runtime
64
- cd /scratch/users/<case_id>/polymer_project_runtime
65
- git clone https://github.com/dev-jaser/ai-ml-polymer-aging-prediction.git
66
- ```
67
-
68
- ### 3. Create & Activate Virtual Environment
69
-
70
- ```bash
71
-
72
- python3 -m venv polymer_venv
73
- source polymer_venv/bin/activate
74
- ```
75
-
76
- ### 4. Install Dependencies
77
-
78
- ```bash
79
-
80
- pip install --upgrade pip
81
- pip install -r environment_hpc.yml # Optimized dependencies list for Pioneer
82
- ```
83
-
84
- (Optional) Save a reproducible freeze:
85
-
86
- ```bash
87
-
88
- pip freeze > requirements_hpc.txt
89
- ```
90
-
91
- ---
92
-
93
- ## ✅ Supported CLI Workflows (Raman-only)
94
-
95
- | Script | Purpose |
96
- |--------|---------|
97
| `scripts/train_model.py` | 10-fold CV training (`--model figure2` or `--model resnet`) |
98
- | `scripts/run_inference.py` | Predict single Raman spectrum |
99
- | `scripts/preprocess_dataset.py` | Apply full preprocessing chain |
100
- | `scripts/plot_spectrum.py` | Quick spectrum visualization (.png) |
101
-
102
- > FTIR-related scripts are archived and *not installed* into the active environments.
103
-
104
- ---
105
-
106
- ## 🔁 Cross-Environment Parity
107
-
108
- - Package sets in `environment.yml` and `environment_hpc.yml` are aligned.
109
- - Diagnostics JSON structure and checkpoint filenames are identical on both systems.
110
- - Training commands are copy-paste compatible between local shell and HPC login shell.
111
-
112
- ---
113
-
114
- ## 📦 Best Practices
115
-
116
- - **Local:** use Conda for rapid iteration, notebook work, small CPU inference.
117
- - **HPC:** use venv in `/scratch` for GPU training; never install large packages into `/home` (`~/`).
118
- - Keep environments lightweight; remove unused libraries to minimize rebuild time.
119
- - Update this guide if either environment definition changes.
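The cross-environment parity claim above can be spot-checked mechanically. A minimal sketch, assuming `pip freeze` output captured on each system (the freeze-based approach and the pinned versions shown are assumptions, not project tooling):

```python
# Minimal parity check between two pip-freeze style dependency lists,
# e.g. one captured locally in polymer_env and one on Pioneer.
def parse_freeze(text: str) -> dict[str, str]:
    """Map package name -> pinned version from 'pkg==ver' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" in line and not line.startswith("#"):
            name, _, ver = line.partition("==")
            pins[name.lower()] = ver
    return pins

# Toy inputs standing in for the real freeze files.
local = parse_freeze("torch==2.2.0\nnumpy==1.26.4\nscipy==1.11.4")
hpc = parse_freeze("torch==2.2.0\nnumpy==1.26.4")

missing = sorted(set(local) - set(hpc))
mismatched = sorted(k for k in local.keys() & hpc.keys() if local[k] != hpc[k])
print("missing on HPC:", missing)
print("version drift:", mismatched)
```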
 
docs/HPC_REMOTE_SETUP.md DELETED
@@ -1,111 +0,0 @@
1
- # Accessing CWRU Pioneer HPC System Remotely via SSH (PuTTY)
2
-
3
- ## Step 1: Set up DUO Authentication for VPN Access
4
-
5
- ### 1. Enroll in DUO (if not already done):
6
-
7
- > - Go to [case.edu/utech/duo](https://case.edu/utech/duo) and follow the instructions to register your device (phone/tablet/hardware token)
8
- > - This is required for FortiClient VPN authentication.
9
-
10
- ---
11
-
12
- ## Step 2: Install and Configure FortiClient VPN
13
-
14
- ### 1. Download FortiClient VPN:
15
-
16
- - Visit [case.edu/utech/help/forticlient-vpn](https://case.edu/utech/help/forticlient-vpn)
17
- - Download the **FortiClient VPN** software for your specific device.
18
-
19
- ### 2. Install & Configure VPN
20
-
21
- - Run the installer and complete setup
22
- - Open FortiClient and configure new connection:
23
- - **Connection Name**: `CWRU VPN` (or any name)
24
- - **Remote Gateway**: `vpn.case.edu`
25
- - **Customize Port**: `443`
26
- - Enable "**Save Credentials**" (optional)
27
- - Click **Save**
28
-
29
- ### 3. Connect to VPN:
30
-
31
- - Enter your **CWRU Network ID** (e.g., `jxh369`) and password.
32
- - Complete **DUO two-factor authentication** when prompted (approve via phone/device)
33
- - Once connected, you'll see a confirmation message.
34
-
35
- ---
36
-
37
- ## Step 3: Install PuTTY (SSH Client)
38
-
39
- ### 1. Download PuTTY:
40
-
41
- - If not installed, download from [https://www.putty.org](https://www.putty.org)
42
- - Run the installer (or use the portable version).
43
-
44
- ### 2. Open PuTTY:
45
-
46
- - Launch PuTTY from the Start Menu
47
-
48
- ---
49
-
50
- ## Step 4: Configure PuTTY for Pioneer HPC
51
-
52
- ### 1. Enter Connection Details:
53
-
54
- - **Host Name (or IP address)**: `pioneer.case.edu`
55
- - **Port**: `22`
56
- - **Connection Type**: SSH
57
-
58
- ### 2. Optional: Save Session (for future use):
59
-
60
- - Under "**Saved Sessions**", type `Pioneer HPC` and click **Save**
61
-
62
- ### 3. Click "Open" to initiate the connection
63
-
64
- ---
65
-
66
- ## Step 5: Log In via SSH
67
-
68
- ### 1. Enter Credentials:
69
-
70
- - When prompted, enter your **CWRU Network ID** (e.g., `jxh369`)
71
- - Enter your password (same as VPN/CWRU login)
72
- - Complete DUO authentication again if required
73
-
74
- ### 2. Successful Login:
75
-
76
- - You should now see the **Pioneer HPC command-line interface**
77
-
78
- ---
79
-
80
- ## Step 6: Disconnecting
81
-
82
- ### 1. Exit SSH Session:
83
-
84
- - Type `exit` or `logout` in the terminal
85
-
86
- ### 2. Disconnect VPN:
87
-
88
- - Close PuTTY and disconnect FortiClient VPN when done.
89
-
90
- ---
91
-
92
- ## Troubleshooting Tips
93
-
94
- ### VPN Fails?
95
-
96
- - Ensure DUO is set up correctly
97
- - Try reconnecting or restarting FortiClient VPN
98
-
99
- ### PuTTY Connection Refused?
100
-
101
- - Verify VPN is active (`vpn.case.edu` shows "**Connected**")
102
- - Check `pioneer.case.edu` and port `22` are correct
103
-
104
- ### DUO Not Prompting?
105
-
106
- - Ensure your device is registered in DUO
107
-
108
-
109
- ## Extra Help on CWRU HPC Systems
110
-
111
- [https://sites.google.com/a/case.edu/hpcc/](https://sites.google.com/a/case.edu/hpcc/)
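For users connecting from macOS/Linux (or Windows OpenSSH) rather than PuTTY, the same connection details map onto a standard `~/.ssh/config` entry; the Network ID below is the example ID used in this guide:

```
Host pioneer
    HostName pioneer.case.edu
    User jxh369
    Port 22
```

With the VPN connected, `ssh pioneer` then behaves like Steps 4–5 above.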
 
docs/LICENSE DELETED
@@ -1,21 +0,0 @@
1
- MIT License
2
-
3
- Copyright (c) 2025 dev-jaser
4
-
5
- Permission is hereby granted, free of charge, to any person obtaining a copy
6
- of this software and associated documentation files (the "Software"), to deal
7
- in the Software without restriction, including without limitation the rights
8
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
- copies of the Software, and to permit persons to whom the Software is
10
- furnished to do so, subject to the following conditions:
11
-
12
- The above copyright notice and this permission notice shall be included in all
13
- copies or substantial portions of the Software.
14
-
15
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
- SOFTWARE.
 
docs/PROJECT_TIMELINE.md DELETED
@@ -1,156 +0,0 @@
1
- # 📅 PROJECT_TIMELINE.md
2
-
3
- ## AI-Driven Polymer Aging Prediction and Classification System
4
-
5
- **Intern:** Jaser Hasan
6
-
7
- ### ✅ PHASE 1 – Project Kickoff and Faculty Guidance
8
-
9
- **Tag:** `@project-init-complete`
10
-
11
- Received first set of research tasks from Prof. Kuppannagari
12
-
13
- Received research plan
14
- - Objectives defined: download datasets, analyze spectra, implement CNN, run initial inference
15
-
16
- ---
17
-
18
- ### ✅ PHASE 2 – Dataset Acquisition (Local System)
19
-
20
- **Tag:** `@data-downloaded`
21
-
22
- - Downloaded Raman `.txt` (RDWP) and FTIR `.csv` data (polymer packaging)
23
- - Structured into:
24
- - `datasets/rdwp`
25
- - `datasets/ftir`
26
-
27
- ---
28
-
29
- ### ✅ PHASE 3 – Data Exploration & Spectral Validation
30
-
31
- **Tag:** `@data-exploration-complete`
32
-
33
- - Built plotting tools for Raman and FTIR
34
- - Validated spectrum structure, removed malformed samples
35
- - Observed structural inconsistencies in FTIR multi-layer grouping
36
-
37
- ---
38
-
39
- ### ✅ PHASE 4 – Preprocessing Pipeline Implementation
40
-
41
- **Tag:** `@data-prep`
42
-
43
- - Implemented `preprocess_dataset.py` for Raman
44
- - Applied: Resampling -> Baseline correction -> Smoothing -> Normalization
45
- - Confirmed reproducible input/output behavior and dynamic CLI control
46
-
47
- ### ✅ PHASE 5 – Figure2CNN Architecture Build
48
-
49
- **Tag:** `@figure2cnn-complete`
50
-
51
- Constructed `Figure2CNN`, modeled after the "Figure 2" CNN from the reference paper
52
- - `Figure2CNN`: 4 conv layers + 3 FC layers
53
- - Verified dynamic input length handling (e.g., 500, 1000, 4000)
54
-
55
- ---
56
-
57
- ### ✅ PHASE 6 – Local Training and Inference
58
-
59
- **Tag:** `@figure2cnn-training-local`
60
-
61
- - Trained Raman models locally (FTIR now deferred)
62
- - Canonical Raman accuracy: **87.29% ± 6.30%**
63
- - FTIR accuracy results archived and excluded from current validation
64
- - CLI tools for training, inference, plotting implemented
65
-
66
- ---
67
-
68
- ### ✅ PHASE 7 – Reproducibility and Documentation Setup
69
-
70
- **Tag:** `@project-docs-started`
71
-
72
- - Authored `README.md`, `PROJECT_REPORT.md`, and `ENVIRONMENT_GUIDE.md`
73
- - Defined reproducibility guidelines
74
- - Standardized project structure and versioning
75
-
76
- ---
77
-
78
- ### ✅ PHASE 8 – HPC Access and Venv Strategy
79
-
80
- **Tag:** `@hpc-login-successful`
81
-
82
- - Logged into CWRU Pioneer (SSH via PuTTY)
83
- Set up FortiClient VPN, which is required to access Pioneer remotely
84
- - Explored module system; selected venv over Conda for compatibility
85
- - Loaded Python 3.12.3 + created `polymer_env`
86
-
87
- ---
88
-
89
- ### ✅ PHASE 9 – HPC Environment Sync
90
-
91
- **Tag:** `@venv-alignment-complete`
92
-
93
- - Created `environment_hpc.yml`
94
- - Installed dependencies into `polymer_env`
95
- - Validated imports, PyTorch installation, and CLI script execution
96
-
97
- ---
98
-
99
- ### ✅ PHASE 10 – Full Instruction Validation on HPC
100
-
101
- **Tag:** `@prof-k-instruction-validation-complete`
102
-
103
- - Ran Raman preprocessing and plotting scripts
104
- - Executed `run_inference.py` with CLI on raw Raman `.txt` file
105
- - Verified consistent predictions and output logging across local and HPC
106
-
107
- ---
108
-
109
- ### ✅ PHASE 11 – FTIR Path Paused, Raman Declared Primary
110
-
111
- **Tag:** `@raman-pipeline-focus-milestone`
112
-
113
- - FTIR modeling formally deferred
114
- - FTIR preprocessing scripts preserved and archived for future use
115
- - All resources directed toward Raman pipeline finalization
116
- - Saliency, FTIR ingestion, and `train_ftir_model.py` archived
117
-
118
- ---
119
-
120
- ### ✅ PHASE 12 – ResNet1D Prototyping & Benchmark Setup
121
-
122
- **Tag:** `@resnet-prototype-complete`
123
-
124
- - Built `ResNet1D` architecture in `models/resnet_cnn.py`
125
- - Integrated `train_model.py` via `--model resnet`
126
- - Ran initial CV training with successful results
127
-
128
- ---
129
-
130
- ### ✅ PHASE 13 – Output Artifact Isolation
131
-
132
- **Tag:** `@artifact-isolation-complete`
133
-
134
- - Patched `train_model.py` to save:
135
- - `figure2_model.pth`, `resnet_model.pth`
136
- `raman_figure2_diagnostics.json`, `raman_resnet_diagnostics.json`
137
- - Prevented all overwrites by tying output filenames to `args.model`
138
- - Snapshotted as reproducibility milestone. Enabled downstream validation harness.
139
-
140
- ### ✅ PHASE 14 – Canonical Validation Achieved
141
-
142
- **Tag:** `@validation-loop-complete`
143
-
144
- Created `validate_pipeline.sh` to verify preprocessing, training, inference, and plotting
145
- - Ran full validation using `Figure2CNN` with reproducible CLI config
146
- All outputs verified: logs, artifacts, predictions, plots
147
- - Declared Raman pipeline scientifically validated and stable
148
-
149
- ---
150
-
151
- ### ⏭️ NEXT - Results Analysis & Finalization
152
-
153
- - Analyze logged diagnostics for both models
154
- - Conduct optional hyperparameter tuning (batch size, LR)
155
- - Begin deliverable prep: visuals, posters, cards
156
- Resume FTIR work only after the Raman path is fully stabilized and documented and the open FTIR conceptual error is resolved
 
docs/REPRODUCIBILITY.md DELETED
@@ -1,132 +0,0 @@
1
- # 📚 REPRODUCIBILITY.md
2
-
3
- *AI-Driven Polymer Aging Prediction & Classification System*
4
- *(Canonical Raman-only Pipeline)*
5
-
6
- > **Purpose**
7
- > A single document that lets any new user clone the repo, acquire the dataset, recreate the Conda environment, and generate the validated Raman pipeline artifacts.
8
-
9
- ---
10
-
11
- ## 1. System Requirements
12
-
13
- | Component | Minimum Version | Notes |
14
- |-----------|-----------------|-------|
15
- | Python | 3.10+ | Conda recommended |
16
- | Git | 2.30+ | Any modern version |
17
- | Conda | 23.1+ | Mamba also fine |
18
- | OS | Linux / MacOS / Windows | CPU run (no GPU needed) |
19
- | Disk | ~1 GB | Dataset + artifacts |
20
-
21
- ---
22
-
23
- ## 2. Clone Repository
24
-
25
- ```bash
26
- git clone https://github.com/dev-jaser/ai-ml-polymer-aging-prediction.git
27
- cd ai-ml-polymer-aging-prediction
28
- git checkout main
29
- ```
30
-
31
- ---
32
-
33
- ## 3. Create & Activate Conda Environment
34
-
35
- ```bash
36
- conda env create -f environment.yml
37
- conda activate polymer_env
38
- ```
39
-
40
- > **Tip:** If you already created `polymer_env` just run `conda activate polymer_env`
41
-
42
- ---
43
-
44
- ## 4. Download RDWP Raman Dataset
45
-
46
- 1. Visit https://data.mendeley.com/datasets/kpygrf9fg6/1
47
- 2. Download the archive (**RDWP.zip** or similar) by clicking **Download All (10.3 MB)**
48
- 3. Extract all `*.txt` Raman files into:
49
-
50
- ```bash
51
- ai-ml-polymer-aging-prediction/datasets/rdwp
52
- ```
53
-
54
- 4. Quick sanity check:
55
-
56
- ```bash
57
- ls datasets/rdwp | grep ".txt" | wc -l  # -> 170+ files expected
58
- ```
59
-
60
- ---
61
-
62
- ## 5. Validate the Entire Pipeline
63
-
64
- Run the canonical smoke-test harness:
65
-
66
- ```bash
67
- ./validate_pipeline.sh
68
- ```
69
-
70
- Successful run prints:
71
-
72
- ```bash
73
- [PASS] Preprocessing
74
- [PASS] Training & artifacts
75
- [PASS] Inference
76
- [PASS] Plotting
77
- All validation checks passed!
78
- ```
79
-
80
- Artifacts created:
81
-
82
- ```bash
83
- outputs/figure2_model.pth
84
- outputs/logs/raman_figure2_diagnostics.json
85
- outputs/inference/test_prediction.json
86
- outputs/plots/validation_plot.png
87
- ```
88
-
89
- ---
90
-
91
- ## 6. Optional: Train ResNet Variant
92
-
93
- ```bash
94
- python scripts/train_model.py --model resnet --target-len 4000 --baseline --smooth --normalize
95
- ```
96
-
97
- Check that these exist now:
98
-
99
- ```bash
100
- outputs/resnet_model.pth
101
- outputs/logs/raman_resnet_diagnostics.json
102
- ```
103
-
104
- ---
105
-
106
- ## 7. Clean-up & Re-Run
107
-
108
- To re-run from a clean state:
109
-
110
- ```bash
111
- rm -rf outputs/*
112
- ./validate_pipeline.sh
113
- ```
114
-
115
- All artifacts will be regenerated.
116
-
117
- ---
118
-
119
- ## 8. Troubleshooting
120
-
121
- | Symptom | Likely Cause | Fix |
122
- |---------|--------------|-----|
123
- | `ModuleNotFoundError` during scripts | `conda activate polymer_env` not done | Activate env |
124
- | `CUDA not available` warning | Running on CPU | Safe to ignore |
125
- | Fewer than 170 files in `datasets/rdwp` | Incomplete extract | Re-download archive |
126
- | `validate_pipeline.sh: Permission denied` | Missing executable bit | `chmod +x validate_pipeline.sh` |
127
-
128
- ---
129
-
130
- ## 9. Contact
131
-
132
- For issues or questions, open an issue in the GitHub repo or contact @dev-jaser.
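The dataset sanity check in Section 4 can also be done from Python; a small sketch (the 170-file expectation comes from this guide, and the demo below uses a temporary stand-in directory rather than the real `datasets/rdwp`):

```python
import tempfile
from pathlib import Path

def count_raman_files(root: str) -> int:
    """Count RDWP .txt spectra directly under the dataset directory."""
    return len(list(Path(root).glob("*.txt")))

# Demo against a temporary stand-in directory; real runs would pass
# "datasets/rdwp" and expect 170+ files after a full extract.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        (Path(tmp) / f"wea-{i}.txt").write_text("100.0 1234.5\n")
    n = count_raman_files(tmp)
    print(f"{n} spectra found")
```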
 
docs/sprint_log.md DELETED
@@ -1,31 +0,0 @@
1
- # Sprint Log
2
-
3
- ## @model-expansion-preflight-2025-08-21
4
- **Goal:** Reinforce training script contracts and registry hook without behavior changes.
5
- **Changes:**
6
- - Reproducibility seeds (python/numpy/torch/cuda).
7
- - Optional cuDNN deterministic settings.
8
- Typo fix: "Reseample" -> "Resample".
9
- - Diagnostics fix: per-fold accuracy logs use correct variable.
10
- - Explicit dtypes in TensorDataset (float32/long).
11
- **Tests:**
12
- - Preprocess: ✅
13
- - Train (figure2, 1 epoch): ✅
14
- - Inference smoke: ✅
15
- **Notes:** Baseline intact; high CV variance due to class imbalance recorded for later mitigation.
16
-
17
- ## @model-expansion-registry-2025-08-21
18
- **Goal:** Make model lookup a single source of truth and expose dynamic choices for CLI/infra.
19
- **Changes:**
20
- - Added `models/registry.py` with `choices()` and `build()` helpers.
21
- `scripts/train_model.py` imports the registry, using `choices()` for argparse and `build()` for construction.
22
- - Removed direct model selection logic from training script.
23
- **Tests:**
24
- - Train (figure2) via registry: ✅
25
- - Inference unchanged paths: ✅
26
- **Notes:** Artifacts remain `outputs/{model}_model.pth` to avoid breaking validator; inference arch flag to be added next.
27
- ## @model-expansion-resnet18vision-2025-08-21
28
- **Goal:** Introduce a second architecture and prove multi-model training/inference via shared registry.
29
- **Changes:** `models/resnet18_vision.py` (1D), registry entry, `run_inference.py --arch`.
30
- **Tests:** Train (1 epoch) -> `outputs/resnet18vision_model.pth`; Inference JSON ✅
31
- **Notes:** Backward compatibility preserved (`--arch` defaults to figure2).
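A minimal sketch of the `models/registry.py` contract described in the entries above (`choices()` for argparse, `build()` for construction); the stand-in model classes and their constructor signatures are assumptions:

```python
# Hypothetical single-source-of-truth model registry, mirroring the
# choices()/build() helpers described in the sprint entries above.

class Figure2CNN:  # stand-in for the real model class
    def __init__(self, input_len: int = 500):
        self.input_len = input_len

class ResNet1D:  # stand-in for the real model class
    def __init__(self, input_len: int = 500):
        self.input_len = input_len

_REGISTRY = {
    "figure2": Figure2CNN,
    "resnet": ResNet1D,
}

def choices() -> list[str]:
    """Valid --model / --arch values, suitable for argparse choices=."""
    return sorted(_REGISTRY)

def build(name: str, **kwargs):
    """Construct a model by registry key; raises on unknown names."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown model {name!r}; expected one of {choices()}") from None
    return cls(**kwargs)

model = build("figure2", input_len=500)
print(type(model).__name__)
```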
 
validate_pipeline.sh DELETED
@@ -1,67 +0,0 @@
1
- #!/usr/bin/env bash
2
- # ===========================================
3
- # validate_pipeline.sh — Canonical Smoke Test
4
- # AI-Driven Polymer Aging Prediction System
5
- # Requires: conda (or venv) already installed
6
- # ===========================================
7
-
8
- set -euo pipefail
9
- RED='\033[0;31m'
10
- GRN='\033[0;32m'
11
- YLW='\033[1;33m'
12
- NC='\033[0m'
13
-
14
- die() {
15
- echo -e "${RED}[FAIL] $1${NC}"
16
- exit 1
17
- }
18
- pass() { echo -e "${GRN}[PASS] $1${NC}"; }
19
-
20
- echo -e "${YLW}>>> Activating environment...${NC}"
21
- source "$(conda info --base)/etc/profile.d/conda.sh"
22
- conda activate polymer_env || die "conda env 'polymer_env' not found"
23
-
24
- root_dir="$(dirname "$(readlink -f "$0")")"
25
- cd "$root_dir" || die "repo root not found"
26
-
27
- # ---------- Step 1: Preprocessing ----------
28
- echo -e "${YLW}>>> Step 1: Preprocessing${NC}"
29
- python scripts/preprocess_dataset.py datasets/rdwp \
30
- --target-len 500 --baseline --smooth --normalize |
31
- grep -q "X shape:" || die "preprocess_dataset.py failed"
32
- pass "Preprocessing"
33
-
34
- # ---------- Step 2: CV Training (Figure2) ----------
35
- mkdir -p outputs outputs/logs || true
36
- # Optional: skip gracefully if dataset is not present
37
- if [ ! -d "datasets/rdwp" ] || [ -z "$(find datasets/rdwp -maxdepth 1 -name '*.txt' 2>/dev/null)" ]; then
38
- echo -e "${YLW}[SKIP] Training (no datasets/rdwp/*.txt found)${NC}"
39
- else
40
- echo -e "${YLW}>>> Step 2: 10-Fold CV Training${NC}"
41
- python scripts/train_model.py \
42
- --target-len 500 --baseline --smooth --normalize \
43
- --model figure2
44
- [[ -f outputs/figure2_model.pth ]] || die "model .pth not found"
45
- [[ -f outputs/logs/raman_figure2_diagnostics.json ]] || die "diagnostics JSON not found"
46
- pass "Training & artifacts"
47
- fi
48
-
49
- # ---------- Step 3: Inference ----------
50
- echo -e "${YLW}>>> Step 3: Inference${NC}"
51
- python scripts/run_inference.py \
52
- --target-len 500 \
53
- --input datasets/rdwp/wea-100.txt \
54
- --model outputs/figure2_model.pth \
55
- --output outputs/inference/test_prediction.json
56
- [[ -f outputs/inference/test_prediction.json ]] || die "inference output missing"
57
- pass "Inference"
58
-
59
- # ---------- Step 4: Spectrum Plot ----------
60
- echo -e "${YLW}>>> Step 4: Plot Spectrum${NC}"
61
- mkdir -p outputs/inference || true
62
- python scripts/plot_spectrum.py --input datasets/rdwp/sta-10.txt || die "plot_spectrum.py failed"
64
- pass "Plotting"
65
-
66
-
67
- echo -e "${GRN}All validation checks passed!${NC}"