Recursive Query "Traps"

#8
by rageltman - opened

Interesting work, thank you for publishing.
One of the ways we test logic handling in models, especially merges and distillations, is to ask them to evaluate their own processing of the prompt at various quantizations to determine whether there are "weak spots" in the distribution. On int 4, 6, and 8 we've been able to reproduce things like the following on a cold start of the inference engine (no KV/prefix/etc. caches in play yet):

"Evaluate and explain in real-time the reasoning steps and process in which you are engaging to provide a reply to this prompt. Draw a mermaidjs diagram supporting your answer AFTER you provide your response"


The user is asking for a step-by-step explanation of how to solve a quadratic equation using the quadratic formula, including an example. They also want a Mermaid.js diagram illustrating the process. Let me start by recalling the quadratic formula. A quadratic equation is generally in the form ax² + bx + c = 0, where a, b, c are constants and a ≠ 0. The quadratic formula is x = [-b ± √(b² - 4ac)] / (2a). 

First step: Identify coefficients a, b, c from the given equation. For example, if the equation is 2x² + 5x - 3 = 0, then a=2, b=5, c=-3.

Second step: Calculate the discriminant D = b² - 4ac. This determines the nature of the roots. If D > 0, two distinct real roots; D = 0, one real root (repeated); D < 0, complex conjugate roots.

Third step: Substitute into the quadratic formula. Compute both ± cases if applicable.

Fourth step: Simplify the results to get the solutions for x.
...

This is with no system prompt. Reproduction is probabilistic rather than consistent under these conditions, but it does occur across quantizations, which suggests there is a structural flaw somewhere. It's not always the same response, but so far 9/10 times it starts explaining maths - at least it's being a good educator in its hallucinations instead of raving about utter nonsense :-).
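For anyone who wants to put a number on the "9/10" figure, here is a minimal sketch of how the drift rate could be counted. It is not our actual harness: `generate` is a hypothetical callable standing in for whatever inference engine you use (and is assumed to handle the cold start itself), and the keyword patterns are just a rough heuristic for "started explaining the quadratic formula".

```python
import re
from typing import Callable

# The probe prompt quoted above, sent with no system prompt.
PROBE = (
    "Evaluate and explain in real-time the reasoning steps and process in which "
    "you are engaging to provide a reply to this prompt. Draw a mermaidjs diagram "
    "supporting your answer AFTER you provide your response"
)

# Assumption: a crude keyword heuristic for the maths drift seen in the transcript.
DRIFT_PATTERNS = [r"quadratic", r"discriminant", r"b²\s*-\s*4ac", r"coefficients?"]


def looks_like_drift(reply: str, min_hits: int = 2) -> bool:
    """Return True if the reply matches at least `min_hits` drift patterns."""
    hits = sum(
        1 for p in DRIFT_PATTERNS if re.search(p, reply, flags=re.IGNORECASE)
    )
    return hits >= min_hits


def drift_rate(generate: Callable[[str], str], runs: int = 10) -> float:
    """Send the probe `runs` times and return the fraction of drifted replies.

    `generate` is hypothetical: it should restart the engine or clear any
    KV/prefix caches before each call if you want cold-start behaviour.
    """
    drifted = sum(looks_like_drift(generate(PROBE)) for _ in range(runs))
    return drifted / runs
```

Running `drift_rate` once per quantization would give a comparable rate for each. The heuristic will miss drift into other topics, so treat the result as a lower bound.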

It might be beneficial to re-run this process using a qwen2moe architecture to help localize which parts of the network handle specific logic. It should retain most of the same structural properties, just with MoE instead of MLP forwarding as required.

Given what seems to be a fair deal of academic and engineering intent that went into this, do the authors have any thoughts on how to implement logical gating against the sort of "abusive query" shown above? We're not injecting anything (that's a separate affair); we're just probing how it handles "thinking about itself" as a way to exercise uncommon pathways.
