arXiv:2510.14242

Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs

Published on Oct 16, 2025

AI-generated summary

Flip-Flop Consistency ($F^2C$) enhances Large Language Model (LLM) robustness to prompt variations through unsupervised training, improving agreement, performance, and generalization.

Abstract

Large Language Models (LLMs) often produce inconsistent answers when faced with different phrasings of the same prompt. In this paper, we propose Flip-Flop Consistency ($F^2C$), an unsupervised training method that improves robustness to such perturbations. $F^2C$ is composed of two key components. The first, Consensus Cross-Entropy (CCE), uses a majority vote across prompt variations to create a hard pseudo-label. The second is a representation alignment loss that pulls lower-confidence and non-majority predictors toward the consensus established by high-confidence, majority-voting variations. We evaluate our method on 11 datasets spanning four NLP tasks, with 4-15 prompt variations per dataset. On average, $F^2C$ raises observed agreement by 11.62%, improves mean $F_1$ by 8.94%, and reduces performance variance across formats by 3.29%. In out-of-domain evaluations, $F^2C$ generalizes effectively, increasing $F_1$ and agreement while decreasing variance across most source-target pairs. Finally, when trained on only a subset of prompt perturbations and evaluated on held-out formats, $F^2C$ consistently improves both performance and agreement while reducing variance. These findings highlight $F^2C$ as an effective unsupervised method for enhancing LLM consistency, performance, and generalization under prompt perturbations. Code is available at https://github.com/ParsaHejabi/Flip-Flop-Consistency-Unsupervised-Training-for-Robustness-to-Prompt-Perturbations-in-LLMs.
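
The two components can be sketched compactly. Below is a minimal PyTorch illustration for a single input with V prompt variations and a classification head; the confidence threshold, the mean-of-anchors consensus representation, the MSE distance, and the weight `alpha` are illustrative assumptions, not the paper's exact formulation (see the linked repository for the reference implementation).

```python
# Minimal sketch of the two F^2C losses described in the abstract.
# Threshold, anchor construction, distance metric, and weighting are
# assumptions for illustration, not the authors' exact design.
import torch
import torch.nn.functional as F

def f2c_losses(logits, reps, conf_threshold=0.9, alpha=1.0):
    """
    logits: (V, C) class logits, one row per prompt variation.
    reps:   (V, D) hidden representations, one row per variation.
    Returns Consensus Cross-Entropy plus the alignment loss.
    """
    probs = logits.softmax(dim=-1)       # (V, C) per-variation distributions
    preds = probs.argmax(dim=-1)         # (V,)  predicted class per variation
    conf = probs.max(dim=-1).values      # (V,)  prediction confidence

    # Majority vote across prompt variations -> hard pseudo-label.
    majority = torch.mode(preds).values

    # Consensus Cross-Entropy (CCE): train every variation toward the
    # majority-vote pseudo-label.
    cce = F.cross_entropy(logits, majority.expand(logits.size(0)))

    # Representation alignment: the consensus anchor is the (detached)
    # mean representation of high-confidence, majority-voting variations;
    # lower-confidence and non-majority variations are pulled toward it.
    anchor_mask = (preds == majority) & (conf >= conf_threshold)
    if anchor_mask.any() and (~anchor_mask).any():
        anchor = reps[anchor_mask].mean(dim=0).detach()  # stop-grad: pull is one-way
        pulled = reps[~anchor_mask]
        align = F.mse_loss(pulled, anchor.expand_as(pulled))
    else:
        align = logits.new_zeros(())

    return cce + alpha * align

if __name__ == "__main__":
    # Toy usage: 5 prompt variations, 3 classes, 16-dim representations.
    torch.manual_seed(0)
    logits = torch.randn(5, 3, requires_grad=True)
    reps = torch.randn(5, 16, requires_grad=True)
    loss = f2c_losses(logits, reps)
    loss.backward()
    print(f"F^2C loss: {loss.item():.4f}")
```

In practice this loss would be computed per input over its prompt variations and summed across a batch; because the training signal comes entirely from agreement among the model's own predictions, no gold labels are required.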
