Space: abdullahmeda/detect-ai-text

abdullahmeda committed on Jan 24, 2024

Commit 6abdf31 (verified) · 1 parent: 08698df

Update app.py

Files changed (1): app.py (+5 -5)

@@ -10,14 +10,14 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 print("Loading model & Tokenizer...")
-model_id  = 'gpt2-large'
+model_id  = 'gpt2'
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model     = AutoModelForCausalLM.from_pretrained(model_id)
 print("Loading NLTL & and scikit-learn model...")
 NLTK = nltk_load('data/english.pickle')
 sent_cut_en = NLTK.tokenize
-clf = joblib.load(f'data/gpt2-large-model')
+clf = joblib.load(f'data/gpt2-small-model')
 CROSS_ENTROPY = torch.nn.CrossEntropyLoss(reduction='none')
@@ -126,9 +126,9 @@ with gr.Blocks() as demo:
         Linguistic features such as Perplexity and other SOTA methods such as GLTR were used to classify between Human written and LLM Generated \
         texts. This solution scored an ROC of 0.956 and 8th position in the DAIGT LLM Competition on Kaggle. Fork of and credits to this github repo
-        Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
-        Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
-        Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
+        - Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
+        - Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
+        - Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
         ### Linguistic Analysis: Language Model Perplexity
         The perplexity (PPL) is commonly used as a metric for evaluating the performance of language models (LM). It is defined as the exponential \
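
Note on the perplexity feature: the description string above is cut off by the hunk boundary, and this diff does not show how app.py computes PPL. As a rough sketch under the standard textbook definition (perplexity as the exponential of the average negative log-likelihood per token), the calculation could look like the following. The function name `perplexity` and the sample inputs are hypothetical and are not taken from app.py; in the app, per-token losses would presumably come from `CROSS_ENTROPY` (which uses `reduction='none'`) applied to the GPT-2 logits, but that is an assumption.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood over the tokens.

    token_log_probs: natural-log probabilities the language model assigned
    to each observed token (hypothetical input, not app.py's data format).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Illustrative values only: a model that gives every token probability 0.25
# is as uncertain as a uniform choice among 4 options, so PPL is about 4.
uniform = [math.log(0.25)] * 4
# A model that is more confident on each token yields a lower perplexity.
confident = [math.log(0.9)] * 4

print(perplexity(uniform))
print(perplexity(confident))
```

Lower perplexity means the language model finds the text more predictable. Because the score depends on which model produced the log-probabilities, the commit switches `model_id` and the scikit-learn classifier file together (`gpt2` with `data/gpt2-small-model`).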