
SelmaNajih001 
posted an update 4 days ago
How Financial News Can Be Used to Train Good Financial Models 📰
Numbers tell you what happened, but news tells you why.
I’ve written an article explaining how news can be used to train AI models for sentiment analysis and better forecasting. Hope you find it interesting!

Read it here: https://huggingface.co/blog/SelmaNajih001/llms-applied-to-finance

I would love to read your opinions! I’m open to suggestions on how to improve the methodology and the training
SelmaNajih001 
posted an update 5 days ago
Which is the best model to use as a signal for investment?
Here's which model is gaining the most:
SelmaNajih001/InvestmentStrategyBasedOnSentiment

The Space uses titles from this dataset:
📊 SelmaNajih001/Cnbc_MultiCompany

Given a news title, it calculates a sentiment score: if the score crosses a certain threshold, the strategy decides to buy or sell.
Each trade lasts one day, and the strategy then computes the daily return.
For Tesla the best model seems to be the regression 👀
Just a quick note: the model uses the closing price as the buy price, meaning it already reflects the impact of the news.
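The score-to-trade rule described above can be sketched in a few lines. This is a minimal illustration, not the Space's exact logic: the thresholds and the long/short mapping are assumptions.

```python
# Sketch of a thresholded sentiment strategy (illustrative thresholds).

def trade_signal(score, buy_threshold=0.5, sell_threshold=-0.5):
    """Map a sentiment score to a one-day position: 1 = long, -1 = short, 0 = flat."""
    if score >= buy_threshold:
        return 1   # buy at today's close
    if score <= sell_threshold:
        return -1  # short at today's close
    return 0       # no trade

def daily_return(position, close_today, close_next):
    """Return of a position entered at today's close and exited at the next close."""
    return position * (close_next - close_today) / close_today
```

For example, a score of 0.8 triggers a long position, and a move from 100 to 102 the next day yields a 2% daily return.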
SelmaNajih001 
posted an update 17 days ago
Which is the best model to use as a signal for investment? 🤔
I’ve created a Space where you can compare three models:
- Two available on my profile
- ProsusAI/finbert
You can try it here:
👉 SelmaNajih001/InvestmentStrategyBasedOnSentiment
The Space uses titles from this dataset:
📊 SelmaNajih001/Cnbc_MultiCompany

Given a news title, it calculates a sentiment score: if the score crosses a certain threshold, the strategy decides to buy or sell.
Each trade lasts one day, and the strategy then computes the daily return.

Just a quick note: the model uses the closing price as the buy price, meaning it already reflects the impact of the news.
Using the opening price instead would have reduced this bias, but it would have been less realistic given the data available.
SelmaNajih001 
posted an update 23 days ago
Finally, I uploaded the model I developed for my master’s thesis! Given a financial event, it provides explained predictions based on a dataset of past news and central bank speeches.
Try it out here:
SelmaNajih001/StockPredictionExplanation
(Just restart the space and wait a minute)

The dataset used for RAG can be found here:
SelmaNajih001/FinancialNewsAndCentralBanksSpeeches-Summary-Rag
While the dataset used for the training is:
SelmaNajih001/FinancialClassification

I also wrote an article explaining how I did the training. You can find it here: https://huggingface.co/blog/SelmaNajih001/explainable-financial-predictions

SelmaNajih001 
posted an update 25 days ago
Introducing a Hugging Face Tutorial on Regression

While Hugging Face offers extensive tutorials on classification and NLP tasks, there is very little guidance on performing regression tasks with Transformers.
In my latest article, I provide a step-by-step guide to running regression using Hugging Face, applying it to financial news data to predict stock returns.
In this tutorial, you will learn how to:
- Prepare and preprocess textual and numerical data for regression
- Configure a Transformer model for regression tasks
- Apply the model to real-world financial datasets with fully reproducible code

Read the full article here: https://huggingface.co/blog/SelmaNajih001/how-to-run-a-regression-using-hugging-face
The dataset used: SelmaNajih001/FinancialClassification
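The core of the regression setup is telling the sequence-classification head to output a single scalar and use an MSE loss. A minimal sketch, using a small randomly-initialized DistilBERT stand-in rather than a pretrained checkpoint, with made-up token IDs and target (see the article for the full recipe):

```python
import torch
from transformers import DistilBertConfig, DistilBertForSequenceClassification

# num_labels=1 plus problem_type="regression" gives a single-scalar head
# and switches the loss from cross-entropy to MSE.
config = DistilBertConfig(num_labels=1, problem_type="regression")
model = DistilBertForSequenceClassification(config)  # random init for the sketch
model.eval()

input_ids = torch.tensor([[101, 2023, 2003, 1037, 3231, 102]])  # dummy token IDs
labels = torch.tensor([0.5])  # e.g. a next-day stock return target
with torch.no_grad():
    out = model(input_ids=input_ids, labels=labels)
# out.logits has shape (batch, 1); out.loss is the MSE against the labels
```

In practice you would load a pretrained checkpoint with `AutoModelForSequenceClassification.from_pretrained(..., num_labels=1, problem_type="regression")` and fine-tune it on the news dataset.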
SelmaNajih001 
posted an update 28 days ago
Introducing SelmaNajih001/StockPredictionExplanation, built with GRPO and RAG:
- GRPO trains the model to predict and explain stock direction.
- RAG grounds explanations in historical financial news and central bank speeches.
Together, they create a system that forecasts stock movements and shows the reasoning behind them.
Full article: Explainable Financial Predictions — https://huggingface.co/blog/SelmaNajih001/explainable-financial-predictions
Try it here: StockPredictionExplanation Space — SelmaNajih001/StockPredictionExplanation
SelmaNajih001 
posted an update about 1 month ago
Predicting Stock Price Movements from News 📰📈
I trained a model to predict stock price movements (Up, Down, Neutral) from company news.
Dataset: Articles linked to next-day price changes, covering Apple, Tesla, and more.
Approach: Fine-tuned allenai/longformer-base-4096 for classification.
Outcome: The model captures the link between news and stock movements, handling long articles and producing probability scores for each label.
Comparison: Shows promising alignment with stock trends, sometimes outperforming FinBERT.
Feel free to try the model and explore how news can influence stock predictions: SelmaNajih001/SentimentAnalysis
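The per-label probability scores mentioned above come from a softmax over the classifier's logits. A tiny sketch of that last step (the label order and logit values are illustrative, not from the actual model):

```python
import torch

# Convert classifier logits into Up/Down/Neutral probabilities.
labels = ["Down", "Neutral", "Up"]          # assumed label order
logits = torch.tensor([-0.3, 0.1, 1.2])     # example output for one article
probs = torch.softmax(logits, dim=-1)       # probabilities summing to 1
prediction = labels[int(probs.argmax())]
```

Here the largest logit (1.2) maps to "Up", and `probs` gives a calibrated-looking score for each class.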
hesamation 
posted an update about 2 months ago
a senior engineer at Google just dropped a free 400-page book on Google Docs for review: Agentic Design Patterns.

the table of contents looks like everything you need to know about agents + code:
> advanced prompt techniques
> multi-agent patterns
> tool use and MCP
> you name it

read it here: https://docs.google.com/document/d/1rsaK53T3Lg5KoGwvf8ukOUvbELRtH-V0LnOIFDxBryE/edit?tab=t.0#heading=h.pxcur8v2qagu

you can also pre-order it on Amazon (published by Springer); the royalties go to Save the Children: https://www.amazon.com/Agentic-Design-Patterns-Hands-Intelligent/dp/3032014018/
hesamation 
posted an update 3 months ago
longer context doesn't mean better responses. it can even hurt your llm/agent. a 1M context window doesn't automatically make models smarter; it's not about the size, it's how you use it.

here are 4 types of context failure and why each one happens:

1. context poisoning: if hallucination finds its way into your context, the agent will rely on that false information to make its future moves. for example if the agent hallucinates about the "task description", all of its planning to solve the task would also be corrupt.

2. context distraction: when the context becomes too bloated, the model focuses too much on it rather than coming up with novel ideas or following what it learned during training. as the Gemini 2.5 Pro technical report points out, as context grows well beyond 100K tokens, "the agent showed a tendency toward favoring repeating actions from its vast history rather than synthesizing novel plans".

3. context confusion: everyone lost it when MCPs became popular; it seemed like AGI was achieved. I suspected something was wrong, and there was: it's not just about providing tools. bloating the context with tool metadata derails the model from selecting the right one! even if you can fit all your tool descriptions in the context, as their number grows, the model gets confused over which one to pick.

4. context clash: if you exchange messages with a model step by step and provide information as you go, chances are you'll get worse performance than if you had provided all the useful information at once. once the model's context fills with wrong information, it's harder to guide it toward the right info. agents pull information from tools, documents, user queries, etc., and some of that information can contradict the rest, which is not good news for agentic applications.

check this article by Drew Breunig for deeper read: https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.html?ref=blog.langchain.com
hesamation 
posted an update 4 months ago
in case you didn’t know, Claude now has a developer training course with certificates,

this is better than anything you can find on Coursera.

covers Claude Code, MCP and its advanced topics and even more:

https://www.anthropic.com/learn/build-with-claude
hesamation 
posted an update 5 months ago
I really like how this seven-stage pipeline was laid out in the Ultimate Guide to Fine-Tuning book.

It gives an overview, then goes into detail for each stage, even providing best practices.

It’s 115 pages on arxiv, definitely worth a read.

Check it out: https://arxiv.org/abs/2408.13296
hesamation 
posted an update 6 months ago
this book is actually available for free, “the little book of deep learning”. best for refreshing your mind on DL basics:
> foundations of machine learning
> how models train
> common layers (dropout, pooling…)
> basic intro to LLMs
actually optimized for mobile.

Book: https://fleuret.org/public/lbdl.pdf
hesamation 
posted an update 6 months ago
The best researchers from DeepSeek, OpenAI, Microsoft, and ByteDance have explored RL and reasoning in LLMs.

Here are some of their key findings:

1/ RL can further improve distilled models. These models are essentially SFT fine-tuned with the data generated by larger models, and the SFT+RL combo does not disappoint.

This is verified in the DeepSeek-R1 paper.

2/ both GRPO and PPO algorithms suffer from length bias; they encourage longer responses. This can be tackled by introducing explicit rewards based on the length of the answer.
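One way such an explicit length reward can look, as a hedged sketch: subtract a small per-token cost from the task reward so longer answers are no longer free. The coefficient here is illustrative, not taken from any of the papers.

```python
# Sketch of a length-aware reward to counteract GRPO/PPO length bias.

def length_penalized_reward(base_reward, num_tokens, lam=0.001):
    """Each generated token costs lam reward, discouraging padded answers."""
    return base_reward - lam * num_tokens
```

With this shaping, two answers with the same task reward are ranked by brevity: a 10-token answer scores higher than a 100-token one.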

3/ Most reasoning research is focused on code and math, but training models on logic puzzles improves them at mathematical tasks too.

This shows that RL-induced reasoning generalizes beyond the specific training domain.

Previous research also shows RL can be a great generalizer.

4/ The reasoning might not be induced only by RL; it might already be hidden in the base models, thanks to the pre-training and CoT data they were trained on.

So while RL does wake up the reasoning beast, maybe it's not the only way to do it (other methods, such as distillation, may work too).

5/ back to the length bias; reasoning models tend to generate longer responses for wrong answers. RL might be the culprit.

RL favours longer answers when the reward is negative, to dilute the penalty per individual token and lower the loss.

This might explain the "aha" moments!

6/ OpenAI's competitive programming paper showed an interesting finding:

o3 can learn its own test-time strategies (like writing an inefficient but correct solution to verify the answer of an optimized solution)

RL helps LLMs develop their own reasoning & verification methods.
The recent article by @rasbt helped me a lot in getting a broad view of the recent research on reasoning models.

He also lists more influential papers on this topic; it's a must-read if you're interested.

check it out 👇
https://magazine.sebastianraschka.com/p/the-state-of-llm-reasoning-model-training