LLM trading - sentiment analysis comparison #61

kaidatavis · 2023-11-20T23:31:33Z

No description provided.

kaidatavis · 2023-11-20T23:34:16Z

Sentiment benchmark

(please create a table to compare the different options following the format in #59 and #60)

kaidatavis · 2023-11-20T23:37:17Z

Sentiment performance of different LLMs

See #62 for a list of LLMs.

Please create a table to compare the different options following the format in #59 and #60.

Please fill the table with 1) information and 2) a link to the source if available.

erwin27 · 2023-11-22T18:18:13Z

LLMs Benchmark on Financial News Dataset

Test dataset: Financial Phrasebank
English sentences from financial news, each classified as positive, negative, or neutral by researchers knowledgeable in the finance domain.

This benchmark uses 1,000 sample records (random seed = 42) from the "sentence_allagree" subset.
The label distribution of the test set is the same as that of the subset population (Neutral: 61.4%, Positive: 25.2%, Negative: 13.4%).
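The post does not say how the 1,000 records were drawn, only that seed 42 was used and that the label shares match the subset. One way to get that property is a stratified draw per label. The sketch below is an assumption about the method, not the code used for this benchmark; `stratified_sample` and the `(sentence, label)` record shape are hypothetical names chosen for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, n, seed=42):
    """Draw about n (sentence, label) records while keeping each label's share.

    Hypothetical helper: the per-label quota is rounded, so the result can
    differ from n by a record or two when the shares do not divide evenly.
    """
    rng = random.Random(seed)  # fixed seed makes the draw repeatable
    by_label = defaultdict(list)
    for record in records:
        by_label[record[1]].append(record)

    total = len(records)
    sample = []
    for label in sorted(by_label):  # sorted so the draw order is deterministic
        group = by_label[label]
        quota = round(n * len(group) / total)
        sample.extend(rng.sample(group, quota))

    rng.shuffle(sample)  # avoid leaving the records grouped by label
    return sample
```

Calling it twice with the same seed returns the same records, which is what makes a benchmark run comparable across models.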

Dataset examples:

Sentence	Label
The mall is part of the Baltic Pearl development project in the city of St Petersburg , where Baltic Pearl CJSC , a subsidiary of Shanghai Foreign Joint Investment Company , is developing homes for 35,000 people .	1 neutral
In the reporting period , net sales rose by 8 % year-on-year to EUR64 .3 m , due to the business acquisitions realized during the first half of 2008-09 , the effect of which was EUR10 .9 m in the review period .	2 positive
Pharmaceuticals group Orion Corp reported a fall in its third-quarter earnings that were hit by larger expenditures on R&D and marketing .	0 negative

The LLMs tested are from gpt4all.io (4-bit quantization).

Hardware (laptop) specs used for the test:
OS: Windows 11
CPU: 12th Gen Intel i7-12700H, 4.70 GHz
RAM: 32 GB DDR5, 4800 MHz
GPU: Nvidia RTX 3070 Ti, 8 GB
Storage: 1 TB M.2 SSD

Testing Environment
Python 3.11.5
gpt4all 2.0.2
torch 2.1.1+cu121

Zero-Shot Benchmark Results (in progress): Result Files

Model	Accuracy	Run time	Avg time/iter	Params / file size / RAM required
gpt4all-13b-snoozy-q4_0	0.3670	5 h 37 min	20.28 s	13B / 6.86 GB / 16 GB
nous-hermes-llama2-13b.Q4_0	0.6460	5 h 42 min	20.52 s	13B / 6.86 GB / 16 GB
wizardlm-13b-v1.2.Q4_0	0.8170	5 h 36 min	20.18 s	13B / 6.86 GB / 16 GB
orca-2-13b.Q4_0	0.8820	5 h 43 min	20.56 s	13B / 6.86 GB / 16 GB
orca-2-7b.Q4_0	0.5050	2 h 53 min	10.37 s	7B / 3.56 GB / 8 GB
mistral-7b-openorca.Q4_0	0.6060	3 h 1 min	10.88 s	7B / 3.83 GB / 8 GB
mistral-7b-instruct-v0.1.Q4_0	0.5520	3 h 5 min	11.10 s	7B / 3.83 GB / 8 GB
gpt4all-falcon-q4_0	0.1630	2 h 49 min	10.13 s	7B / 3.92 GB / 8 GB
mpt-7b-chat-merges-q4_0	0.1340	2 h 31 min	9.07 s	7B / 3.54 GB / 8 GB
orca-mini-3b-gguf2-q4_0	0.3990	1 h 27 min	5.25 s	3B / 1.84 GB / 4 GB
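For readers who want to reproduce an accuracy figure like those in the table, here is a minimal sketch of a zero-shot scoring loop. The prompt wording, the reply parsing, and the names `PROMPT`, `parse_label`, and `accuracy` are all assumptions, since the thread does not show the actual prompt or scoring code. `generate` is any callable that takes a prompt and returns the model's text, for example a thin wrapper around a gpt4all model's `generate` method.

```python
from typing import Callable, Iterable, Tuple

# Label ids as shown in the dataset examples: 0 negative, 1 neutral, 2 positive.
LABELS = {"negative": 0, "neutral": 1, "positive": 2}

# Hypothetical zero-shot prompt; the benchmark's real prompt is not given.
PROMPT = (
    "Classify the sentiment of this financial news sentence as "
    "positive, negative, or neutral. Answer with one word.\n"
    "Sentence: {sentence}\nSentiment:"
)


def parse_label(reply: str) -> int:
    """Return the id of the first label word in the reply, or -1 if none."""
    text = reply.lower()
    hits = [(text.find(word), idx) for word, idx in LABELS.items() if word in text]
    return min(hits)[1] if hits else -1


def accuracy(generate: Callable[[str], str],
             records: Iterable[Tuple[str, int]]) -> float:
    """Share of records whose parsed reply equals the gold label."""
    records = list(records)
    correct = sum(
        parse_label(generate(PROMPT.format(sentence=sentence))) == label
        for sentence, label in records
    )
    return correct / len(records)
```

In this sketch a reply with no recognizable label word is scored as wrong, so the parsing rule itself can move the accuracy number and is worth reporting alongside the results.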

kaidatavis · 2023-11-22T20:09:00Z

@erwin27, this is a very good start. Please keep filling the table as more results become available.

Did you have a chance to discuss distributing the work with Xiruo? Maybe she can test some of the models while you are testing the others?

Finally, it would be good to test the 'strength' of the sentiment in addition to positive, negative, and neutral. For example, there could be 'very positive' and 'a little positive', which would translate to sentiment scores of 1.8 and 1.2 if 2 is the most positive.
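A minimal sketch of how such a strength score could be computed, assuming a 0 to 2 scale with 1 as neutral. Only the two positive values (1.8 and 1.2) come from the comment above. The mirrored negative scores, the default for a missing strength, and the names `DIRECTION`, `INTENSITY`, and `sentiment_score` are assumptions for illustration.

```python
# Direction of each class relative to neutral (score 1.0).
DIRECTION = {"negative": -1, "neutral": 0, "positive": 1}

# Distance from neutral; 0.8 and 0.2 reproduce the 1.8 and 1.2 examples.
INTENSITY = {"very": 0.8, "a little": 0.2}


def sentiment_score(label, strength=None):
    """Map a label and an optional strength phrase to a score in [0, 2]."""
    direction = DIRECTION[label]
    if direction == 0:
        return 1.0  # neutral has no strength
    # Assumption: with no strength phrase, use the full distance (0.0 or 2.0).
    intensity = INTENSITY.get(strength, 1.0)
    return round(1.0 + direction * intensity, 2)
```

Under these assumptions `sentiment_score("positive", "very")` gives 1.8 and `sentiment_score("positive", "a little")` gives 1.2, matching the example, while the negative side mirrors them at 0.2 and 0.8.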

erwin27 · 2023-11-22T20:45:54Z

@kaidatavis, yes, Xiruo and I have already discussed this several times since Monday. Sure, we will continue with the rest of the models and may look for another finance-related dataset if possible. We will also try your recommendation on sentiment strength, as it would have a huge impact on the project if we implement it later. Thank you @kaidatavis

kaidatavis added the Trading label Nov 20, 2023

kaidatavis assigned kaidatavis, raulfrk, erwin27 and alyxs25 and unassigned kaidatavis Nov 20, 2023

kaidatavis added LLM 2023-2024 labels Nov 20, 2023

kaidatavis changed the title ~~LLM trading - sentiment analysis~~ LLM trading - sentiment analysis comparison Nov 21, 2023
