Posted on January 1, 2017 at 12:00 PM
We have a dataset of 10,000 tweets from important stock brokers, and we want to identify the tweets that are relevant to the stock market.
Approach
A mixed approach that combines lexicon-based methods with machine-learning-based methods. SMOTE is used to address the class imbalance problem, and WordNet is used to grow the lexicons.
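To illustrate how SMOTE addresses class imbalance, here is a minimal sketch of its core idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified, standard-library-only illustration and not the project's actual pipeline or the imbalanced-learn implementation. The function name `smote_oversample`, its parameters, and the two-dimensional feature vectors are hypothetical.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified sketch of the SMOTE idea (hypothetical helper, not a library API).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors for the rare "relevant" class.
relevant = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_oversample(relevant, n_new=6)
print(len(new_points))
```

In practice the feature vectors would come from the tweets (for example, lexicon match counts or term weights), and oversampling would be applied to the training split only, so that the evaluation data stays untouched.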
Result
The F score increased from 0.69 to 0.98.
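For reference, the F score combines precision and recall into a single number. The sketch below assumes the reported value is the common F1 variant (equal weight on precision and recall), which the post does not state explicitly. The function name `f_score` and the counts in the example are hypothetical and are not the project's data.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 relevant tweets found, 2 false alarms, 2 missed.
example = f_score(tp=8, fp=2, fn=2)
print(round(example, 2))
```

Because the score depends on both false positives and false negatives, it is a more informative measure than plain accuracy when relevant tweets are rare.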
Current Status
We are in the process of submitting this work to a journal.
Next Step
Identify industry-specific and stock-specific actionable insights from the relevant stocks.
Team Members