This is part 1 of a series of experiments that I am publishing on an ongoing basis.
Also read part 2: a sneaky bug and big improvements!
Part 3: Top results with two classes!
In 2008, Jon Kågström and I created a free service called typealyzer.com that automatically classifies the Myers-Briggs type of blogs. Since the launch, the site has had close to half a million unique visitors. In August 2012 we gave visitors the option of telling us whether the automatic classification was correct by stating their own Myers-Briggs type. I recently downloaded that survey data, and it turned out that 27,959 responses had come in! I had skeptically told Jon that we would wait until we had received 2,000 responses, a number that seemed completely unrealistic when we started. Then the years went by and I more or less forgot about the file. Until now. Continue reading “Can you predict Myers-Briggs personality type from blog texts?”
Just as words have different meanings depending on context, the positive or negative charge of an expression depends on the belief system of the audience. The Swedish company Gavagai, and also Recorded Future, which was recently funded by Google and the CIA, have a simple and elegant solution to the problem of word meaning across different languages. Judging the sentiment of a text poses the same problem. When I started out as a media analyst (once upon a time), one of the things we were trained in was to define the customer's point of view and learn to see through the customer's eyes, or, to be more exact, through the eyes of their most important stakeholders. A famous example is news about downsizing at a company: positive news in the eyes of the majority of stockholders, but negative news in the eyes of the majority of employees. When doing the analysis manually, you normally need to choose only one of the interpretations for the sake of clarity and cost efficiency. When storing large amounts of data is no problem and you're using computers for the text analytics process, it's possible to get a fuller picture.
The thing is that beauty, of course, lies in the eye of the beholder: what makes one person thrilled makes another one angry. The underlying mechanism is the belief system. So, in order to report sentiment more flexibly across different audiences, you need to start by analyzing the belief systems of those audiences. With that data available, sentiment analysis would come closer to being useful for people interested in measuring, evaluating and predicting the spread of ideas. Which is kinda what everyone from well-funded counterterrorism intelligence analysts to individual power bloggers is really interested in, in this mediatized society.
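To make the idea concrete, here is a minimal sketch of audience-dependent sentiment scoring, using the downsizing example from above. Everything in it is an assumption made for illustration: the audience names, the tiny lexicons, the polarity values and the function `score_by_audience` are hypothetical, and this is not how Gavagai, Recorded Future or any other real product works.

```python
# Hypothetical audience lexicons: each maps a term to a polarity in [-1, 1].
# The values are invented for illustration, not measured from real data.
AUDIENCE_LEXICONS = {
    "stockholders": {"downsizing": 0.6, "layoffs": 0.4, "losses": -0.8},
    "employees": {"downsizing": -0.8, "layoffs": -0.9, "losses": -0.3},
}


def score_by_audience(text, lexicons=AUDIENCE_LEXICONS):
    """Return one average polarity score per audience for the same text."""
    # Very naive tokenization: split on whitespace, strip punctuation, lowercase.
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    scores = {}
    for audience, lexicon in lexicons.items():
        hits = [lexicon[t] for t in tokens if t in lexicon]
        # A text with no known terms is treated as neutral for that audience.
        scores[audience] = sum(hits) / len(hits) if hits else 0.0
    return scores


if __name__ == "__main__":
    print(score_by_audience("The company announced downsizing."))
```

The point of the sketch is only that the same sentence gets one score per audience instead of a single global score; a real system would have to learn each audience's belief system from data rather than hard-code it.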
When I started out with psychographic text analysis a few years ago, sentiment analysis was the craze. If I told anyone about the idea of analyzing someone's psychological traits online, most (action-oriented) people immediately asked me whether I could do sentiment analysis. Some people still do. The thing is that I'm tremendously more interested in the psychological state of a person, e.g. their mood, than in the opinions they express about something. Today, finally, psychological text analysis is starting to appear on the radar of a broader audience, primarily due to the hedge fund Derwent Capital's innovative approach to financial prediction using mood analysis of Twitter, and to sociologists researching the correlation between time of day and mood on Twitter. Continue reading “The future of social media profiling: now your mood, then your values and personality”
A tweet by Bobby Bear has been retweeted, and Bubba Bear has been wondering whether there is some pattern behind which tweets get retweeted. That leads them to psychologists and text analysts such as James W. Pennebaker and social media scientists such as Dan Zarrella. They even mention yours truly!
Continue reading “Video: Cute Philosopher Bears on retweets”