BEYOND NETWORKING: Twitter analysis reveals global human moodiness

Cornell University social scientists use Twitter and Hadoop to study human behavior

Twitter, Facebook and other social media sites are often criticized for encouraging people to share thoughts of little consequence. Social scientists, however, are finding that these electronic missives, when assembled en masse and analyzed with big data tools, can offer a wealth of new information about how people think and act.

A pair of researchers from Cornell University are the latest to mine social networks for such academic insight. Scott Golder and Michael Macy analyzed 509 million Twitter messages posted over a period of two years by 2.4 million users across 84 countries. From this data, they have gleaned that people have the same daily cycle of moods, regardless of their culture or language.

A paper summarizing the work, "Diurnal and Seasonal Mood Vary with Work, Sleep, and Daylength Across Diverse Cultures," is in Thursday's issue of the journal Science.

Beyond the immediate results, the work points to a possible new path in academic research: using the data people generate on social networks as raw material for social science, the researchers said.

"The paper illustrates the opportunities for doing social and behavioral science in a new way," Macy said. "The growing tendency for human beings all over the globe to interact with one another using digital devices opens opportunities for research that were unimaginable even five years ago."

"Far from seeing conversations as mundane and useless, we see value in the fact that they are real-time, time-stamped messages produced by people on the spot for sharing with their friends," Golder added.

The researchers used Twitter's API (application programming interface) to gather the messages. They set up a six-node cluster to extract the data, which arrived packaged in XML, and converted the results into flat files. They then used a 55-node Hadoop cluster, running at the Cornell University Center for Advanced Computing, to analyze the dataset.
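The XML-to-flat-file step might look something like the following minimal sketch. The article does not describe the actual schema or file layout, so the element names (`status`, `user_id`, `created_at`, `text`), the sample messages and the tab-separated output are all assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the article does not describe the real schema.
SAMPLE_XML = """<statuses>
  <status>
    <created_at>2010-03-01T08:15:00</created_at>
    <user_id>42</user_id>
    <text>What an awesome morning</text>
  </status>
  <status>
    <created_at>2010-03-01T17:40:00</created_at>
    <user_id>43</user_id>
    <text>Long day,
    full of fury</text>
  </status>
</statuses>"""


def xml_to_rows(xml_text):
    """Turn each <status> element into a (user_id, created_at, text) tuple."""
    root = ET.fromstring(xml_text)
    rows = []
    for status in root.iter("status"):
        text = status.findtext("text", default="")
        # Collapse runs of whitespace (including newlines and tabs) so that
        # every message fits on a single line of the flat file.
        clean = " ".join(text.split())
        rows.append((
            status.findtext("user_id", default=""),
            status.findtext("created_at", default=""),
            clean,
        ))
    return rows


def write_flat_file(rows, handle):
    """Write rows as tab-separated lines, one message per line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


rows = xml_to_rows(SAMPLE_XML)
buffer = io.StringIO()
write_flat_file(rows, buffer)
print(buffer.getvalue())
```

One record per line is a common choice for this kind of conversion because Hadoop jobs typically split their input on line boundaries.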

The analysis tool they used, Linguistic Inquiry and Word Count, links specific words to various positive and negative moods. Messages that might include words such as "awesome," "fantastic" or "pretty" could indicate a positive state, whereas words like "remorse," "abandonment," "fear" or "fury" indicate a negative state of some sort.
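To illustrate the general idea of dictionary-based word counting, here is a minimal sketch. It is not Linguistic Inquiry and Word Count itself: the two tiny word lists contain only the examples quoted above, and the function name `affect_scores` and the choice to score a message as the fraction of its words found in each list are assumptions made for this example.

```python
import re

# Toy lexicon built only from the words quoted in the article.
# The real dictionary is far larger; this is an illustrative stand-in.
POSITIVE = {"awesome", "fantastic", "pretty"}
NEGATIVE = {"remorse", "abandonment", "fear", "fury"}


def affect_scores(message):
    """Return (positive, negative) as fractions of the words in the message."""
    words = re.findall(r"[a-z']+", message.lower())
    if not words:
        return 0.0, 0.0
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    return pos / len(words), neg / len(words)


print(affect_scores("What an awesome, fantastic day"))
print(affect_scores("Fear and fury"))
```

Dividing by message length keeps a long message from scoring higher than a short one simply because it contains more words.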

The results showed people tend to be more chirpy in the morning and during weekends. The messages revealed that they wake up happy and slowly grow more disgruntled and sour as the day goes on, though their affect usually rebounds in the evening. This behavior happens on both weekdays and weekends, though the weekend tweets usually start about two hours later than the weekday ones, perhaps because people are sleeping in.
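A daily mood curve of this kind can be produced by averaging per-message scores within each hour of the day, separately for weekdays and weekends. The sketch below assumes the timestamps are already in each user's local time and uses made-up scores; the function name `mean_affect_by_hour` and the `weekend_days` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from datetime import datetime


def mean_affect_by_hour(records, weekend_days=(5, 6)):
    """Average scores per (day type, hour of day).

    records: iterable of (ISO local timestamp, score) pairs.
    weekend_days: weekday numbers treated as weekend (Monday is 0),
    Saturday and Sunday by default.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for stamp, score in records:
        when = datetime.fromisoformat(stamp)
        kind = "weekend" if when.weekday() in weekend_days else "weekday"
        bucket = sums[(kind, when.hour)]
        bucket[0] += score
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Made-up example data: (local timestamp, positive-affect score).
records = [
    ("2010-03-01T08:15:00", 0.20),  # a Monday
    ("2010-03-01T08:45:00", 0.10),
    ("2010-03-01T15:00:00", 0.05),
    ("2010-03-06T10:30:00", 0.30),  # a Saturday
]

for key, mean in sorted(mean_affect_by_hour(records).items()):
    print(key, round(mean, 3))
```

Passing a different `weekend_days` tuple, for example `(4, 5)` for a Friday and Saturday weekend, lets the same function handle countries whose work week does not run Monday through Friday.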

Even in countries where the weekend is not Saturday and Sunday (in the United Arab Emirates, for instance, people work Sunday through Thursday), these patterns were evident.

While the findings may seem obvious, the Cornell work is actually the first full-scale study to show this behavior in empirical form, the researchers contend. Twitter proved valuable in this study because it captures the affect of the individual in real time, Golder said. Typically, clinical studies are done either by bringing subjects to a lab and watching their behaviors -- an unnatural environment for studying day-to-day activities -- or by surveying them, an approach limited by the fallibility of the subjects' memories.

Also, some subjects "are just not very good at being aware of what their feelings are," Golder said. "It's a big advantage to access people's words in a setting that is natural and spontaneous."

The work, funded by the U.S. National Science Foundation, was done under a research group led by Macy, which combines sociologists and computer scientists to pursue computer-assisted social science research projects. Golder's background is in linguistics and computer science. Prior to joining Cornell, he worked as a research scientist for Hewlett-Packard.

The project "required some engineering know-how, and that will be something that will have to be more and more important for empiric social sciences," he said.

Other parties have also been investigating this new technique of studying human activity through quantitative analysis of written output, a practice some scholars call culturomics. In 2010, Google Labs launched a text analysis tool that allows researchers to run numerical text analysis against Google's massive store of digitized books, which dates back centuries.

This week, Google incorporated the tool, called Ngram Viewer, directly into its Google Books service.

Also this week, Twitter released the source code for its Storm stream processing engine, data analysis software that should help researchers and other users analyze multiple Twitter feeds as they are updated.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
