As countless political orators have demonstrated, it’s not just what you say, it’s how you say it. Using automated text analysis, Cornell researchers have identified an array of features that can make a message more likely to get attention.
They tested their ideas on Twitter, where their computer algorithm predicted more accurately than human observers which version of a tweet would be retweeted more. The results might be applied to longer forms of discourse, from writing essays to getting your idea accepted in a committee meeting. “We’re looking at persuasion everywhere,” said Lillian Lee, professor of computer science and information science.
Twitter enabled the researchers to conduct a controlled experiment to eliminate the effects of the popularity of the poster or the topic: Many posters will tweet on the same topic more than once, with different wording. The researchers collected and compared thousands of these pairs, and after taking into account the effect of repetition, the experiment showed that wording still matters.
Lee, graduate student Chenhao Tan and Google researcher Bo Pang, Ph.D. ’06, reported their results in the June issue of the Proceedings of the Association for Computational Linguistics. Lee also will describe the work as one of several examples of computer language analysis during a symposium, “The Linguistics of Status, Influence, and Innovation: A Computational Perspective,” Feb. 15 at the annual meeting of the American Association for the Advancement of Science in San Jose, Calif.
On Twitter, many posts are links to websites the poster thinks are important. “You want to say something about it to make people look at it,” Lee explained. Previous social science research has identified several features of an argument that might make it more effective.
The Cornell researchers tested each one separately, then combined them into their algorithm.
The computer looks for the occurrence of certain keywords and compares “bigrams” – combinations of two words. Such combinations may reflect a linguistic style. “Cornell,” for example, might often be followed by “Chronicle” within the campus community, but on Twitter you might find “Cornell research,” “Cornell students,” “Cornell ornithologists.”
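To make the idea of bigrams concrete, here is a minimal Python sketch that splits messages into adjacent word pairs and counts them. It is an illustration only, not the researchers' code: the function names `bigrams` and `bigram_counts`, the simple tokenizer and the sample messages are all assumptions made for this example.

```python
import re
from collections import Counter


def bigrams(text):
    """Return the adjacent word pairs in a message, lowercased.

    The tokenizer is an assumed simplification: it keeps runs of
    letters, digits and apostrophes and ignores everything else.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    return list(zip(words, words[1:]))


def bigram_counts(messages):
    """Count how often each bigram occurs across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(bigrams(message))
    return counts


# Hypothetical sample messages, invented for illustration.
sample = [
    "Cornell research shows wording matters",
    "Cornell students test new Cornell research",
]
counts = bigram_counts(sample)
```

Comparing counts like these between a community's usual messages and a new message gives one rough signal of whether the new message matches the community's style.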
The researchers found these features most likely to generate retweets:
- Ask people to share. Words like “please,” “pls,” “plz” and, of course, “retweet” were common in successful messages.
- Be informative (often measured by length).
- Use the language of the community, and be consistent with the language you usually use yourself, with which your followers are familiar. The researchers are also testing on Reddit, where users form distinct communities.
- Imitate the style of newspaper headlines. (In their tests, the researchers used the New York Times as a model.)
- Use words that appear often in other retweeted messages.
- Use words that express positive or negative sentiment.
- Refer to other people, not just yourself. Use third-person pronouns.
- Use generalizations. Statements that can be applied to a variety of situations are the most often repeated.
- Make it easy to read. The researchers applied a formula used to measure the grade level of a text.
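The article does not say which grade-level formula the researchers applied. As a hedged illustration of how such a readability measure works, the Python sketch below computes the widely used Flesch-Kincaid grade level, chosen here purely as an assumption. The syllable counter is a crude vowel-group heuristic, and the function names `count_syllables` and `fk_grade` are hypothetical.

```python
import re


def count_syllables(word):
    """Estimate syllables by counting runs of vowels (rough heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

    Assumption: this formula stands in for the unnamed one in the article.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Hypothetical examples: short, plain wording scores lower (easier).
easy = fk_grade("The cat sat.")
hard = fk_grade("Computational linguistics researchers investigated persuasion.")
```

Lower scores indicate text that is easier to read, so a message built from short sentences and short words scores below one that relies on long, multisyllable terms.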
“We would love to capture amusingness or cleverness, but we haven’t found a way to do that yet,” Lee added.
Meanwhile, the researchers have created a website where you can see what the algorithm thinks of your own tweets. The researchers conclude with a challenge: Social scientists should try to find out why these tactics work. The research was supported by the National Science Foundation and Google.