Social, Computer Scientists Want to Share Data On Group Behavior

Image credit: beachmobjellies. Shared under a Creative Commons license.

Computer and social scientists have collaborated to develop a large dataset on how group behavior and technology influence decision-making – and they want to share that data with other researchers.

“We feel that there are absolutely applications for this data that we haven’t thought of yet, and we look forward to seeing how other researchers make use of it,” says Joann Keyton, a professor of communication at NC State University and co-creator of the dataset, called the Collaborative Interaction Corpus (CIC).

The data stems from an experiment conducted by Keyton and Paul Jones, a computer science Ph.D. student at NC State, designed to evaluate group interaction and how it influences individual decision-making. The work was made possible by the Laboratory for Analytic Sciences (LAS), a research partnership between NC State and the National Security Agency.

In the experiment, which was conducted in 2014, 135 people were asked to select a third-party presidential candidate for the 2016 presidential election. Participants were given internet access and asked to write a draft report explaining how confident they were in their decision and why they made their choice. Participants were then assigned to groups of two or three people to discuss the task. After the discussion, participants were again allowed to use the internet and asked to write a final version of their reports.

The researchers collected data on all study participants, the discussion in each group, the online activity of each individual during the experiment, and the draft and final versions of their reports. Data being released include logs of study participants’ internet use, over 85,000 screenshots illustrating the material being viewed by study participants, and a video of the screenshots for each group member who independently searched the web for information related to the task. More data will be added to the collection as it undergoes review to ensure that it has been effectively anonymized.

“This experiment generated an enormous amount of data,” Keyton says. “That’s valuable because research on groups is difficult, and we didn’t have to rely on self-reporting from study participants about what resources they used or evaluated when reaching their conclusions – we can actually see what they were looking at online and what input they were getting from other members in their groups.”

“The data are useful for addressing various computer-science questions as well,” says Jones, who is also a researcher with LAS and co-creator of the CIC. “For example, it can be used to address questions related to predicting user behavior or machine learning.

“What we’re doing now is making this dataset publicly available to facilitate research by social scientists, computer scientists or others who can make use of it,” Jones says.

Researchers interested in accessing the dataset can learn more at

“The research community keeps calling for computer scientists and social scientists to work together, but it doesn’t happen very often,” Keyton says. “Well, we did it, and it worked well. Hopefully, other researchers can also benefit from this collaboration.”