Data collection through social media fraught with inbuilt biases

A new study has found that social media is not a good source for collecting data.

Although a social network is often believed to offer a fair sample for collecting data sets, the researchers claim that no social network accurately represents any specific community.

Yet scientists have been using data collected from social media and publishing papers based on their findings. This means that what they took to be accurate, publishable estimates may in fact reflect specific biases arising from incomplete information.

McGill University School of Computer Science assistant professor Derek Ruths explains: "Many of these papers are used to inform and justify decisions and investments among the public and in industry and government."

In the study, Ruths was joined by Carnegie Mellon Institute for Software Research staffer Jurgen Pfeffer. The study was published in the Nov 28 issue of the journal Science. In the paper, the scientists address several issues involved in using social media to collect data sets.

"The common thread in all these issues is the need for researchers to be more acutely aware of what they're actually analyzing when working with social media data," said the researchers.

Ruths remarked that social scientists have refined their techniques and standards to tackle complications of this kind before. For example, "The infamous 'Dewey Defeats Truman' headline of 1948 stemmed from telephone surveys that under-sampled Truman supporters in the general population," he said. Rather than permanently discrediting polling, that error led to higher standards and more accurate polls. "Now, we're poised at a similar technological inflection point. By tackling the issues we face, we'll be able to realize the tremendous potential for good promised by social media-based research."