Ever wondered what we do to maintain data quality, how we deal with bots, or how we manage participant trustworthiness? Wonder no more! Ask us anything about how we do what we do, and our data-team will be on hand to answer!
If you can’t make the AMA tomorrow but you have questions, ‘reply’ to this thread and our data team will answer them when the event starts.
Rightyho, to kick us off!!
Welcome to the first Prolific DQ AMA on the new forum!
Obviously, data quality is a big deal for us at Prolific! Many of us are researchers ourselves, and have had the arduous experience of manually combing through survey data, trying to figure out who took the task seriously. Even with clear criteria it can still be disheartening. I’ve not actually run a study on MTurk, but I’ve heard rejection rates can sometimes hit 25%? That must be a considerable logistical challenge!
Anyway, at Prolific we want to reduce the need for manual data cleaning and rejections as much as possible, and we do this by ensuring the baseline quality of our participants is high. Our focus at the moment is on identifying (and often removing) fraudulent participants: participants lying about their demographics or location, participants running duplicate accounts, and participants using bots to assist in completing surveys. We have a three-layered approach to this (happy to go into any of these in more detail if you like):
- Technical checks
- Behavioural checks
- Researcher feedback loop
We use a mix of automated and manual processes to catch suspicious users as early as possible, and we’re constantly improving and refining our approach.
We’ve made two big steps forward on quality recently:
- Introducing ID checks for new participants
- Making it easier for researchers to report participants they’re not happy with but don’t want to reject.
The early evidence is that ID checks make it a lot easier to detect duplicate accounts.
And we’re hoping to tighten the researcher feedback loop with the improved reporting system: we want to calibrate our detection system against the best possible ground-truth.
Hey @Jim_Lumsden, this question about data quality came up during our beta test, and I think it would be great to include here: researchers have noted that data quality on Prolific is higher than on MTurk or panel providers such as Qualtrics - can you explain why you think this is?
Hey @Hannah_Kay, another question that was submitted in our beta: “For prescreeners (say: native language, bilingualism, etc.), do you rely on self-reported information or are you ‘validating’ this information (e.g. with some language proficiency test)?”
Oh man, I think there’s a few different elements to this. So…
- As far as we know, there’s not really any active pool management on MTurk. Their emphasis is very much on “open access”, and workers can take part from all around the world with little in the way of checks. This means it’s a bit of a free-for-all, with duplicate accounts, bots, and people using VPNs to appear as though they’re from somewhere they’re not (CR has a good paper on this - look at the Brinjal question). There’s also intense competition for HITs, and since you can accept multiple tasks at once, MTurkers often take part in several surveys/jobs simultaneously. (Pretty good paper on this here)
You can build lots of data-quality checks into your MTurk survey, which will help, but at SPSP last year we heard of rejection rates of ~25-50% on MTurk. As I said above, that sounds like a minefield.
- With Qualtrics, I’m fairly sure they don’t control their own panel. They use another panel aggregator like Cint or Dynata, which means there are at least three layers between the researcher and the respondent: Researcher → Qualtrics → Dynata → Smaller panel company → Respondent. This makes it hard to give feedback on the quality of participants (without a feedback loop, how are you going to detect bad participants?), and everyone’s working with a much smaller piece of the puzzle. We also know that smaller traditional panels (such as those used by Dynata) don’t offer good rewards (some even pay only in points or prizes, so it’s hard to control incentives), and we do know that incentive is strongly associated with quality of output in crowdsourced research.
I think a big factor in the higher data quality at Prolific is that we have full control over the pool, but also that the design of the product limits the incentives for malingerers. It’s hard to create duplicate accounts, you can’t take part in more than one survey (HIT) at a time, and we have a tight feedback/reporting loop so we can take action quickly.
I’ll dig out the links that I’m missing
At the moment, we are just using self-report for our prescreening data.
We do have some related checks including:
- Monitoring likely inconsistencies between country of birth and first language
- Monitoring the language used in the free text response of our demo study
As such, our “First language” screener is probably one of the most accurate prescreeners (and commonly used by our researchers).
We also have a team working to improve the prescreening process for both participants and researchers. This could make it much easier for us to introduce further validation checks in the future!
One more question from our beta-testers: “Do you know how many participants are actually ‘driving’ the study participation? I.e. do you have some kind of Gini coefficient that says ‘X% of the participants (users) are responsible for Y% of all study participations’?” @Hannah_Kay
This is something that we’re really interested in monitoring. We blogged a little on this last year - we found that the 5% most active participants complete 20% of the responses on Prolific (and this has stayed consistent; it’s at the same level right now).
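For anyone curious how that kind of figure is derived, here’s a minimal sketch (made-up submission counts, not Prolific’s actual data or code) of computing the top-share statistic and a Gini coefficient from per-participant submission counts:

```python
def top_share(counts, top_frac=0.05):
    """Fraction of all submissions contributed by the most active top_frac of participants."""
    ordered = sorted(counts, reverse=True)
    k = max(1, int(len(ordered) * top_frac))
    return sum(ordered[:k]) / sum(ordered)

def gini(counts):
    """Gini coefficient of participation counts (0 = perfectly equal, 1 = maximally concentrated)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    # Standard formula: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with i starting at 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n

# Hypothetical pool: 95 light participants and 5 heavy ones
counts = [4] * 95 + [19] * 5
print(top_share(counts))  # top 5% contribute 20% of submissions here
```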
We actively want to discourage the idea of ‘super participants’ - we use an adaptive rate limiting tool that gives priority access to participants who’ve spent less time taking studies recently.
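The rate limiter itself isn’t public, but the idea of prioritising less-recently-active participants can be sketched like this (entirely hypothetical names and logic; `recent_study_minutes` and `allocate_places` are illustrative, not Prolific’s implementation):

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Candidate:
    # Ordering compares only recent activity, so less-active participants sort first
    recent_study_minutes: float
    participant_id: str = field(compare=False)

def allocate_places(candidates, places):
    """Offer the available study places to the participants with the
    least recent study activity first (hypothetical priority rule)."""
    return [c.participant_id for c in heapq.nsmallest(places, candidates)]

pool = [Candidate(120.0, "a"), Candidate(5.0, "b"), Candidate(60.0, "c")]
print(allocate_places(pool, 2))  # the two least-active participants get priority
```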
I’m interested in running an international inter-ethnic study recruiting people from India, China, and Africa, as well as the UK. I noted that in your demographics you list the countries where participants are based (UK, USA, Poland, etc.), but I didn’t see India or China - I wondered if there were some in the Other category? If that is doable, would it be possible to get a quote that I can use in the draft of my grant?
Hi Trina, unfortunately our service is only open to participants who live in OECD countries. Since China and India are not in the OECD we’re not able to offer participants from there. Sorry I don’t have better news!