How to Avoid Potentially Duplicated Participants?

I have some suspicious participants who seem to have made multiple accounts and completed a study simultaneously.

Specifically, I found 5 participants who joined my study within 3 minutes of each other and completed it at almost the same time. I wouldn’t normally find this coincidence suspicious, but these participants spent more than 1 hour on a survey that takes less than 30 minutes. So I sent them a message asking whether they had run into any trouble during the study, and I got very similar replies from all 5 people (“Hi! I am sincerely sorry because I had to spend extra time to complete the survey as my net speed was slow. Thanks”). Even the delivery times of the messages were almost the same. The participants’ self-reported ages are quite similar, and their open-ended answers are similar too. This made me wonder whether they are all the same person.

Also, could you give me advice on how to avoid this kind of situation in the future? I am new to Prolific and do not want to run into similar problems that would hurt my data quality.

I had a similar issue recently too. I noticed it because some free-response answers were similar to each other yet outliers compared with the rest. Specifically, in answer to “Who is a person currently living that you respect/admire?”, they (or she, or he) responded “my cat” and “my dog”; no other respondents had given a non-human answer. I also noticed that when I sorted the respondents by Prolific ID, these responses grouped together, suggesting perhaps that the accounts were opened at the same time. Are the suspect IDs similar in your case too?

All the same, it is only a hunch, and it could well be a coincidence. Prolific are using checks of IP addresses, and perhaps other things, to attempt to prevent multiple submissions. It seems to me that a VPN subscription could allow unscrupulous people to select multiple different IPs. I suspect that Prolific are doing all they can.

I don’t know how to prevent it.

One idea, if you are using Google Forms, would be to use the “Limit to 1 response” setting. This requires that people log in with separate Google accounts. If they have gone to the trouble of creating separate Prolific accounts, they may have made separate Google accounts too, but this is getting more difficult since Google requires that each account have a separate phone number.

"If you want to create a new Gmail account, Google may ask you for a phone number verification. This was optional in the past, but recently Google has made it mandatory."

But no, there is a way of getting around this, e.g. by saying that one is below the age at which one would have a phone.

I may include free response questions in all future surveys to check for response similarity.
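A minimal sketch of what such a similarity check could look like, using only Python's standard library. The participant IDs, the example answers, and the 0.8 threshold are all made up for illustration; a high score is only a prompt to look more closely, not proof that two accounts belong to the same person.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similar_pairs(responses, threshold=0.8):
    """Return (id_a, id_b, ratio) for every pair of answers that look alike.

    `responses` maps a participant ID to that participant's free-text answer.
    `threshold` is an arbitrary cut-off chosen for this sketch, not a validated value.
    """
    flagged = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(responses.items()), 2):
        ratio = SequenceMatcher(None, normalize(text_a), normalize(text_b)).ratio()
        if ratio >= threshold:
            flagged.append((id_a, id_b, round(ratio, 2)))
    return flagged


# Hypothetical answers, for illustration only.
answers = {
    "p01": "My net speed was slow so I needed extra time. Thanks",
    "p02": "my net speed was slow so i needed extra time, thanks",
    "p03": "I admire my grandmother because she raised five children.",
}
flagged = similar_pairs(answers)
```

Every pair is compared, so this is only practical for a few hundred responses per question; for very short answers such as “my cat”, an exact-match count would probably be more informative than a similarity ratio.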

Another thing I noticed was poor English in supposedly native English speakers (a superhero they/he/she liked was “Captain American”), which might covary with greater economic need and motivation to create multiple accounts. Or is that prejudice? I am not sure.

The only other things you might do are

  1. To ask Support to look into it. They might check whether the accounts have been used on the same studies in the past. This should be the giveaway: since there is usually such a rush to join studies before they fill up, if the same 5 accounts were repeatedly and simultaneously successful, they would need to be the same person, or people in a very close relationship with each other, which may conflict with their supposedly separate IP addresses and corresponding geographic locations (mine were in the US and Canada).

  2. To dump the suspect data. Compared to many or all other sources of data, Prolific is imho still inexpensive even if we do have to trash some responses.

I am sorry I can’t be of more help. If anyone has suggestions, I would be interested to hear them.


Hi @Ashley_Lee & @timtak ,
I have noticed some patterns as well, though I haven’t dug through enough to really evaluate whether people have multiple accounts. My attitude is that this is online research, and it will therefore be prone to a bit of sampling error, regardless of how many safeguards are put in place. There will probably be people who spend a lot of time trying to game the system and get multiple accounts. Major problems might arise if this becomes automated and there are many, many accounts completing studies algorithmically, which I haven’t seen signs of yet. So, if you notice some weird patterns that make you think it’s the same person, maybe just exclude the data from analyses.
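As one example of a pattern to look for, here is a rough sketch that flags groups of participants who started within a few minutes of each other, like the case described at the top of the thread. The IDs, timestamps, 3-minute window, and minimum group size are assumptions for illustration; on a busy study many unrelated people will start close together, so a flagged group only means "look at these rows more carefully".

```python
from datetime import datetime, timedelta


def start_time_clusters(starts, window=timedelta(minutes=3), min_size=3):
    """Group participants whose start times fall within `window` of the group's first start.

    `starts` maps a participant ID to a datetime. Only groups with at least
    `min_size` members are returned. Both defaults are arbitrary for this sketch.
    """
    ordered = sorted(starts.items(), key=lambda item: item[1])
    clusters, current = [], []
    for pid, started in ordered:
        # Close the current group once this start is too far from the group's first start.
        if current and started - current[0][1] > window:
            if len(current) >= min_size:
                clusters.append([p for p, _ in current])
            current = []
        current.append((pid, started))
    if len(current) >= min_size:
        clusters.append([p for p, _ in current])
    return clusters


# Hypothetical start times, for illustration only.
starts = {
    "p01": datetime(2021, 5, 1, 10, 0),
    "p02": datetime(2021, 5, 1, 10, 1),
    "p03": datetime(2021, 5, 1, 10, 2),
    "p04": datetime(2021, 5, 1, 10, 40),
    "p05": datetime(2021, 5, 1, 11, 15),
}
clusters = start_time_clusters(starts)
```

A flagged group becomes more interesting when it coincides with other signals, such as unusually long completion times or near-identical open-ended answers.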

I’ve also noticed that people who are ostensibly proficient in English have less-than-optimal English language use (both via message and in some survey responses). Since language proficiency is self-reported, again, there could be some error here. We had a couple of folks ostensibly completing from a non-US country for a US-representative sample as well.

Maybe Prolific will be able to improve the system over time, but some people will always be motivated to get around the system. For example, there could be some minimal language proficiency tests, but there would probably be algorithms capable of passing the tests.

Basically, my advice would be to plan for the situation. Increase your sample size and be on the lookout for strange patterns. Plan for a little bit of sampling error and know the limitations of online research.
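To make "increase your sample size" concrete, here is a small sketch of the arithmetic. The 10% exclusion rate and the target of 200 are placeholder assumptions, not figures from Prolific or from this thread; substitute whatever rate you have seen in your own pilot data.

```python
import math


def participants_to_recruit(target_n, expected_exclusion_rate):
    """Number to recruit so that about `target_n` responses remain after exclusions."""
    if not 0 <= expected_exclusion_rate < 1:
        raise ValueError("expected_exclusion_rate must be in [0, 1)")
    return math.ceil(target_n / (1 - expected_exclusion_rate))


# Assumed values: 200 usable responses wanted, 10% expected to be excluded.
recruit = participants_to_recruit(200, 0.10)  # 200 / 0.9 = 222.2..., rounded up to 223
```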


@timtak @paul

Thank you for your replies! I’ll take your advice and pay more attention to the survey results. Also, I hadn’t thought about the English proficiency issue, and it is good to keep that in mind too. Thanks!

PS. What is the general rule on Prolific for paying participants whose data is low-quality? Should researchers pay participants for completing the study regardless of the quality of their responses?

Thanks again!

Hi @Ashley_Lee

It is very difficult to tell what a participant’s poor-quality data is due to, so yes, I think that one is recommended or required to pay even for poor-quality data so long as the participants do not fail “fair attention checks.”

I include “fair attention checks” (as defined above) and less easy to spot attention checks, such as “I help my colleagues by providing them with snot filled documentation”, and I pay for, but do not use, data that affirms a couple of the latter type of questions. Participants sometimes think it is a misprint and interpret it in other ways.
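A sketch of how that rule could be applied to the data. It assumes the nonsense items are rated on a 1 to 5 agreement scale where 4 and 5 count as affirming, and it reads "a couple" as two or more; both are assumptions for this example. The result only decides whether a response enters the analysis and has no bearing on payment.

```python
def classify(bogus_item_ratings, agree_from=4, max_affirmed=1):
    """Decide whether one participant's response is kept for analysis.

    `bogus_item_ratings` holds that participant's ratings of the nonsense items.
    One affirmation is tolerated because a participant may read the item as a misprint.
    """
    affirmed = sum(1 for rating in bogus_item_ratings if rating >= agree_from)
    return "exclude_from_analysis" if affirmed > max_affirmed else "keep"


# Hypothetical ratings for three nonsense items.
careless = classify([5, 4, 1])  # two items affirmed
careful = classify([5, 1, 1])   # one item affirmed, possibly read as a misprint
```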

Tim


@timtak Thanks for sharing the link! I will add similar attention checks. :slightly_smiling_face:
