SD1117 - duplicate records

Susan

on September 16, 2013

For TR, the check needs to take TRLNKID into account before deciding that a record is a duplicate.

Sergiy

on September 17, 2013

Hi Susan,

Thank you for your comments!

While the SD1117 check is quite useful, it is not perfect at avoiding false-positive messages. As of today, the check is "hard-coded" to use pre-specified variables. We can continue adding new variables that define the uniqueness of records, e.g., --LINKID, --EVAL, etc. A better solution would be to enhance the check to be domain-specific. There are several options for doing this:

1) Specify key variables for each standard domain in the validation configurations. However, some study-specific variability would still be expected.
2) Utilize the define.xml file, where domain key variables are specified by the owner of a study. Some additional checks may be needed to ensure that the data collection process itself does not deviate from standards.
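
As a rough illustration of the per-domain configuration idea, here is a minimal Python sketch. It is not how SD1117 is actually implemented. The `DOMAIN_KEYS` table, the key variables listed in it, the `find_duplicates` function and the sample TR records are all assumptions made up for this example; a real study would take its keys from the validation configuration or from define.xml.

```python
from collections import defaultdict

# Hypothetical per-domain key variables, for illustration only.
# These are NOT the variables the real SD1117 check uses.
DOMAIN_KEYS = {
    "TR": ["USUBJID", "TRLNKID", "TRTESTCD", "TREVAL", "VISITNUM", "TRDTC"],
    "LB": ["USUBJID", "LBTESTCD", "LBSPEC", "VISITNUM", "LBDTC"],
}


def find_duplicates(domain, records, domain_keys=DOMAIN_KEYS):
    """Return groups of row indexes that share the same key values."""
    keys = domain_keys[domain]
    groups = defaultdict(list)
    for index, record in enumerate(records):
        # Variables missing from a record are treated as empty strings.
        key = tuple(record.get(name, "") for name in keys)
        groups[key].append(index)
    return [rows for rows in groups.values() if len(rows) > 1]


# Made-up TR records: rows 1 and 2 agree on every key variable,
# row 0 differs only in TRLNKID.
tr_records = [
    {"USUBJID": "001", "TRLNKID": "T01", "TRTESTCD": "LDIAM", "VISITNUM": 1, "TRDTC": "2013-01-10"},
    {"USUBJID": "001", "TRLNKID": "T02", "TRTESTCD": "LDIAM", "VISITNUM": 1, "TRDTC": "2013-01-10"},
    {"USUBJID": "001", "TRLNKID": "T02", "TRTESTCD": "LDIAM", "VISITNUM": 1, "TRDTC": "2013-01-10"},
]

print(find_duplicates("TR", tr_records))  # [[1, 2]]
```

If TRLNKID were dropped from the TR key list, all three rows would be reported as one duplicate group, which is the kind of false positive described at the start of this thread.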

Kind Regards,

Sergiy

Susan

on September 17, 2013

It would be great if the comment went away when I included the define in the validation, since the define clearly indicates that the LNKID is part of the key (although I suspect that the v3.1.4 IG will also indicate this for both TU and TR). However, it still shows up.

Nitin

on October 8, 2013

Hi Sergiy, at least the addition of result variables would help a lot for the time being:

1) TERM/DECOD in EVENTS

2) TRT/CMCLAS in INTERVENTION

3) ORRES/ORRESU in FINDINGS

Sergiy

on October 8, 2013

Hi Nitin,

"Duplicates" checks should be domain-specific or even study-specific. This is a good example of a metadata-driven check.

There is no good universal or rigid algorithm which would use only some pre-specified variables for all domains.

Regards,

Sergiy

Nitin

on October 9, 2013

Hi Sergiy, as long as any one variable (except --SEQ) is different, the records are not really duplicates. Maybe reword the check to refer to duplicate 'key' variables (it would still be study- and domain-specific, but could use the define to automate this), or change the algorithm to check all variables (except --SEQ) in the dataset, which would make it universal.
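
The "all variables except --SEQ" variant could look roughly like the Python sketch below. The function name `find_exact_duplicates` and the sample LB records are made up for illustration, and the sketch assumes the sequence variable is always named as the domain code followed by `SEQ`.

```python
from collections import defaultdict


def find_exact_duplicates(domain, records):
    """Group row indexes whose values match on every variable except --SEQ."""
    seq_variable = domain + "SEQ"  # assumption: e.g. "LB" -> "LBSEQ"
    groups = defaultdict(list)
    for index, record in enumerate(records):
        # Sort the (name, value) pairs so column order does not matter.
        signature = tuple(sorted(
            (name, str(value))
            for name, value in record.items()
            if name != seq_variable
        ))
        groups[signature].append(index)
    return [rows for rows in groups.values() if len(rows) > 1]


# Made-up LB records: rows 0 and 1 differ only in LBSEQ,
# row 2 has a different result.
lb_records = [
    {"USUBJID": "001", "LBSEQ": 1, "LBTESTCD": "ALT", "LBORRES": "32", "LBORRESU": "U/L"},
    {"USUBJID": "001", "LBSEQ": 2, "LBTESTCD": "ALT", "LBORRES": "32", "LBORRESU": "U/L"},
    {"USUBJID": "001", "LBSEQ": 3, "LBTESTCD": "ALT", "LBORRES": "35", "LBORRESU": "U/L"},
]

print(find_exact_duplicates("LB", lb_records))  # [[0, 1]]
```

This version needs no per-domain configuration, but it only reports records that are identical in every other variable.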

Thanks.

Sergiy

on October 10, 2013

Hi Nitin!

I've seen many cases of real duplicates, e.g., LB results entered twice, AEs collected as on-visit assessments with the same start dates, duplicate "Not Done" records, etc.

Sometimes this check produces false-positive messages, but they help reveal that additional, not well-documented variables were used to capture important info.

Regards,

Sergiy
