Forums: Validation Rule Suggestions
Hi Amelia,
According to the SDTM IG, those variables are subject to the SDTM Controlled Terminology codelist C66742 (NY), “No Yes Response”. The codelist is not extensible and includes the value “NA”.
In general, a variable length check should read something like: “The length of a variable that is subject to a non-extensible CDISC Controlled Terminology codelist must equal the maximum value length in that codelist”.
There are two separate goals of such rules:
1. Avoid unnecessary artificial expansion of dataset size due to unused long variable lengths.
The use of an “UP TO MAXIMUM” length, rather than an “EXACT length”, is completely valid here and is even more efficient.
2. Keep consistency in lengths of similar variables across datasets and studies.
This is needed to avoid potential programming errors when merging data (e.g., in SAS). It can be achieved with the “EXACT length” approach.
Regards,
Sergiy Sirichenko
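The two length policies described in the post above can be sketched roughly as follows. This is only an illustration in Python, not the validator's actual implementation: the function names and the `mode` parameter are hypothetical, and the codelist values are the ones quoted elsewhere in this thread (Y, N, U, NA).

```python
# Values of the CDISC CT (NY) codelist as quoted in this thread.
NY_CODELIST = ["Y", "N", "U", "NA"]


def max_codelist_length(codelist):
    """Length of the longest term in a non-extensible codelist."""
    return max(len(term) for term in codelist)


def check_variable_length(declared_length, codelist, mode="exact"):
    """Return True if the declared variable length passes the check.

    mode="exact":     the length must equal the longest codelist term
                      (goal 2: consistent lengths across datasets).
    mode="up_to_max": any length from 1 up to the longest codelist term
                      is accepted (goal 1: no wasted space).
    """
    limit = max_codelist_length(codelist)
    if mode == "exact":
        return declared_length == limit
    if mode == "up_to_max":
        return 1 <= declared_length <= limit
    raise ValueError(f"unknown mode: {mode!r}")
```

Under the "exact" policy a flag variable declared with length 1 fails, while under the "up to maximum" policy both 1 and 2 pass and only a length above 2 fails.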
Hi Sergiy,
Thank you for your fast reply.
From your answer I see that I was confused.
I thought the rule expected these SEND variables to contain 2 characters instead of 1 (so that "Y" would be considered invalid). But the rule actually expects the SEND variable to always have 2 characters allocated internally in the .xpt file, even if it contains only one (like "Y" or "N").
In the same way, we expect --TESTCD to always be allocated exactly 8 characters, --TEST exactly 40, etc.
I apologize for the confusion and I thank you again for your help.
Best Regards,
Amelia
Yes, it's not about the length of the variable's value, but about the length of the SAS variable itself.
The variable value can have 0, 1 or 2 characters. In most cases the actual collected values are "Y" or "" (empty).
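To make the distinction concrete, here is a small sketch that keeps the allocated length of a character variable separate from the lengths of the values stored in it. The `CharVariable` class and helper names are hypothetical, and the sample values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharVariable:
    name: str
    length: int        # allocated (declared) length of the variable
    values: List[str]  # values actually stored in the dataset


def longest_value_length(var):
    """Length of the longest stored value (0 if all values are empty)."""
    return max((len(v) for v in var.values), default=0)


def values_fit(var):
    """True if every stored value fits into the allocated length."""
    return longest_value_length(var) <= var.length


# A flag variable allocated with 2 characters but holding only "Y" or "".
bwblfl = CharVariable(name="BWBLFL", length=2, values=["Y", "", "Y"])
```

Here `bwblfl.length` is 2 while `longest_value_length(bwblfl)` is 1: a length rule of the kind discussed above looks at the first number, not the second.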
Hi there,
Was this rule changed back to the previous version? I just downloaded the tool from http://www.opencdisc.org/downloads/opencdisc-validator-1.3-bin.zip and I get:
"SD1054 Non-recommended variable length The length of flags variables (----FL, --FAST, --OCCUR, --PRESP) is expected to be 1"
I thought it was changed to 2 - see the discussion above.
Best regards,
Amelia
Hi Amelia,
At this moment the SD1054 check has not been changed. The variable length is still expected to be 1 character. However, it is still under discussion whether we should change the expected length to 2 characters or get rid of this check completely.
According to the SDTM IG specs, the --FL, --FAST, --OCCUR and --PRESP variables are subject to the CDISC CT (NY) codelist, which includes the values Y, N, U and NA. Therefore the maximum length of (NY) codelist values is 2 characters.
However, the SDTM IG also specifies the expected values for these variables in the CDISC Notes. E.g., on page 76 about CMPRESP: "Used to indicate whether (Y/null) information about the use of a specific medication was solicited on the CRF." As a result, the only permissible values for these variables are Y, N and null, which have a maximum length of 1 character. E.g., the NA value is not expected to be used there according to the SDTM IG.
It's quite confusing. I am not sure there is significant value in enforcing such business rules. We have also developed another type of variable length check, to be introduced in the next release. Those checks will be more generic and will work based on unused space in variables. They are more flexible and adapt to a particular user's implementation and real study data. Should we replace all current checks that use a pre-specified variable length with the new checks?
Thank you,
Sergiy
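The post above does not describe how the planned generic checks work, so the following is only a guess at the idea of a check "based on unused space": compare each variable's allocated length with the longest value actually present in the data. The function names, the `tolerance` parameter, and the sample data are all hypothetical.

```python
def unused_space(declared_length, values):
    """Characters allocated to a variable but not used by any value."""
    longest = max((len(v) for v in values), default=0)
    return declared_length - longest


def find_oversized(variables, tolerance=0):
    """Report variables whose unused space exceeds the tolerance.

    `variables` maps a variable name to (declared_length, values).
    Returns a dict of name -> number of unused characters.
    """
    report = {}
    for name, (declared_length, values) in variables.items():
        spare = unused_space(declared_length, values)
        if spare > tolerance:
            report[name] = spare
    return report


# Made-up study data for illustration only.
study_variables = {
    "BWBLFL": (2, ["Y", ""]),
    "BWFAST": (1, ["Y"]),
    "BWTEST": (40, ["Body Weight"]),
}
oversized = find_oversized(study_variables)
```

A check of this kind needs no pre-specified length per variable: it flags BWBLFL when it is allocated 2 characters but only ever holds "Y", and it would stop flagging it as soon as a 2-character value such as "NA" appeared in the data.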
Hello there,
After checking out the latest version of config-send-3.0.xml from SVN, I get the following warning: "SD1054" - "Non-recommended variable length" for variables like BWBLFL, BWFAST, BWEXCLFL, CLEXCLFL. The warning's description is: "The length of flags variables (----FL, --FAST, --OCCUR, --PRESP) is expected to be 2".
But the values for those types of variables (which are controlled by the CL.C66742.NY codelist) are "Y" or null.
A previous version of rule SD1054 checked that the variables have length 1, but the rule was recently updated to accommodate values with 2 characters (I think the reason for this was discussed on this forum a while ago...).
Maybe this rule should check that the variables have a MAXIMUM of 2 characters (so that both the previous and the current functionality of the rule are kept)?
Thanks!
Amelia