John
on

Hi all,

I found that when performing dataset validation alone, a variable label mismatch with the label defined in the configs does not trigger an error in the validation report. However, when performing validation that includes the Define-XML, the DD0136 error is triggered.

Reviewing the validation rule list, I found that SD0063 has been removed for SDTMIG v3.3, so variable label mismatches in the dataset cannot be detected unless the Define-XML is included. I would like to ask why SD0063 was removed for SDTMIG v3.3 and where I can find the change history of the validation rules.

Thank you.

Kind regards,

John

Forums: SDTM

Jozef
on June 11, 2024

The reason is simple ...
The rule that the variable label must be exactly as provided in the SDTMIG has been relaxed as of SDTMIG-3.3.
It must, however, still correspond to what is present in the define.xml (2.x) under ItemDef/Description/TranslatedText.

So, if the define.xml is not included, there is nothing to check against ...
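
To make the check concrete, here is a minimal sketch in Python of comparing dataset variable labels with the define.xml. It assumes the labels are read from ItemDef/Description/TranslatedText, where Define-XML 2.x stores variable descriptions. The trimmed XML fragment, the `dataset_labels` dictionary and the function names are hypothetical illustrations, not the logic of any validation tool. For simplicity it keys ItemDefs by Name, which only works for a single dataset.

```python
import xml.etree.ElementTree as ET

# ODM namespace used by Define-XML 2.x for ItemDef, Description and TranslatedText.
ODM = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Hypothetical, heavily trimmed define.xml fragment (illustration only).
DEFINE_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="S1">
    <MetaDataVersion OID="MDV1" Name="Example">
      <ItemDef OID="IT.LB.LBSEQ" Name="LBSEQ" DataType="integer">
        <Description><TranslatedText xml:lang="en">Sequence Number</TranslatedText></Description>
      </ItemDef>
      <ItemDef OID="IT.LB.LBTESTCD" Name="LBTESTCD" DataType="text">
        <Description><TranslatedText xml:lang="en">Lab Test or Examination Short Name</TranslatedText></Description>
      </ItemDef>
    </MetaDataVersion>
  </Study>
</ODM>"""


def define_labels(xml_text):
    """Map each ItemDef Name to its Description/TranslatedText."""
    root = ET.fromstring(xml_text)
    labels = {}
    for item in root.iterfind(".//odm:ItemDef", ODM):
        text = item.find("odm:Description/odm:TranslatedText", ODM)
        if text is not None and text.text:
            labels[item.get("Name")] = text.text.strip()
    return labels


def label_mismatches(dataset_labels, xml_text=None):
    """Return (variable, dataset label, define label) for each mismatch."""
    if xml_text is None:
        return []  # no define.xml included: nothing to check against
    expected = define_labels(xml_text)
    return [
        (name, label, expected[name])
        for name, label in dataset_labels.items()
        if name in expected and label != expected[name]
    ]


# Hypothetical labels as they might be read from the dataset itself.
dataset_labels = {"LBSEQ": "Sequence Number", "LBTESTCD": "Lab Test Short Name"}

print(label_mismatches(dataset_labels, DEFINE_XML))
print(label_mismatches(dataset_labels))  # without define.xml: []
```

Without the define.xml the function has no reference labels and reports nothing, which mirrors the behaviour described above.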

With best regards,
Jozef

John
on June 11, 2024

Thank you Jozef.

Define-XML is one of the essential components in the package, and as you mentioned, it is ultimately necessary to check whether the variable labels in the dataset are consistent with the SDTMIG standard. I'm wondering why this step is omitted during the dataset validation phase, with the result that the mismatch is detected only after the Define-XML has been completed.

Kind regards,

John

Jozef
on June 12, 2024

Thanks!
I always say (and that is also what I "preach" when giving the Define-XML trainings for CDISC):
"The define.xml is the sponsor's truth about the submissions".
So yes, in my opinion too, it does not make sense to validate datasets without the define.xml included.
But others may think differently.

Regarding the rule change: as of SDTMIG-3.3 you are free to use essentially any label you want, though I think it should still be no more than 40 characters.
So, if you want to use "Artificial key for the sequence within USUBJID" for "LBSEQ", that would be allowed, as long as the same label is also provided in the define.xml.
The reason for this was that it sometimes makes sense to change the label slightly so that it better explains what the variable is about. In the past that caused a validation error, which one then needed to explain in the SDRG.
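
The two conditions mentioned here (at most 40 characters, and identical to the label in the define.xml) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name and messages are hypothetical, and the 40-character limit is simply the one mentioned above. Counted this way, the example label as written comes to 46 characters, so it would need to be shortened to pass the length check.

```python
MAX_LABEL_LENGTH = 40  # assumption: the limit mentioned in the post above


def label_findings(name, dataset_label, define_label, limit=MAX_LABEL_LENGTH):
    """Return a list of findings for one variable label (empty if it is fine)."""
    findings = []
    if len(dataset_label) > limit:
        findings.append(
            f"{name}: label has {len(dataset_label)} characters (limit {limit})"
        )
    if dataset_label != define_label:
        findings.append(f"{name}: dataset label differs from define.xml label")
    return findings


custom = "Artificial key for the sequence within USUBJID"

# Same label in dataset and define.xml, but longer than the limit.
print(label_findings("LBSEQ", custom, custom))
# -> ['LBSEQ: label has 46 characters (limit 40)']

# A hypothetical shorter custom label that is also present in the define.xml.
print(label_findings("LBSEQ", "Sequence key within USUBJID", "Sequence key within USUBJID"))
# -> []
```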

John
on June 12, 2024

Thank you for your detailed explanation. I understand that with SDTMIG-3.3, we have the flexibility to adjust labels based on our needs. However, for those who do not require label adjustments, this flexibility could potentially delay the detection of errors and also lead to rework. In our production process, we first complete the datasets and then proceed to create the define.xml. Therefore, if an issue is only identified once the define.xml is included, we have to revisit and modify the datasets to ensure they align with the standard. Additionally, in cases of label mismatches, we still need to explain errors like DD0136 in the SDRG. Do you have any suggestions on how we could handle such scenarios to maintain efficiency while ensuring adherence to standards?

Kind regards,

John

Jozef
on June 12, 2024

Thanks! What we always teach in the CDISC Define-XML trainings is that it is very advantageous to have a define.xml (even an incomplete one) in advance as the "specification" of the datasets to be generated, with all variables and controlled terminology. There are even software packages on the market that use Define-XML templates as the starting point for the mappings from the source data (e.g. EDC output) to SDTM. With these software packages, the mappings and (even early) generated datasets are always aligned with the define.xml.
One then also does not run into such surprises at the very end of the work ...

But the reality is indeed that (too) many companies still treat the define.xml as something they generate after the SDTM datasets have been generated, as a sort of "by-product", because it is required for the submission. They then use e.g. Excel for the dataset specifications, which I think is not good practice.
The better way, however, is to let the define.xml "drive" the generation of the SDTM datasets.
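
As a rough illustration of letting the define.xml "drive" dataset generation, the Python sketch below reads the variable list and labels for each dataset from a define.xml before any data exist. It follows each ItemGroupDef's ItemRef entries (sorted by OrderNumber) to the referenced ItemDef. The XML fragment and the function name are hypothetical, and a real define.xml carries many more attributes and elements than shown here.

```python
import xml.etree.ElementTree as ET

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Hypothetical, heavily trimmed define.xml used as the dataset specification.
DEFINE_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="S1">
    <MetaDataVersion OID="MDV1" Name="Example">
      <ItemGroupDef OID="IG.LB" Name="LB" Repeating="Yes">
        <ItemRef ItemOID="IT.LB.LBSEQ" OrderNumber="2" Mandatory="Yes"/>
        <ItemRef ItemOID="IT.STUDYID" OrderNumber="1" Mandatory="Yes"/>
      </ItemGroupDef>
      <ItemDef OID="IT.STUDYID" Name="STUDYID" DataType="text">
        <Description><TranslatedText xml:lang="en">Study Identifier</TranslatedText></Description>
      </ItemDef>
      <ItemDef OID="IT.LB.LBSEQ" Name="LBSEQ" DataType="integer">
        <Description><TranslatedText xml:lang="en">Sequence Number</TranslatedText></Description>
      </ItemDef>
    </MetaDataVersion>
  </Study>
</ODM>"""


def dataset_specs(xml_text):
    """Return {dataset name: [(variable name, label), ...]} in OrderNumber order."""
    root = ET.fromstring(xml_text)
    items = {item.get("OID"): item for item in root.iterfind(".//odm:ItemDef", NS)}
    specs = {}
    for group in root.iterfind(".//odm:ItemGroupDef", NS):
        refs = sorted(
            group.iterfind("odm:ItemRef", NS),
            key=lambda ref: int(ref.get("OrderNumber", "0")),
        )
        columns = []
        for ref in refs:
            item = items[ref.get("ItemOID")]
            label = item.findtext(
                "odm:Description/odm:TranslatedText", default="", namespaces=NS
            )
            columns.append((item.get("Name"), label.strip()))
        specs[group.get("Name")] = columns
    return specs


specs = dataset_specs(DEFINE_XML)
print(specs)
# -> {'LB': [('STUDYID', 'Study Identifier'), ('LBSEQ', 'Sequence Number')]}
```

A mapping program that takes its variable names and labels from such a structure cannot drift away from the define.xml, because both come from the same source.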

In the case of SDTMIG-3.3 and DD0136, I would consider this a "false positive" (i.e. an error in the software), which one (unfortunately) must explain in the SDRG.
In my opinion, it should not be the task of the sponsor to point regulatory authorities to errors in validation software...
But, don't blame me, blame ...

Matt
on June 12, 2024

Hi John,

Are you referring to DD0136 (Invalid dataset label value) or DD0137 (Invalid variable label value)?

I can understand the confusion around DD0137 firing for SDTMIG 3.3 and SD0063's removal in SDTMIG 3.3, given the relaxed guidance on variable labeling.

DD0137 has the Publisher ID of 266 which comes from the Define-XML Conformance Guide: https://www.cdisc.org/standards/foundational/define-xml/conformance-rules-define-xml-v2-1.

The conformance guide does not specify that the rule should exclude SDTMIG 3.3+ in line with the guidance in SDTMIG 3.3 regarding variable labeling. Our SME team is currently seeking clarification on this.

Until there is clarification, DD0137 applies to all SDTM versions, whether or not SD0063 is also applied. It is therefore correct that the finding should simply be explained if needed.

Kind regards,

Matt

John
on June 12, 2024

Thank you, Matt. Yes, I was referring to DD0137, not DD0136. Sorry for the confusion.

Jozef, thank you for your guidance. I understand that creating the define.xml first is the better approach; we have in fact previously received requests from vendors to have specifications before the datasets. We are currently exploring a way to achieve this...
