I am not surprised ... The two technologies, SAS Transport 5 and XML, do not fit together - blame FDA for refusing to move to modern Dataset-XML instead of SAS Transport (5 or 8).
In XML, the single quote is a special character. See e.g. https://www.xmltutorial.info/xml/special-characters-in-xml/ (and many other places).
So, if you have a single quote in text that sits in a Define-XML attribute, such as "CodedValue" within "CodeListItem", you need to replace each single quote by its "entity", which is &apos;
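To illustrate the entity replacement, here is a minimal Python sketch using only the standard library. The helper name escape_attr and the sample value are my own assumptions for illustration, not part of any Define-XML tool. Strictly speaking, XML only requires &apos; when the attribute value itself is delimited by single quotes, but using the entity is always safe.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_attr(value: str) -> str:
    """Escape text for use inside an XML attribute value.

    escape() handles &, < and > by default; the extra mapping adds
    the two quote characters. (Hypothetical helper for illustration.)
    """
    return escape(value, {"'": "&apos;", '"': "&quot;"})


# Sample value with plain US-ASCII single quotes (code 39)
coded_value = "My PAD makes me feel 'not normal'"
escaped = escape_attr(coded_value)

# Build a fragment whose attribute is delimited by single quotes,
# which is the case where the entity is really required.
fragment = "<CodeListItem CodedValue='" + escaped + "'/>"

# A parser turns the entity back into the original character.
parsed = ET.fromstring(fragment).get("CodedValue")

print(escaped)
print(parsed == coded_value)
```

A good define.xml generator or any XML serializer does this step for you, so the sketch is only meant to show what happens under the hood.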
If you use a good tool to generate your define.xml (which the Define-XML team encourages people to do BEFORE generating the datasets - the define.xml is "the specification"), the tool will take care of that for you.
Also, please take into account that SAS Transport only supports US-ASCII characters. Although the normal single quote is part of US-ASCII (code 39), when copying values from other sources (automatically or manually), you may be in for surprises when one or more characters are non-US-ASCII. For example, I have seen "skew" (curly, MS Office) quotes being copied into a SAS Transport 5 file, leading to ... disaster. Are your single quotes such "skew" characters?
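A quick way to check whether a value contains such characters is to scan it for anything outside US-ASCII before it goes into the dataset. The following Python sketch is only an illustration: the function name find_non_ascii, the replacement table and the sample text are assumptions of mine, and the table covers only the four most common "skew" quotes.

```python
import unicodedata


def find_non_ascii(text):
    """Return (position, character, code point, Unicode name) for
    every character outside the US-ASCII range 0-127."""
    return [
        (i, ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Assumed replacement table: curly quotes mapped to their ASCII look-alikes
ASCII_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})

# Sample value with curly single quotes, as pasted from a word processor
param = "My PAD makes me feel \u2018not normal\u2019"

hits = find_non_ascii(param)
for position, ch, code_point, name in hits:
    print(position, code_point, name)

cleaned = param.translate(ASCII_QUOTES)
print(cleaned)
```

Whether to replace such characters or to reject the value is a decision for the study team; the sketch only shows how to detect them and what a replacement could look like.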
How P21 Validator treats non-ASCII characters that occur in SAS Transport files, I cannot say. Maybe these are ignored ... That may explain the behavior that you found.
With best regards,
Jozef Aerts
CDISC XML Technology team
Recently one of our programmers cut and pasted text from an external document to populate variables.
Here is an example. I can see that those single quotes pasted in are odd.
if paramn=26 then PARAM="My PAD makes me feel ‘not normal’";
I created an ADaM define that included this dataset, and typed these same values into a codelist. But when we validated, something odd happened. Pinnacle had no problem with the values I typed into the codelist.
But the report cited values in the dataset that did not match the codelist, and displayed the dataset values without the bogus single quotes (squares below):
ADPADQOL | 36 | PARAM
My PAD makes me feel not normal
SD0037 | Value for PARAM not found in (ADPADQOL Parameter) user-defined codelist | Terminology
I think it’s a bad idea to just cut and paste from some questionable source into a text variable, and that is the problem. But I’d like to have a more concrete reason than that to state. Is there a certain character set that Pinnacle recognizes? Does it have a name? And these bogus characters are not part of that character set?