In ammonia, elements which are not whitelisted are replaced by their content by default; an element's descendants are only removed if the element is added to clean_content_tags.
I've no issue with that. However, I was checking whether it would be possible to translate between configurations for the upcoming "web sanitizer" APIs and ammonia, in order to have consistent behaviour between frontend and backend components, and I realised that they made the opposite choice. As far as I can tell, in an "allow configuration" (which corresponds to ammonia's behaviour), elements which are not whitelisted are removed from the document entirely, subtree included. To remove the tag but keep the content, the element has to be added to replaceWithChildrenElements (replaceElementWithChildren in the Sanitizer API; it's very confusing that the names don't match).
Either version is fine, but as far as I can tell it's not possible to translate between the two: doing so would require adding every element which is in neither elements nor replaceWithChildrenElements to ammonia's clean_content_tags, and that's an infinite set.
Thus in order for Web Sanitizer configurations to be expressible in Ammonia, as far as I can tell Ammonia would need one of:
- a content whitelist mode instead of the current blacklist mode (that is, it would clean content by default, and anything in the list would be replaced by its content)
- a content callback which would allow the user to do either as they wish (based on the tag name)
- maybe a more generic tag_filter whose result would be one of Allow/Deny/Content, respectively keeping the element, removing the subtree, or swapping the element with its subtree (could also be an arbitrary element replacement but that would probably require making rcdom part of the ammonia API, and since it's not been done yet I assume that's undesirable)
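
To make the last option more concrete, here is a rough sketch of what such a filter could look like. Everything in it is hypothetical: `TagAction`, `ammonia_default` and `sanitizer_allow_config` are names made up for illustration and are not part of ammonia or of the Sanitizer API. It only models the per-tag decision, not the actual tree rewriting, and it assumes the two sets passed in are disjoint.

```rust
use std::collections::HashSet;

/// Hypothetical result of a per-tag filter (not an existing ammonia type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TagAction {
    /// Keep the element itself.
    Allow,
    /// Remove the element together with its whole subtree.
    Deny,
    /// Remove the element but keep its children in its place.
    Content,
}

/// Models ammonia's current defaults as described above:
/// unlisted tags are replaced by their content.
pub fn ammonia_default(
    tags: &HashSet<&str>,
    clean_content_tags: &HashSet<&str>,
    tag: &str,
) -> TagAction {
    if tags.contains(tag) {
        TagAction::Allow
    } else if clean_content_tags.contains(tag) {
        TagAction::Deny
    } else {
        TagAction::Content
    }
}

/// Models a Sanitizer-style "allow configuration" as described above:
/// unlisted tags are removed with their subtree.
pub fn sanitizer_allow_config(
    elements: &HashSet<&str>,
    replace_with_children_elements: &HashSet<&str>,
    tag: &str,
) -> TagAction {
    if elements.contains(tag) {
        TagAction::Allow
    } else if replace_with_children_elements.contains(tag) {
        TagAction::Content
    } else {
        TagAction::Deny
    }
}
```

With a filter of this shape, the fallback for unlisted tags is decided by the function rather than by a finite list, so either default can be expressed without enumerating every possible element name.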
Would any of that be of interest? And if so which would you see as the more desirable / likely option?