Make ClearMetadataPreprocessor work for cells of any type, and some related improvements #2239
+24
−13
Hello. I've been pleasantly surprised to discover that nbconvert can now clear metadata fields. That makes for much nicer diffs when you're trying to revision-control your notebooks.

But then I soon noticed that not all of the metadata is actually being removed. More specifically, markdown cells seemed to retain their ids at least. So today, after looking through the source code and doing some Git archaeology, (I think) I've finally figured out why.

When the ClearMetadataPreprocessor was added in #805, it was based on the ClearOutputPreprocessor, which also performs some metadata transformations. Removing cell output only makes sense for code cells, and so there is a check for that. But the same check is now done for general metadata clearing, which leads to this unexpected behavior.

While trying to fix it, I also ended up making a patch that clarifies some of the documentation present in that file. And then I recalled that at first it wasn't really clear to me how I should go about enabling the ClearMetadataPreprocessor, so I also wrote a patch to add a --clear-metadata command line flag that works just like --clear-output.

I marked this PR as a draft, because:

- I still need to test that this truly works correctly. I have now built and tested it (in a guix shell jupyter --with-patch=python-nbconvert=metadata.diff).
- How ClearOutputPreprocessor.remove_metadata_fields and ClearMetadataPreprocessor.preserve_cell_metadata_mask interact is rather confusing, so I would like to fix that.
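
To illustrate the behavior described in this PR, here is a minimal sketch of the cell-type check and what happens once it is dropped. It deliberately does not use nbconvert's actual code: the function clear_metadata, its code_cells_only parameter, and the sample notebook dict (including the assumption that the retained id lives in each cell's metadata) are all hypothetical simplifications.

```python
from copy import deepcopy


def clear_metadata(nb, code_cells_only):
    """Return a copy of `nb` with cell metadata emptied.

    Hypothetical helper, not nbconvert's API. With code_cells_only=True
    it mimics a check that only makes sense for output clearing.
    """
    nb = deepcopy(nb)  # leave the caller's notebook untouched
    for cell in nb["cells"]:
        if code_cells_only and cell["cell_type"] != "code":
            continue  # non-code cells are skipped, so their metadata survives
        cell["metadata"] = {}
    return nb


# Assumed sample notebook: one markdown cell and one code cell,
# each carrying an "id" entry in its metadata.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {"id": "intro"}, "source": "# Title"},
        {"cell_type": "code", "metadata": {"id": "c1", "collapsed": False}, "source": "1 + 1"},
    ]
}

before = clear_metadata(notebook, code_cells_only=True)
after = clear_metadata(notebook, code_cells_only=False)

print([c["metadata"] for c in before["cells"]])  # [{'id': 'intro'}, {}]
print([c["metadata"] for c in after["cells"]])   # [{}, {}]
```

With the check in place, the markdown cell keeps its id; without it, every cell type is cleared, which is the behavior this PR aims for.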