• VeganPizza69 Ⓥ
    5012 days ago

    Purge metadata, convert PDF to rendered graphics (including bitmaps), add OCR layer.

    • @xenoclast@lemmy.world
      2412 days ago

      There are tools for this already… but it sure would be nice to have a Firefox plugin that scrubs all metadata on downloads by default.

      (Note: I’m hoping this exists and someone will “um, actually” me.)

      • nearhat
        1412 days ago

        It’s a multi-step process, but if you still have the XPS Viewer from Windows 10, you can ‘print’ the file to XPS, then open it in the XPS Viewer and ‘print’ to PDF using your favourite print-to-PDF solution. That strips the metadata but doesn’t rasterize everything.

          • nearhat
            110 days ago

            I tried that before, but was unsuccessful in clearing out metadata. Whatever options I tried, PDF-to-PDF just output an identical file with a different name.

      • lastweakness
        311 days ago

        You could write a script to automatically watch for new files in a folder and strip metadata from every file, I guess. I did something like that for images a while back.
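
A rough sketch of that kind of folder watcher, in Python, might look like the following. Nothing here comes from the thread: the function names (`strip_with_exiftool`, `scan_once`, `watch`) are made up for illustration, the polling approach is just the simplest option that needs no extra libraries, and the default stripper assumes the `exiftool` command-line tool is installed and on your PATH. Files ending in `.part` are skipped on the assumption that they are unfinished Firefox downloads. Note that the first scan also processes files already sitting in the folder.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, List, Set


def strip_with_exiftool(path: Path) -> None:
    """Strip metadata in place. Assumption: the exiftool CLI is installed."""
    # "-all=" clears every tag exiftool can write; "-overwrite_original"
    # skips the "_original" backup copy exiftool would otherwise leave behind.
    # check=False: failures (e.g. unsupported file types) are ignored here.
    subprocess.run(
        ["exiftool", "-all=", "-overwrite_original", str(path)],
        check=False,
    )


def scan_once(
    folder: Path, seen: Set[Path], strip: Callable[[Path], None]
) -> List[Path]:
    """Run `strip` on every file not yet in `seen`; return the files handled."""
    processed = []
    for path in sorted(folder.iterdir()):
        # Skip directories, already-handled files, and partial downloads.
        if not path.is_file() or path in seen or path.suffix == ".part":
            continue
        strip(path)
        seen.add(path)
        processed.append(path)
    return processed


def watch(
    folder: Path,
    strip: Callable[[Path], None] = strip_with_exiftool,
    interval: float = 2.0,
) -> None:
    """Poll `folder` forever, stripping each new file once."""
    seen: Set[Path] = set()
    while True:
        scan_once(folder, seen, strip)
        time.sleep(interval)
```

To try it, call something like `watch(Path.home() / "Downloads")` and stop it with Ctrl+C. Because `strip` is just a callable, you can swap in a different tool per file type. One caveat worth checking for your own use: as far as I know, exiftool's PDF edits are reversible because the old metadata stays in the file, so for PDFs a re-render (as suggested higher up the thread) may still be needed.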