sidmishraw/docpruner

DocPruner

Prunes the bad PDFs (probably scanned images of IEEE documents from IEEE Xplore) by moving them out of the input_pdfs folder. It also moves the pdf_jsons and pdf_grouped_jsons folders out of the cs267_project folder so that the PDF-to-JSON generation process can be restarted from scratch.

The executable JAR artifact is located here

Usage:

java -jar path_to_DocPruner.jar <path-to-pdfprocessor.log> <path-to-pdf_jsons> <path-to-pdf_grouped_jsons>
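
The move-out behavior described above can be sketched roughly as follows. This is a minimal illustration, not DocPruner's actual source: the class name DocPrunerSketch, the methods moveInto and pruneBadPdfs, and the prunedDir destination are all hypothetical. The format of pdfprocessor.log is not documented here, so the sketch assumes the bad PDF file names have already been extracted from it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the pruning step; names are illustrative only.
final class DocPrunerSketch {

    // Moves `source` (a file or a folder) into `destDir`, keeping its name.
    // Returns the new location, or null when `source` does not exist.
    // Note: moving a non-empty folder only works when source and target
    // are on the same file store, because it is done as a rename.
    static Path moveInto(Path source, Path destDir) {
        if (!Files.exists(source)) {
            return null;
        }
        try {
            Files.createDirectories(destDir);
            Path target = destDir.resolve(source.getFileName());
            return Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Moves every named bad PDF out of `inputPdfs` into `prunedDir`.
    // Names that are not present in `inputPdfs` are skipped.
    static List<Path> pruneBadPdfs(Path inputPdfs, List<String> badPdfNames, Path prunedDir) {
        List<Path> moved = new ArrayList<>();
        for (String name : badPdfNames) {
            Path result = moveInto(inputPdfs.resolve(name), prunedDir);
            if (result != null) {
                moved.add(result);
            }
        }
        return moved;
    }
}
```

Under these assumptions, calling pruneBadPdfs on the input_pdfs folder and then moveInto on the pdf_jsons and pdf_grouped_jsons folders leaves the project ready for a fresh PDF-to-JSON run.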

In case of concerns, contact: [email protected]

About

DocPruner is a utility for pruning bad PDFs for the CS 267 project and its PDF processor
