Wednesday, May 8, 2024
Asked Friday, November 18, 2022, 12:55:51

It's easy to find the page count of a PDF document from the command line:



pdfinfo sample.pdf | grep ^Pages:


... but I haven't been able to find a similar method for ODT files and other office documents.



Is there a way to programmatically determine the page count of these documents?



 Answers

Thanks for all the answers, everyone. With your help I was able to compile a list of commands that can extract the page count from almost all relevant office documents:



DOCX/PPTX



unzip -p 'sample.docx' docProps/app.xml | grep -oP '(?<=<Pages>).*(?=</Pages>)'

unzip -p 'sample.pptx' docProps/app.xml | grep -oP '(?<=<Slides>).*(?=</Slides>)'


Note: unzip can be installed with sudo apt-get install unzip.
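The greedy .* pattern and the -P option (available in GNU grep only) can be avoided. Below is a minimal sketch of a digits-only extraction with sed; the helper name xml_count is made up for illustration and is not part of any package. Keep in mind that the value in docProps/app.xml is a statistic stored by the application that last saved the file, so it may be missing or out of date.

```shell
# xml_count TAG: read XML from stdin and print the number stored in the
# first <TAG>...</TAG> element. Prints nothing if no such element exists.
# (xml_count is a hypothetical helper name used only for this sketch.)
xml_count() {
    # ":" is used as the sed delimiter because the pattern contains "/".
    sed -n "s:.*<$1>\([0-9][0-9]*\)</$1>.*:\1:p" | head -n 1
}
```

It would be used as unzip -p sample.docx docProps/app.xml | xml_count Pages, or with Slides instead of Pages for a PPTX file.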



DOC/PPT



wvSummary sample.doc | grep -oP '(?<=of Pages = )[ A-Za-z0-9]*'

wvSummary sample.ppt | grep -oP '(?<=of Slides = )[ A-Za-z0-9]*'


Note: wvSummary (case-sensitive!) is part of the wv package. Install it with sudo apt-get install wv.



ODT



unzip -p sample.odt meta.xml | grep -oP '(?<=page-count=")[ A-Za-z0-9]*'


PDF



pdfinfo sample.pdf | awk '/^Pages:/ {print $2}'


Note: pdfinfo is part of poppler-utils and should come preinstalled on Ubuntu.



DJVU



djvused -e "n" sample.djvu


Note: djvused is part of the djvulibre-bin package and may be installed with sudo apt-get install djvulibre-bin.
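The commands above can be combined into one function that picks the right tool from the file extension. This is only a sketch: the name page_count is made up for illustration, it trusts the extension rather than inspecting the file contents, it needs GNU grep for -P, and the wvSummary patterns assume the output format shown in the DOC/PPT section.

```shell
# page_count FILE: print the page (or slide) count recorded in FILE.
# Returns 1 for extensions that are not handled.
# (page_count is a hypothetical name; the tools it calls are those listed above.)
page_count() {
    file=$1
    # Lower-case the extension so that .PDF and .pdf are treated alike.
    ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
    case $ext in
        docx) unzip -p "$file" docProps/app.xml | grep -oP '(?<=<Pages>)[0-9]+' ;;
        pptx) unzip -p "$file" docProps/app.xml | grep -oP '(?<=<Slides>)[0-9]+' ;;
        doc)  wvSummary "$file" | grep -oP '(?<=of Pages = )[0-9]+' ;;
        ppt)  wvSummary "$file" | grep -oP '(?<=of Slides = )[0-9]+' ;;
        odt)  unzip -p "$file" meta.xml | grep -oP '(?<=page-count=")[0-9]+' ;;
        pdf)  pdfinfo "$file" | awk '/^Pages:/ {print $2}' ;;
        djvu) djvused -e n "$file" ;;
        *)    echo "page_count: unsupported file type: $file" >&2
              return 1 ;;
    esac
}
```

For example, page_count sample.odt prints the count stored in the document's meta.xml, and page_count notes.txt fails with an error message on stderr.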


Answered Sunday, November 20, 2022
brailloni
