Asked Friday, November 18, 2022, 12:55:51

It's easy to find the page count of a PDF document from the command line:



pdfinfo sample.pdf | grep '^Pages:'


However, I haven't been able to find a similar method for ODT files and other office documents.



Is there a way to programmatically determine the page count of these documents?



 Answers

Thanks for all the answers, everyone. With your help I was able to compile a list of commands that can extract the page count from almost all relevant office documents:



DOCX/PPTX



unzip -p 'sample.docx' docProps/app.xml | grep -oP '(?<=<Pages>).*(?=</Pages>)'

unzip -p 'sample.pptx' docProps/app.xml | grep -oP '(?<=<Slides>).*(?=</Slides>)'


Note: unzip can be installed with sudo apt-get install unzip.



DOC/PPT



wvSummary sample.doc | grep -oP '(?<=of Pages = )[ A-Za-z0-9]*'

wvSummary sample.ppt | grep -oP '(?<=of Slides = )[ A-Za-z0-9]*'


Note: wvSummary (case-sensitive!) is part of the wv package. Install it with sudo apt-get install wv.



ODT



unzip -p sample.odt meta.xml | grep -oP '(?<=page-count=")[ A-Za-z0-9]*'
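
The element in meta.xml that holds page-count normally carries other statistics as well, such as word-count. A minimal sketch of a reusable extractor, assuming the attributes are written as meta:NAME="number" on a single line (as LibreOffice typically does); odt_stat is a hypothetical helper name, and sed is used here in place of grep -oP so the sketch does not depend on PCRE support:

```shell
# odt_stat NAME: read meta.xml on standard input and print the numeric
# value of the meta:NAME attribute (for example page-count or word-count).
# Hypothetical helper; prints nothing if the attribute is not present.
odt_stat() {
    sed -n "s/.*meta:$1=\"\([0-9]*\)\".*/\1/p" | head -n 1
}

# Usage (requires unzip):
#   unzip -p sample.odt meta.xml | odt_stat page-count
#   unzip -p sample.odt meta.xml | odt_stat word-count
```

The value is whatever the application stored when the file was last saved, so it may not reflect later changes made by other tools.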


PDF



pdfinfo sample.pdf | grep -oP '^Pages:\s+\K[0-9]+'


Note: pdfinfo is part of poppler-utils and should come preinstalled on Ubuntu.



DJVU



djvused -e "n" sample.djvu


Note: djvused is part of the djvulibre-bin package and may be installed with sudo apt-get install djvulibre-bin.
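
The commands above can be combined into one function that picks the right tool from the file extension. This is only a sketch: page_count is a hypothetical name, the patterns assume the same output formats as the individual commands, and sed replaces grep -oP so that PCRE support is not needed.

```shell
#!/bin/sh
# page_count FILE: print the page (or slide) count of FILE, choosing the
# extraction command by file extension. Hypothetical wrapper; it needs
# unzip, wv, poppler-utils and djvulibre-bin for the respective formats.
page_count() {
    # Lower-case the extension so that SAMPLE.PDF is handled as well.
    ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
    case $ext in
        docx) unzip -p "$1" docProps/app.xml |
                  sed -n 's/.*<Pages>\([0-9]*\)<\/Pages>.*/\1/p' ;;
        pptx) unzip -p "$1" docProps/app.xml |
                  sed -n 's/.*<Slides>\([0-9]*\)<\/Slides>.*/\1/p' ;;
        doc)  wvSummary "$1" | sed -n 's/.*of Pages = \([0-9]*\).*/\1/p' ;;
        ppt)  wvSummary "$1" | sed -n 's/.*of Slides = \([0-9]*\).*/\1/p' ;;
        odt)  unzip -p "$1" meta.xml |
                  sed -n 's/.*page-count="\([0-9]*\)".*/\1/p' ;;
        pdf)  pdfinfo "$1" | sed -n 's/^Pages:[[:space:]]*\([0-9]*\).*/\1/p' ;;
        djvu) djvused -e n "$1" ;;
        *)    echo "page_count: unsupported file type: $1" >&2
              return 1 ;;
    esac
}

# Usage:
#   page_count sample.odt
#   for f in *.pdf *.docx; do printf '%s\t%s\n' "$f" "$(page_count "$f")"; done
```

For the DOCX, PPTX, DOC, PPT and ODT cases the number comes from metadata stored in the file, so an empty result means the metadata was not written rather than that the document has no pages.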


Answered Sunday, November 20, 2022
brailloni
