Wednesday, May 15, 2024
Asked Mon, October 10, 2022, 5:47:33 / answers: 1 / hits: 3292

Is there a CLI command to get the "Created" timestamp shown in the "Document" tab of a PDF's file properties?



I know that I can use stat to get Access/Modified/Changed info from the filesystem, but the metadata in the "Document" tab is embedded in the file itself, and I'm not sure how to extract it via the CLI.



The reason I need to do this is to create a list of filenames along with "Created" timestamps for about 22,000 PDF files. Obviously, this is something far better suited to the CLI than the GUI.



 Answers

If you install the poppler-utils package, you can do this using the pdfinfo command. For example:



$ pdfinfo OBEX-1.3.pdf
Title:          Microsoft Word - OBEX13.doc
Author:         Daphne
Creator:        PScript5.dll Version 5.2
Producer:       Acrobat Distiller 5.0.5 (Windows)
CreationDate:   Wed Feb 5 11:12:32 2003
ModDate:        Wed Feb 5 11:12:32 2003
Tagged:         no
Pages:          95
Encrypted:      no
Page size:      612 x 792 pts (letter)
File size:      545666 bytes
Optimized:      yes
PDF version:    1.3


You should be able to extract the creation date from this output using standard tools like sed or awk.
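As a rough sketch of that extraction (not the only way to do it), the snippet below pulls the CreationDate value out of pdfinfo's output with sed and prints one "filename, tab, date" line per file. The function names creation_date and list_creation_dates are made up for this example, and it assumes pdfinfo is on your PATH and labels the field "CreationDate:" as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read pdfinfo-style output on stdin and print
# only the value of the CreationDate field (empty if the field is absent).
creation_date() {
    # Strip the "CreationDate:" label plus the padding that follows it,
    # print the remainder of that line, and keep only the first match.
    sed -n 's/^CreationDate:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: print "filename<TAB>creation date" for every
# file passed as an argument. Assumes pdfinfo (poppler-utils) is installed.
list_creation_dates() {
    for f in "$@"; do
        # Errors from unreadable or damaged PDFs are discarded, so such
        # files get a line with an empty date column.
        printf '%s\t%s\n' "$f" "$(pdfinfo "$f" 2>/dev/null | creation_date)"
    done
}
```

With those functions defined in your shell, something like list_creation_dates *.pdf > created.tsv should produce the list for a whole directory. If your version of pdfinfo supports the -isodates option, adding it to the pdfinfo call gives dates in ISO 8601 form, which sort more easily.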



If you want something a bit more programmatic, you could use the poppler library directly. There are bindings for many popular languages, including Python (through the python-poppler package).


[#39900] Monday, October 10, 2022
piscen
