qpdf was already mentioned in this thread, but I can also endorse it as a great library.
It gives you a JSON representation of the PDF data structure. What's nice is that it doesn't hide the underlying format, but it takes care of a lot of the low-level edge cases for you.
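For anyone who wants to poke at that JSON, here is a rough sketch of how it could be done from Python. It assumes the `qpdf` command-line tool is installed and on your PATH, and that its `--json` flag writes the JSON to stdout. The helper names `dump_pdf_structure` and `summarize` are made up for this example and are not part of qpdf. The exact layout of the JSON varies between qpdf versions, so the sketch only counts the entries under the top-level `pages` key and lists whatever other top-level keys are present.

```python
import json
import subprocess


def dump_pdf_structure(pdf_path):
    """Run `qpdf --json` on a file and return the parsed JSON document."""
    # Assumption: the qpdf CLI is installed and reachable on PATH.
    result = subprocess.run(
        ["qpdf", "--json", pdf_path],
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    # qpdf exits with 3 when it succeeded but emitted warnings.
    if result.returncode not in (0, 3):
        raise RuntimeError(result.stderr.strip())
    return json.loads(result.stdout)


def summarize(doc):
    """Return the top-level keys and page count of a parsed qpdf JSON dump."""
    return {
        "sections": sorted(doc.keys()),
        "page_count": len(doc.get("pages", [])),
    }
```

Calling `summarize(dump_pdf_structure("example.pdf"))` (where `example.pdf` is a placeholder path) should give a quick overview before you dig into the individual objects.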