SearchWP Xpdf Integration
Warning: This Extension requires the use of PHP's exec() function, and you must install Xpdf yourself by uploading a file to a non-public location on your server.
SearchWP offers the unique feature of extracting plain text from PDF files uploaded to your WordPress website. Out of the box, SearchWP attempts to do this using only PHP, but because the PDF format is complex and varies widely, content is sometimes not extracted accurately. Enter Xpdf.
Xpdf is a command-line utility that must be installed on your server for this Extension to work. Installation is simple, and instructions are included.
Using the Xpdf Integration Extension, you can offload the work PHP would otherwise do in processing your PDF files to Xpdf, which is extremely fast and accurate when extracting content from your PDFs. After activating the Extension, follow the installation instructions. Once Xpdf is installed, SearchWP hands the PDF content extraction process to Xpdf.
Installing Xpdf tools
Using this Extension, you can use Xpdf to extract the content from your PDFs.
IMPORTANT: Xpdf is not provided in this download. You must download Xpdf and upload it to a non-public location (outside your Web root).
Binary distributions of the Xpdf tools for both Windows and Linux are available at http://www.xpdfreader.com/download.html.
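If you want to double-check that the location you chose really is outside your Web root, a small script can compare the two paths. The following is a minimal Python sketch and is not part of the Extension; the function name and the example paths are hypothetical and should be replaced with the paths on your own server.

```python
from pathlib import Path


def is_outside_web_root(binary_path, web_root):
    """Return True if binary_path is not located inside web_root."""
    # Resolve both paths so that "..", relative segments, and symlinks
    # do not hide a location that is actually under the Web root.
    binary = Path(binary_path).resolve()
    root = Path(web_root).resolve()
    # The binary is "inside" when the Web root is the path itself
    # or one of its parent directories.
    return root != binary and root not in binary.parents
```

For example, with a hypothetical Web root of /var/www/html, a binary at /opt/xpdf/bin/pdftotext would pass this check, while one at /var/www/html/wp-content/pdftotext would not.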
Manually Testing Xpdf Integration
After uploading and activating the Xpdf Integration Extension and defining your path to pdftotext, you can manually confirm that Xpdf text extraction is working as expected on specific PDFs uploaded to your Media library.
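It can also help to confirm that pdftotext itself works on your server, independently of WordPress. The following is a minimal Python sketch of such a check and is not how the Extension is implemented; the function names are illustrative. It assumes the standard pdftotext behavior of writing to standard output when the output file is given as "-", and of accepting "-enc UTF-8" to select the output encoding.

```python
import subprocess
from pathlib import Path


def build_pdftotext_command(binary, pdf_path):
    """Return the argument list that asks pdftotext to print text to stdout."""
    # Passing "-" as the output file sends the extracted text to stdout.
    return [str(binary), "-enc", "UTF-8", str(pdf_path), "-"]


def extract_text(binary, pdf_path, timeout=60):
    """Run pdftotext on a single PDF and return the extracted text."""
    if not Path(binary).is_file():
        raise FileNotFoundError(f"pdftotext not found at {binary}")
    result = subprocess.run(
        build_pdftotext_command(binary, pdf_path),
        capture_output=True,  # collect stdout and stderr instead of printing
        timeout=timeout,      # avoid hanging on a problematic PDF
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text() with your own path to pdftotext and the path to one of your uploaded PDFs should return that file's text. An empty result or a raised error suggests the problem lies with the binary, its permissions, or the PDF itself rather than with SearchWP.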