Extracting Text From PDF Files


I like PDFs: they work well and (usually) look the same on every computer. One of their advantages is that the text is embedded within the file and can be manipulated. Although many PDF readers let you copy and paste text, what’s quicker than creating an Automator workflow that extracts a PDF’s text into a text file?

This process only takes a couple of steps, so read carefully.

1) Open Automator, found in the Applications folder.

2) In Automator, drag “Get Selected Finder Items” into the workflow, then “Extract PDF Text”. I recommend changing the output option to Rich Text.

3) Save the workflow, either as an application or as a workflow file.

4) Test. Drag a PDF file onto the saved Automator file and let it do its work. After a short time you will have a text file with the extracted text.
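
If you would rather not drag and drop, macOS also ships an `automator` command-line tool that can run a saved workflow. Here is a minimal Python sketch of calling it. The file names are made up for illustration, and it assumes you saved a workflow file (not an application) that accepts the PDF passed to it as input. Depending on how your workflow is built, you may need to adjust its first action for that to work.

```python
import subprocess
from pathlib import Path


def build_automator_command(workflow: Path, pdf: Path) -> list[str]:
    """Build the argument list for the macOS `automator` command-line tool."""
    # -i hands the PDF path to the workflow as its input.
    return ["automator", "-i", str(pdf), str(workflow)]


def run_workflow(workflow: Path, pdf: Path) -> None:
    """Run the saved workflow on one PDF (macOS only)."""
    # check=True raises an error if automator exits with a failure code.
    subprocess.run(build_automator_command(workflow, pdf), check=True)
```

For example, `run_workflow(Path("ExtractText.workflow"), Path("report.pdf"))` would run a hypothetical workflow named ExtractText.workflow on report.pdf.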

The output file will contain every single piece of text that is in the original PDF file. As a result it may contain random bits of text, but it won’t take long to remove the bits you don’t need.
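
Some of that tidying can be automated. The sketch below is one possible approach, not part of the Automator workflow itself. It assumes you chose plain text output rather than Rich Text, and it guesses that a line containing only digits is a leftover page number, which will not be true for every document. The function name is my own.

```python
import re


def tidy_extracted_text(text: str) -> str:
    """Drop page-number-only lines and collapse runs of blank lines."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: a line holding only digits is a stray page number.
        if re.fullmatch(r"\d+", stripped):
            continue
        # Keep at most one blank line in a row.
        if stripped == "" and kept and kept[-1] == "":
            continue
        kept.append(stripped)
    # Trim leading/trailing blank lines and end with a single newline.
    return "\n".join(kept).strip() + "\n"
```

Read the extracted file into a string, pass it through `tidy_extracted_text`, and write the result back out. Anything it misses you can still remove by hand.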

