ICT Skills 2

Do you know how to save Adobe PDF content as text? (23/24)

Sometimes you may want to save *.pdf files as *.txt files, for example to add text from an Adobe *.pdf file to a document created with another application (such as Microsoft Word) or to send text to a Braille printer. You do not need the full version of Adobe Acrobat for this: Acrobat Reader can save the text of a *.pdf file, including files that also contain graphics. With the full version of Acrobat you can also export to Rich Text Format (*.rtf), XML, HTML and Word (*.doc).

If you have Adobe Acrobat Reader installed and you want to export text:

  • Go to File|Export Document to Text

If you have the full version of Acrobat:

  • Choose File|Save As and select Text (Accessible) from the Format drop-down list
  • Or choose File|Save As and select Rich Text Format from the Format drop-down list
  • Or select any other file format you need (e.g. XML, HTML, *.doc)

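Note: This also works with documents that were scanned and saved as PDF, provided the scan contains recognized (OCR) text. An image-only scan has no text to extract.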

Depending on the document’s security settings, you may or may not be able to extract text from a .pdf. Sometimes, authors encrypt documents in Acrobat to prevent users from copying and pasting text. In order to change the security settings in an existing document:

  1. Open the document (in Acrobat 6.0) and enter the appropriate password, if required
  2. Then go to File|Document Security and choose Change Settings
  3. Select 128-Bit RC4 (in Acrobat 6.0) from the Encryption Level menu
  4. Click on Enable Content Access for the Visually Impaired.
  5. Select any other security options and click OK
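Behind these settings, the PDF format records what a reader application may do as a set of permission flags. The short Python sketch below shows how two of those flags relate to text extraction. It assumes the bit numbering used by the PDF specification's standard security handler (bit 5 for copying or extracting text, bit 10 for extraction by accessibility tools); the function names are hypothetical, and the sketch does not read a PDF file itself, so the flag value has to be supplied by hand.

```python
# Permission bits, numbered as in the PDF specification (bit 1 = lowest bit).
COPY_BIT = 5            # copy or extract text and graphics
ACCESSIBILITY_BIT = 10  # extract text for accessibility tools (e.g. screen readers)


def has_permission(p_value: int, bit: int) -> bool:
    """Return True if the 1-based permission bit is set in the flag value."""
    # The flags are stored as a signed 32-bit integer, so mask to 32 bits first.
    unsigned = p_value & 0xFFFFFFFF
    return bool((unsigned >> (bit - 1)) & 1)


def describe_text_access(p_value: int) -> dict:
    """Summarise which kinds of text extraction the flag value permits."""
    return {
        "copy": has_permission(p_value, COPY_BIT),
        "accessibility": has_permission(p_value, ACCESSIBILITY_BIT),
    }


if __name__ == "__main__":
    # Illustrative value: copying disabled, accessibility access enabled.
    print(describe_text_access(-3392))
```

A document whose flags allow accessibility access but not copying corresponds to the case described in step 4: assistive software can still read the text even though copy and paste is blocked.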

There are tools that convert PDF into Word files without changing the layout (see e.g. Able2Extract for conversions into Word and Excel, BCL Drake for conversions into RTF, Free PDF Text Reader to save PDF as text files, etc.); bear in mind that the results are not always optimal. For translators it is advisable to always ask the client for the original source file instead of trying to convert the .pdf file.
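Text extraction can also be scripted. As a minimal sketch, the Python code below calls pdftotext, a command-line tool distributed with Poppler and Xpdf. This tool is not one of those named above, and the sketch assumes it is installed and on the system path; the function names and the file name report.pdf are hypothetical. As with the other tools, the result may need manual clean-up.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path: str, keep_layout: bool = False) -> list:
    """Build the pdftotext command line for one PDF file."""
    pdf = Path(pdf_path)
    txt = pdf.with_suffix(".txt")  # report.pdf -> report.txt
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout.
        command.append("-layout")
    command += [str(pdf), str(txt)]
    return command


def convert_pdf_to_text(pdf_path: str, keep_layout: bool = False) -> Path:
    """Run pdftotext and return the path of the text file it writes."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on this system")
    command = build_pdftotext_command(pdf_path, keep_layout)
    subprocess.run(command, check=True)  # raises if the conversion fails
    return Path(command[-1])


if __name__ == "__main__":
    # Only prints the command; call convert_pdf_to_text to run it.
    print(build_pdftotext_command("report.pdf", keep_layout=True))
```

If the document's security settings forbid text extraction, the tool may refuse to convert it, in which case asking the client for the source file remains the better route.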

Why is this information important for translators and translation teachers?
Since *.pdf is one of the most commonly used formats for "protecting" documents that are published on the Internet or sent to translators by email, translators need to know how to save such documents without the formatting information (that is, as text files) so that they can later save them in whatever format they need.