Tame those %*?*&%$$ PDF files


By DocteurPC | Published 12/12/2005 | Software and the Internet
Quicklink: http://dut.proz.com/doc/562
Author:
DocteurPC
Canada
English to French translator
 

PDF files (from now on, PDFs) are very useful. They let documents be published in a format that displays and prints the same in (almost) any environment. They also protect documents from malicious modifications. However, they often make our lives as translators very difficult. We (too) often end up retyping material from the original PDF before we even start translating. In fact, I regularly hear: “I hate those PDFs!”

Like most of you, I have a collection of shareware programs that purport to read, open or convert PDFs. But none is perfect, and each has its own problems; eZEE PDF, for example, cannot handle text in boxes. The latest version of Word handles PDFs (somewhat), but not everybody has this version yet. Or one can purchase Adobe Acrobat for (approx.) $100 USD.

PDF > Word from Micro Application (microapp.com or microapplication.ca) solves a large part of those problems, for only 39.95 euros or $49.95 Cdn. Once installed, it is called Solid Converter rather than PDF > Word, a name Windows could not handle, since > is a reserved character for the operating system (it is not allowed in file names, and it redirects output on the command line). Solid Converter imports, extracts and exports in both directions: PDF documents to Word, and Word documents to PDF (maybe it should really be called PDF Word!).
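To illustrate why a name such as PDF > Word cannot be used for a file or folder on Windows, here is a minimal Python sketch that lists the reserved characters found in a name. The function name `invalid_filename_chars` is my own, chosen for illustration; it is not part of Solid Converter or of Windows.

```python
# Characters that Windows does not allow in file or folder names.
WINDOWS_RESERVED_CHARS = set('<>:"/\\|?*')


def invalid_filename_chars(name: str) -> list[str]:
    """Return the reserved characters found in `name`, in order of first appearance."""
    found = []
    for ch in name:
        if ch in WINDOWS_RESERVED_CHARS and ch not in found:
            found.append(ch)
    return found


print(invalid_filename_chars("PDF > Word"))       # ['>']
print(invalid_filename_chars("Solid Converter"))  # []
```

The installed name, Solid Converter, contains none of these characters, which is presumably why the product is not installed under the name printed on the box.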

First, let’s look at the two main types of PDF files. There are text PDFs, which were saved as PDF from a text document; their content can usually be cut and pasted directly into Word, but with a total loss of formatting. There are also “graphic” PDFs, which are usually documents scanned without OCR software, so each page is an image rather than text.

Most of the time, we end up with both Adobe Reader and Word open, going back and forth between the two (are your ALT and TAB keys completely worn out by now?). It is a long, tedious and costly process. Some people prefer to print the whole thing and retype it, which is also long and tedious, and prone to typing mistakes, particularly with long documents.

Solid Converter spares you this long process, or at least shortens it substantially, while keeping the original formatting of the PDF document.

As a test, I made the software gobble up every PDF I have on this computer. Since I got it in May, that covers only the files from May to December, but it is still quite a substantial collection. First, it opened everything perfectly, and by default it went back to the previously used sub-directory (e:\translation), whereas Adobe Reader always goes back to its own (or Windows’) idea of where the documents should be (C:\Documents and settings\Georgette Blanchard\My Documents…). This may seem like a minor detail, but when you have to repeat the steps 10, 20 or 30 times a day, it gets quite frustrating.

It then offered a series of options for saving each file in Word format.

These are the options, some of which are particularly useful for translators:

Exact = as the name implies…
Flowing = also the obvious choice if the document does not include graphics, pictures or tables;
Table = this is the only software I have found that treats tables as tables and not as text;
Continuous = again, for use when the original formatting is not required;
Plain text = this makes the material easier to work with, since it creates text files that are much smaller than Word files and can be opened in Notepad (however, this option does not work with a graphic or scanned PDF).

The next option is whether to save one, several or all of the pages. This is great, because it is also a way to reduce the size of a PDF, break it into pieces and perhaps distribute it among several translators.

Furthermore, when one needs to retype some material, working one page at a time is much faster, since each file is smaller.
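For anyone who wants to plan such a split before converting, here is a small Python sketch that divides a page count into contiguous page ranges, one per translator, as evenly as possible. It only computes the ranges to enter in the page-selection option; it does not touch the PDF itself, and the function name `split_pages` is my own invention for this example.

```python
def split_pages(total_pages: int, translators: int) -> list[tuple[int, int]]:
    """Divide pages 1..total_pages into contiguous (first, last) ranges."""
    if total_pages < 1 or translators < 1:
        raise ValueError("total_pages and translators must be positive")
    # Never create more ranges than there are pages.
    translators = min(translators, total_pages)
    base, extra = divmod(total_pages, translators)
    ranges = []
    start = 1
    for i in range(translators):
        # The first `extra` translators each receive one additional page.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(split_pages(10, 3))  # [(1, 4), (5, 7), (8, 10)]
```

Contiguous ranges are used here so that each translator receives consecutive pages and keeps the context of the text.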

There are other options, such as preserving character spacing (or not), saving in Word DOC or RTF format, and opening the document after conversion. The program can also install itself into Word and IE, which means one does not have to open Solid Converter to work, but can convert directly from the Word toolbar or from Internet Explorer. All those options can be made (semi-)permanent, but you can change them when required. However, I don’t recommend installing it into IE, because it makes it harder to open all those PDF links from your own Google searches or from a link supplied by an answerer in KudoZ.

Even graphic PDFs were handled, at least partially. For example, with complex pages combining text, designs and pictures, it broke the pages down into “graphic blocks”, even breaking very large graphic titles into individual blocks of one letter each. Those blocks can then be pasted into a Word document, one at a time, for easier typing and/or translating.

Finally, Solid Converter can save the file back into a proper PDF with the original formatting, which some clients prefer. Obviously, with graphic files, it cannot keep all the original formatting.

Strangely enough, the copy I obtained at the Montreal Salon du Livre came with its booklet and documentation in French but installed in English, apparently because my Windows is in English. If you download it from the web site, I don’t know whether it will be in English, but in any case the software is so simple that any translator worth his/her salt will understand it. The only thing that may or may not be an inconvenience is that from now on, by default, any PDF will be opened with Solid Converter rather than Adobe Reader, and the program expects to convert it, unless one specifies when opening a particular file that Adobe should be used. On the other hand, because it is a much smaller program (23 MB as opposed to 61 MB for Adobe Reader version 7), the PDF opens much faster. In fact, with large files, it can open and convert the file faster than Reader can open it.

However, it seems to interfere somewhat with Antidote Prisme, my main French spelling checker: if I have used Solid Converter, I have to reboot before I can use Antidote. I’m expecting newer versions shortly, which should solve this annoyance.

At that price, if I save one hour a week, it pays for itself in the first week, not to mention the protection of my eyesight and the reduction of my frustration level. In conclusion, a very good find, whether downloaded from the Internet or bought on CD, which is available from many stores and distributors.
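The payback claim can be checked with a little arithmetic. The sketch below, in Python, computes how many weeks it takes for the time saved to cover the price. The $49.95 price comes from the article; the hourly rates are purely hypothetical values chosen for illustration, and the function name `weeks_to_break_even` is my own.

```python
import math


def weeks_to_break_even(price: float, hours_saved_per_week: float,
                        hourly_rate: float) -> int:
    """Return the number of whole weeks needed for the savings to cover the price."""
    weekly_saving = hours_saved_per_week * hourly_rate
    if weekly_saving <= 0:
        raise ValueError("hours saved and hourly rate must be positive")
    return math.ceil(price / weekly_saving)


# Price from the article; the hourly rates are hypothetical.
print(weeks_to_break_even(49.95, 1, 50.0))  # 1
print(weeks_to_break_even(49.95, 1, 25.0))  # 2
```

In other words, the software pays for itself in the first week only if an hour of your time is worth at least the purchase price; at lower rates it simply takes a few weeks longer.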



Copyright © ProZ.com, 1999-2024. All rights reserved.