Written: 13-Jan-2001
Revised: 17-Jan-2001
Embedded Graphics and Word File Bloat
By Thiravudh Khoman
Recently, on the "Bangkok General" mailing list, Frank Lombard asked
why Microsoft Word 97 files balloon in size so dramatically when graphics
files are embedded in them or, more to the point, why they grow to
several times the size of the embedded graphic files themselves.
The most salient responses from the Bangkok General community (sorry,
I forgot who said what) suggested:
- Using "Save As" as opposed to "Save" when saving files.
- Disabling "Fast Save" in Word to prevent the saving/accumulation of
revision information.
- Saving graphic-embedded files in Adobe Acrobat format instead, given
Word's inefficiencies.
While these are good suggestions - and I tend to agree with them -
they still don't quite address the mystery of what causes those bloated
Word files. Anyway, I decided to run a few tests.
I started out with a JPEG file, a photo taken with a digital
camera. Since my digital camera is set to capture pictures at 768 x
1024 pixels, most of my saved files are in the range of 135-145 KB each.
What happens when I insert this file into Word 97 and Word 2000
respectively? See below:
| File Name | File Size (bytes) | Notes |
|---|---|---|
| 908.JPG | 142,277 | Unaltered graphic file |
| 908-97.DOC | 997,888 | Saved with Word 97 |
| 908-2K.DOC | 163,328 | Saved with Word 2000 |
Oops, I seem to have re-created the problem. But wait, Word 2000
seems to handle the situation much better than Word 97. Let's try
another file:
| File Name | File Size (bytes) | Notes |
|---|---|---|
| 912.JPG | 136,592 | Unaltered graphic file |
| 912-97.DOC | 1,112,576 | Saved with Word 97 |
| 912-2K.DOC | 157,696 | Saved with Word 2000 |
Yep, this seems to confirm the first example and the fact that Word
97 seems to be the culprit. But hold on - take a look at these:
| File Name | File Size (bytes) | Notes |
|---|---|---|
| 945.JPG | 140,961 | Unaltered graphic file |
| 945-97.DOC | 160,768 | Saved with Word 97 |
| 945-2K.DOC | 161,792 | Saved with Word 2000 |
| 947.JPG | 133,875 | Unaltered graphic file |
| 947-97.DOC | 153,600 | Saved with Word 97 |
| 947-2K.DOC | 154,624 | Saved with Word 2000 |
Oops again, what gives? Where's the bloat? But wait, I'm
going to confuse you some more. I "modified" the four graphic files
(908.JPG, 912.JPG, 945.JPG and 947.JPG) somewhat (I'll explain how
later), and here are the results I now get:
| File Name | File Size (bytes) | Notes |
|---|---|---|
| 908.JPG | 58,701 | Altered graphic file |
| 908-97.DOC | 77,824 | Saved with Word 97 |
| 908-2K.DOC | 79,360 | Saved with Word 2000 |
| 912.JPG | 108,309 | Altered graphic file |
| 912-97.DOC | 127,488 | Saved with Word 97 |
| 912-2K.DOC | 128,512 | Saved with Word 2000 |
| 945.JPG | 70,124 | Altered graphic file |
| 945-97.DOC | 89,600 | Saved with Word 97 |
| 945-2K.DOC | 90,624 | Saved with Word 2000 |
| 947.JPG | 51,789 | Altered graphic file |
| 947-97.DOC | 71,168 | Saved with Word 97 |
| 947-2K.DOC | 72,192 | Saved with Word 2000 |
Whoa! Not only is the bloat gone from BOTH Word 97 and Word 2000, but
the files are quite a bit smaller as well.
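To put numbers on the bloat, here is a small Python sketch (not part of the original tests) that computes how many times larger each Word file is than its JPEG, and how many bytes Word added. It treats the table figures as byte counts, which is an assumption, though 142,277 bytes fits the 135-145 KB range quoted earlier. The names `SIZES` and `bloat` are made up for illustration.

```python
# Sizes copied from the tables above, assumed to be bytes:
# (JPEG, Word 97 .DOC, Word 2000 .DOC)
SIZES = {
    "908 unaltered": (142277, 997888, 163328),
    "912 unaltered": (136592, 1112576, 157696),
    "945 unaltered": (140961, 160768, 161792),
    "947 unaltered": (133875, 153600, 154624),
    "908 altered": (58701, 77824, 79360),
    "912 altered": (108309, 127488, 128512),
    "945 altered": (70124, 89600, 90624),
    "947 altered": (51789, 71168, 72192),
}


def bloat(jpeg_bytes, doc_bytes):
    """Return (ratio, overhead in bytes) of a .DOC relative to its JPEG."""
    return doc_bytes / jpeg_bytes, doc_bytes - jpeg_bytes


for name, (jpg, doc97, doc2k) in SIZES.items():
    r97, o97 = bloat(jpg, doc97)
    r2k, o2k = bloat(jpg, doc2k)
    print(f"{name:14} Word 97: {r97:5.2f}x (+{o97:,})   "
          f"Word 2000: {r2k:5.2f}x (+{o2k:,})")
```

On these figures, the two bloated Word 97 files come out at roughly 7 and 8 times the size of their JPEGs, while every other document is only about 19,000 to 21,000 bytes larger than the image it contains.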
Here are the "answers" and some observations:
- Word 2000 DOES indeed handle graphic-embedded files better than Word
97. In none of the above cases did a Word 2000 file balloon several
times larger than the graphic file itself as Word 97 is wont to do.
- Word 97 does NOT ALWAYS create bloated files, as 945.JPG and 947.JPG
can attest - although it did muck up 908.JPG and 912.JPG pretty
horrendously. (Note that the four files are all pretty close in size.)
The only difference I can ascertain in these files is that 945.JPG and
947.JPG are both "lighter" (i.e. contain more "white") than 908.JPG and
912.JPG. Go figure.
- Okay, now what did I do to the graphic files in the last table?
Answer: I removed all the "metadata" embedded in these files. When I
need to fiddle with my graphic files, I usually reach for ACDSee v3.1
from ACD Systems (https://www.acdsystems.com). When manipulating files
(cropping, reducing, enhancing, or even do-nothing re-saving), ACDSee
ends up removing the metadata. Not only does this make the file smaller,
but it also seems to remove whatever it is that causes Word 97 to blow
up files beyond all reason.
Now, what in the world is "metadata"? Metadata seems to be embedded
data which documents how the file was created; in a way, it's similar to
ID3 tags in MP3 files. ACDSee allows you to read the metadata when you
look at a file's "Properties" (figure 1). I suspect it may be
more than that, though, since I can't imagine how removing a few tags
can affect the bloat factor so completely. But then I'm hardly an
expert in these matters.
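For the curious: a JPEG file is a sequence of marker segments, and metadata such as Exif normally sits in the APPn segments ahead of the image data. The following is a minimal Python sketch, assuming a well-formed file, that lists those segments and their sizes so you can see how much space the metadata takes. The function names and the `SAMPLE` bytes are made up for illustration; `SAMPLE` is a synthetic stand-in, not one of my camera files.

```python
import struct


def seg(marker, payload):
    """Build one segment: 0xFF, marker byte, 2-byte length, payload."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


def iter_segments(data):
    """Yield (marker, payload) for each header segment before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xDA:  # Start of Scan: compressed image data follows
            return
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def describe(marker, payload):
    """Give a segment a readable label, e.g. 'APP1 (Exif)'."""
    if 0xE0 <= marker <= 0xEF:
        tag = payload.split(b"\x00", 1)[0][:16].decode("ascii", "replace")
        return f"APP{marker - 0xE0} ({tag})"
    if marker == 0xFE:
        return "COM (comment)"
    return f"0x{marker:02X}"


# Synthetic stand-in for a camera file (hypothetical, not a decodable image)
SAMPLE = (
    b"\xff\xd8"
    + seg(0xE0, b"JFIF\x00" + bytes(9))
    + seg(0xE1, b"Exif\x00\x00" + bytes(5000))
    + seg(0xDB, bytes(65))
    + b"\xff\xda" + bytes(20) + b"\xff\xd9"
)

for marker, payload in iter_segments(SAMPLE):
    print(f"{describe(marker, payload):15} {len(payload):6,} bytes")
```

To try it on a real photo, replace `SAMPLE` with the result of `open(path, "rb").read()`. A large APP1 segment would suggest the metadata holds more than a few short tags; Exif, for instance, permits an embedded thumbnail image.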
Bottom line: Use a program like ACDSee v3.1 to remove the metadata
from the graphic files first, and then embed the files into Word.
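If you don't have ACDSee handy, the same idea can be sketched in a few lines of Python. This is not what ACDSee does internally (I have no idea how it works); it is just one simple way to drop the APP1-APP15 and comment segments from a well-formed JPEG while copying everything else unchanged. The names `strip_metadata`, `seg` and `SAMPLE` are hypothetical, and `SAMPLE` is synthetic test data rather than a real photo.

```python
import struct


def strip_metadata(data):
    """Return JPEG bytes with APP1-APP15 and COM segments removed.

    APP0 (JFIF) is kept. Caution: dropping APP2 (ICC profile) or
    APP14 (Adobe) can change how some viewers render colours.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xDA:  # Start of Scan: copy the rest untouched
            out += data[pos:]
            return bytes(out)
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        is_metadata = 0xE1 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            out += data[pos:end]
        pos = end
    raise ValueError("no Start of Scan marker found")


def seg(marker, payload):
    """Build one segment: 0xFF, marker byte, 2-byte length, payload."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic stand-in for a camera file (hypothetical, not a decodable image)
SAMPLE = (
    b"\xff\xd8"
    + seg(0xE0, b"JFIF\x00" + bytes(9))
    + seg(0xE1, b"Exif\x00\x00" + bytes(5000))
    + seg(0xFE, b"shot on a digital camera")
    + seg(0xDB, bytes(65))
    + b"\xff\xda" + bytes(20) + b"\xff\xd9"
)

clean = strip_metadata(SAMPLE)
print(len(SAMPLE), "->", len(clean), "bytes")
```

For a real file, read it with `open(path, "rb").read()` and write the result to a new file name so the original stays intact. Because the compressed image data is copied as-is, the saving equals the size of the dropped segments; a re-save in an image editor may also recompress the picture, which this sketch does not attempt.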