Best common file format to use to create PDFs?

 
30-05-2006, 04:30 PM   #1
Zak


> "Zak" <duff@nomail.invalid> wrote in message
>>
>>I use XP.
>>
>> I scan a document to a graphics file (eg jpg, gif, bmp, etc)
>> Then I make a PDF from the graphics file.
>>
>> I see that if I save a large bitmap (BMP) and then create a PDF
>> then the PDF is no bigger than if I had compressed the image to a
>> GIF and used that to create a PDF file.
>>
>> In order to preserve quality, avoid artefacts and interference
>> patterns, would it be better to save to a large and detailed
>> intermediate file like a BMP (or even a jpeg) or to save to a
>> small lossless file like a GIF or TIF?



On 26 May 2006, Nils<bla@bla.com> wrote:
>
> Afaik, GIF is not used as a graphical format inside PDF. It is
> probably compressed as a "TIFF" with CCITT fax or LZW compression.
> "TIFF" is in quotes because the embedded stream is not a complete
> TIFF file; it just contains the compressed graphics and some extra
> information like scan size, color depth, color channels, etc.
>
> What happens under the hood in your PDF creation really depends on
> the PDF engine you're using. Many engines actually resize your
> graphics to match the PDF DPI resolution. If you're an experienced
> programmer you could try to generate the PDF yourself, with the
> images in full resolution. The PDF specification is open and can be
> found on the Adobe website.
>
> Nils



Hi Nils and others. I understand now that when I create a PDF from an
image file that the format of the image file is not used inside the
PDF. Instead some other format is used in the PDF (which Nils kindly
suggests may be a specialized form of TIFF).

It is this conversion from my image file format to the internal PDF
format which I want to be done smoothly. I am on XP and I am
wondering if it is better to start with a GIF or a JPG or BMP or
whatever to feed into my PDF creation utility.

I should say that I am starting with a hard copy of a document
created on a word processor. I want to avoid artefacts, unnecessarily
jagged lines, moire effects and all that stuff which might come from
transforming from an "awkward" image format to a PDF.

My PDFs will be for public distribution. I have preferred to scan to
a GIF file rather than a TIFF because I have assumed that when I
circulate the basic image file among certain people that the best
balance between image size and the best chance of them being able to
see the file is a GIF.

To me TIFF feels a bit specialized. For example, I never see a web
page with TIFF images but I see lots of pages with GIFs.

Also there seem to be various compression options for a TIFF (group 3
or 4, LZW, JPEG, deflate, none) which might make it even harder for
me to know what to choose as a common format! Wikipedia says
documents are often scanned to TIFF group 4 but is that something
which has the best chance of being seen on various PCs in various
organisations that I might need to send it to?

30-05-2006, 05:05 PM   #2
CSM1

"Zak" <duff@nomail.invalid> wrote in message
news:Xns97D3A7EDC887F64A18E@127.0.0.1...
> I should say that I am starting with a hard copy of a document
> created on a word processor. I want to avoid artefacts, unnecessarily
> jagged lines, moire effects and all that stuff which might come from
> transforming from an "awkward" image format to a PDF.


You can create a very clean PDF directly from a Microsoft Word Document
(.doc).
There are programs that act like a printer that creates a PDF, just by
"printing a PDF".

PDF Create! is one such program.

Just search Google for "microsoft word print pdf" without the quotes.
You will get lots of responses.

--
CSM1
http://www.carlmcmillan.com
--




30-05-2006, 05:13 PM   #3
Zak

On 30 May 2006, CSM1<nomoremail@nomail.com> wrote:

> You can create a very clean PDF directly from a Microsoft Word
> Document (.doc).
> There are programs that act like a printer that creates a PDF, just
> by "printing a PDF".
>
> PDF Create! is one such program.
>
> Just search Google for "microsoft word print pdf" without the
> quotes. You will get lots of responses.



The documents are not written by me. They have been sent to me so they
are in hard copy form and need scanning.

30-05-2006, 06:14 PM   #4
John V-Tracker

Why not let a program like PDF-Tools take care of the problems for you?
It will scan direct to PDF without the intermediate image step; all
you need to do is make your decisions regarding
optimisation/compression.

You can try it here within the PDF-XChange PRO package (not the Standard
or Lite versions). Until licensed, you will get demo watermarks in the
top right/left corner of each page, which add about 4 KB to each page.

http://www.docu-track.com/downloads/users/

--
Best Regards

John Verbeeten
Tracker Software Products
PDF-XChange & SDK, Image-XChange SDK,
PDF-Tools & SDK, TIFF-XChange & SDK, DocuTrack.
Email : johnV@docu-track.com
Support: http://www.docu-track.com/forum/index.php
Web site : http://www.docu-track.com

30-05-2006, 06:40 PM   #5
AES

In article <Xns97D3AF497897B64A18E@127.0.0.1>,
Zak <duff@nomail.invalid> wrote:

>
> The documents are not written by me. They have been sent to me so they
> are in hard copy form and need scanning.


If the documents are in single-sheet form and can be fed thru a
sheet-feed scanner, the fairly new Fujitsu "ScanSnap" can automatically
produce PDF output (or other formats, at user's option).

It's a bit pricey (circa $400) but it's a pretty nice unit, small, fast,
easy to use, can do both sides at once, auto-select for B&W or color,
and so on.

30-05-2006, 06:41 PM   #6
Dances With Crows

["Followup-To:" header set to comp.periphs.scanners.]
On Tue, 30 May 2006 16:30:29 +0100, Zak staggered into the Black Sun and
said:
>> "Zak" <duff@nomail.invalid> wrote in message

> On 26 May 2006, Nils<bla@bla.com> wrote:
>>> In order to preserve quality, avoid artefacts and interference
>>> patterns, would it be better to save to a large and detailed
>>> intermediate file like a BMP (or even a jpeg) or to save to a small
>>> lossless file like a GIF or TIF?

>> Afaik, GIF is not used as a graphical format inside PDF. It is
>> probably compressed as a "TIFF" with CCITT fax or LZW compression.
>> "TIFF" is in quotes because the embedded stream is not a complete
>> TIFF file.
>>
>> What happens under the hood in your PDF creation really depends on
>> the PDF engine you're using. Many engines actually resize your
>> graphics to match the PDF DPI resolution. If you're an experienced
>> programmer you could try to generate the PDF yourself


Zak is obviously not a programmer, let alone an experienced one. Using
PDFlib from C isn't that difficult if you have some experience in C,
though.

>> The PDF specification is open and can be found on the Adobe website.


And creating a PDF using only the spec would take a bunch of experienced
programmers a while. The PDF spec is really, really complex. Its
complexity is one reason why PDFlib and ps2pdf and OpenOffice's "print
to PDF" functionality exist.

> I understand now that when I create a PDF from an image file that the
> format of the image file is not used inside the PDF. Instead some
> other format is used in the PDF (which Nils kindly suggests may be a
> specialized form of TIFF).


Using tiff2ps -> ps2pdf says that a grayscale TIFF ends up converted to
a stream object that can be decoded by the FlateDecode filter.
YPDFEngineMV, obviously.
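
For anyone curious what such a stream object looks like, here is a rough Python sketch (not what tiff2ps or ps2pdf actually emit) that wraps 8-bit grayscale samples in a PDF image XObject using zlib, the compression that FlateDecode reverses. The helper name `image_xobject` and the tiny 4x2 "scan" are made up for illustration, and the object is not a complete PDF on its own.

```python
import zlib


def image_xobject(width, height, gray_pixels):
    """Build a PDF image XObject (dictionary + stream) for 8-bit grayscale data."""
    if len(gray_pixels) != width * height:
        raise ValueError("expected one byte per pixel")
    # zlib (deflate) output is what the FlateDecode filter decodes.
    compressed = zlib.compress(gray_pixels, 9)
    header = (
        "<< /Type /XObject /Subtype /Image "
        f"/Width {width} /Height {height} "
        "/ColorSpace /DeviceGray /BitsPerComponent 8 "
        f"/Filter /FlateDecode /Length {len(compressed)} >>\n"
    ).encode("ascii")
    return header + b"stream\n" + compressed + b"\nendstream\n"


# Hypothetical 4x2 "scan": white page with two dark pixels.
pixels = bytes([255, 255, 0, 255,
                255, 0, 255, 255])
obj = image_xobject(4, 2, pixels)

# Recover the samples the way a reader would: take the stream body and inflate it.
body = obj.split(b"stream\n", 1)[1].rsplit(b"\nendstream", 1)[0]
restored = zlib.decompress(body)
print(restored == pixels)  # True: the round trip is lossless
```

Because Flate is lossless, the decompressed bytes match the scanned samples exactly, whatever file format the scan was saved in first.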

> It is this conversion from my image file format to the internal PDF
> format which I want to be done smoothly. I am wondering if it is
> better to start with a GIF or a JPG or BMP or whatever to feed into my
> PDF creation utility.


Depends on what you want. Get a good scan, and convert it to
black-and-white if you can do that without losing important info;
that'll make the PDF smaller. JPEG may introduce artifacts, so you
probably don't want to use that. TIFF G4 and TIFF LZW are lossless, so
you may want to use those.
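
To make the black-and-white advice concrete, here is a rough Python sketch of thresholding 8-bit grayscale samples down to 1 bit per pixel, which is the kind of data Group 4 compression works on. The function name `to_bilevel`, the threshold of 128 and the 1 = black convention are assumptions for illustration; a scanner driver or image editor normally does this for you, and the G4 encoding itself is not shown.

```python
def to_bilevel(gray_rows, threshold=128):
    """Pack 8-bit grayscale rows into 1-bit rows (1 = black, an assumed convention).

    Each output row is padded with zero bits to a whole number of bytes.
    """
    packed_rows = []
    for row in gray_rows:
        bits = [1 if value < threshold else 0 for value in row]
        bits += [0] * (-len(bits) % 8)  # pad the row to a multiple of 8 bits
        packed = bytearray()
        for i in range(0, len(bits), 8):
            byte = 0
            for bit in bits[i:i + 8]:
                byte = (byte << 1) | bit  # most significant bit first
            packed.append(byte)
        packed_rows.append(bytes(packed))
    return packed_rows


# Hypothetical two-row scan, 10 pixels wide: dark text pixels on a light page.
gray = [
    [250, 240, 30, 20, 245, 250, 251, 249, 10, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255, 255],
]
bilevel = to_bilevel(gray)
print([row.hex() for row in bilevel])  # ['3080', '0000']
```

Ten grayscale bytes per row become two packed bytes, before any compression is applied.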

> I should say that I am starting with a hard copy of a document created
> on a word processor.


Yuck. The original WordPerfect or whatever file would've been a much
better place to start from. PDFs with just text in them tend to be
smaller, display faster, and can look good at any zoom level. PDFs made
from images take a longer time to display, are larger, and look terrible
at high zoom levels.

> I have preferred to scan to a GIF file rather than a TIFF because I
> have assumed that when I circulate the basic image file among certain
> people that the best balance between image size and the best chance of
> them being able to see the file is a GIF.


? You're creating a PDF, not distributing a series of image files.

> To me TIFF feels a bit specialized. For example, I never see a web
> page with TIFF images but I see lots of pages with GIFs.


This is because of Hysterical Raisins in the history of web browsers,
and because of Unisys being asses. JPEG compresses better than TIFF-LZW
for lossy color images, and smaller images are preferred, especially when
you're on dialup. TIFF-LZW gives the best lossless compression for
color images, but TIFF-LZW is usually used where losslessness is more
important than file size (like in prepress.) Also, Unisys said they'd
sue anyone who made a TIFF-LZW compressor unless they paid Unisys a
license fee.[0] These things combined made it so that the earliest GUI
browsers didn't support viewing TIFFs, just JPEGs and GIFs. And this
has persisted to the present day... even though TIFF-G4 compresses
better than *anything* else, and does so losslessly, iff your image is
black-and-white.

> Also there seem to be various compression options for a TIFF (group 3
> or 4, LZW, JPEG, deflate, none) which might make it even harder for me
> to know what to choose as a common format! Wikipedia says
> documents are often scanned to TIFF group 4 but is that something
> which has the best chance of being seen on various PCs in various
> organisations that I might need to send it to?


....what? If somebody can't figure out how to view a Group4 TIFF,
they're probably computer-illiterate. Anyway, aren't you making a PDF
here? It doesn't matter what the original image format was if it's been
PDFed. Acrobat Reader can decode the image data within a PDF, as long
as the PDF library/PDF writer/whatever that created that PDF wasn't
smoking crack. Anyway, HTH,

[0] Fortunately, their patent (on a *mathematical method*!) expired a
couple of years ago, so all the Free stuff can write LZW now, which is a
win for everybody.

--
Matt G|There is no Darkness in eternity/But only Light too dim for us to see
Screw up your courage! You've screwed up everything else.

30-05-2006, 07:50 PM   #7
Aandi Inston

Zak <duff@nomail.invalid> wrote:

>> "Zak" <duff@nomail.invalid> wrote in message
>>>
>>>I use XP.
>>>
>>> I scan a document to a graphics file (eg jpg, gif, bmp, etc)
>>> Then I make a PDF from the graphics file.


Never use JPEG for this purpose. GIF and BMP are not the normal
choice.
>>>
>>> I see that if I save a large bitmap (BMP) and then create a PDF
>>> then the PDF is no bigger than if I had compressed the image to a
>>> GIF and used that to create a PDF file.


Yes. The image file format isn't stored in the PDF.
>>>
>>> In order to preserve quality, avoid artefacts and interference
>>> patterns, would it be better to save to a large and detailed
>>> intermediate file like a BMP (or even a jpeg) or to save to a
>>> small lossless file like a GIF or TIF?


Absolutely not JPEG. BMP has no advantage over TIFF, and GIF has
disadvantages.

I don't really follow your question. Since GIF and TIFF use lossless
compression, they preserve quality and avoid artefacts and
interference patterns, by definition.

You may have the choice of whether to use lossless compression, or
not, in making the PDF.
>
>My PDFs will be for public distribution. I have preferred to scan to
>a GIF file rather than a TIFF because I have assumed that when I
>circulate the basic image file among certain people that the best
>balance between image size and the best chance of them being able to
>see the file is a GIF.


If you are distributing the image file, that may be true. If you are
preparing the PDF file from the image file, it is not relevant at all.
>
>To me TIFF feels a bit specialized. For example, I never see a web
>page with TIFF images but I see lots of pages with GIFs.


That's because web browsers can display GIF and JPEG images as
standard, so web graphics are in those formats. That doesn't make them
in any sense "best".

TIFF is the industry standard format for document scanning, by a very
wide margin.
>
>Also there seem to be various compression options for a TIFF (group 3
>or 4, LZW, JPEG, deflate, none) which might make it even harder for
>me to know what to choose as a common format!


These options are not relevant. The PDF file doesn't include the TIFF
information, only the image from the TIFF file.
----------------------------------------
Aandi Inston quite@dial.pipex.com http://www.quite.com
Please support usenet! Post replies and follow-ups, don't e-mail them.


30-05-2006, 09:27 PM   #8
Roger

On Tue, 30 May 2006 16:30:29 +0100, Zak <duff@nomail.invalid> wrote:

>> "Zak" <duff@nomail.invalid> wrote in message
>>>
>>>I use XP.
>>>
>>> I scan a document to a graphics file (eg jpg, gif, bmp, etc)
>>> Then I make a PDF from the graphics file.


As far as I can see there really is no best common file format to
convert. If it'll convert it'll work. However the size of the
original file will have a direct bearing on the size of the pdf.

If you are doing something like creating a newsletter, flyer, or
Internet distribution then why not use the original doc file?

I handle several newsletters on line and in print.
With Adobe Pro, any Office doc (and, I believe, any WordPerfect doc) can
be converted directly to a pdf. However, any images in the documents
should be of the proper size and resolution for the end media. I've
had Word docs sent to me that had the full original images with just
the physical dimensions set. They were still the original one or two
meg images set to a dimension of 2 X 3 inches. These produced nice
looking pdfs, but of many megabytes. Having the images set to the
proper resolution (300 ppi for print and about 100 ppi for screen)
dropped the pdf to less than 100K.
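
The arithmetic behind that size difference, as a small Python sketch. It assumes the 2 x 3 inch placement mentioned above and uncompressed 24-bit RGB (3 bytes per pixel); the function names are made up for illustration, and real PDF sizes will be smaller once the images are compressed.

```python
def pixel_size(width_in, height_in, ppi):
    """Pixel dimensions needed to fill a given physical size at a given resolution."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px, height_px, bytes_per_pixel=3):
    """Raw size of the image data, assuming 3 bytes per pixel (24-bit RGB)."""
    return width_px * height_px * bytes_per_pixel


for label, ppi in (("print", 300), ("screen", 100)):
    w, h = pixel_size(2, 3, ppi)
    print(f"{label}: {w} x {h} px, {uncompressed_bytes(w, h):,} bytes raw")
```

At 300 ppi the image is 600 x 900 pixels (1,620,000 bytes raw); at 100 ppi it is 200 x 300 pixels (180,000 bytes raw), a ninefold reduction before compression.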

Also, not all pdf creators are created equal. About a year ago I tried
using OpenOffice to convert a Word doc, and it produced a pdf that was
about 3 to 4 times the size of the one from Adobe Pro. That is fine for
printed media, but may (or may not) be a royal pain in the backside
for on-line viewing.

For on-line I much prefer HTML rather than pdfs, as the HTML will be
faster to load and more compact. At least it will if it wasn't created
by saving a Word doc as HTML or using FrontPage to create it. Those
are huge. OTOH converting to a pdf is faster and much easier, and I do
use them when the pdfs are relatively small.

Roger Halstead (K8RI & ARRL life member)
(N833R, S# CD-2 Worlds oldest Debonair)
www.rogerhalstead.com

30-05-2006, 09:49 PM   #9
Aandi Inston

Roger <Delete-Invallid.stuff.groups@tm.net> wrote:

>On Tue, 30 May 2006 16:30:29 +0100, Zak <duff@nomail.invalid> wrote:
>
>>> "Zak" <duff@nomail.invalid> wrote in message
>>>>
>>>>I use XP.
>>>>
>>>> I scan a document to a graphics file (eg jpg, gif, bmp, etc)
>>>> Then I make a PDF from the graphics file.

>
>As far as I can see there really is no best common file format to
>convert. If it'll convert it'll work. However the size of the
>original file will have a direct bearing on the size of the pdf.


No. The size of the original will usually have no effect whatsoever,
though some PDF creation methods are influenced by it.

>With Adobe Pro, any Office doc (and, I believe, any WordPerfect doc) can
>be converted directly to a pdf.


With Acrobat Pro or Acrobat Standard, any file you can print can be
converted directly to a PDF.

>. Having the images set to the
>proper resolution (300 ppi for print and about 100 ppi for screen)
>dropped the pdf to less than 100K.


Or, you could use Acrobat options to reduce the resolution.
----------------------------------------
Aandi Inston quite@dial.pipex.com http://www.quite.com
Please support usenet! Post replies and follow-ups, don't e-mail them.


31-05-2006, 03:32 AM   #10
tacit

In article <Xns97D3A7EDC887F64A18E@127.0.0.1>,
Zak <duff@nomail.invalid> wrote:

> I should say that I am starting with a hard copy of a document
> created on a word processor.


Why don't you just start with the word processor file, and not with a
hardcopy at all? Go straight from the word processor file to PDF.

> I want to avoid artefacts, unnecessarily
> jagged lines, moire effects and all that stuff which might come from
> transforming from an "awkward" image format to a PDF.
>
> My PDFs will be for public distribution. I have preferred to scan to
> a GIF file rather than a TIFF because I have assumed that when I
> circulate the basic image file among certain people that the best
> balance between image size and the best chance of them being able to
> see the file is a GIF.


A GIF is almost the worst possible choice to use, because GIF images
contain a very small number of colors, and because of this they don't
tend to downsample smoothly.

Use TIFF. Anything that can read a PDF, can read a PDF, period. It does
not matter what you start with; once it is turned into a PDF, it is a
PDF. However, a TIFF image will downsample and compress smoothly.

> To me TIFF feels a bit specialized. For example, I never see a web
> page with TIFF images but I see lots of pages with GIFs.


That doesn't mean a GIF is the best format to use for general purposes,
however.

>Also there seem to be various compression options for a TIFF (group 3
>or 4, LZW, JPEG, deflate, none) which might make it even harder for
>me to know what to choose as a common format!


You do not need to choose any of these. You do not need to compress the
TIFF at all.

Scan a TIFF, make a PDF, send out the PDF, you're done. Or, better yet,
do not use your scanner at all. Start with the word processor file, make
a PDF--it'll be smaller and far higher quality.

--
Art, photography, shareware, polyamory, literature, kink:
all at http://www.xeromag.com/franklin.html
Nanohazard, Geek shirts, and more: http://www.villaintees.com