Print to PDF in Firefox creates non-searchable PDF - please help

Adam

Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox


The following was originally posted to "mozilla.support.firefox" ...

Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/mo...=section&OfferingID=1532219&SectionID=5270686

The problem does not occur with IE but I prefer to find a fix for Firefox.

Any ideas?
 
Adam

Ghostrider said:
Unless the user directs the Adobe PDF printer to send the
*.pdf file to a specific folder, it would end up in a default
WinXP folder in Documents and Settings. I had no problem in
printing the link as a file to a location of my choice and
then finding it.

GR

Are you able to "search" for text in the newly generated PDF?
 
Paul

This is what I see.

http://img696.imageshack.us/img696/2702/searchable.gif

Method:

1) Firefox print to PostScript.
2) Toss PostScript in Acrobat Distiller.
3) Open in Reader. Search for the word "fundamental" and it is located.

The file did have some image content, but I think that's the logo that
was on the first page.

I can't see a good reason, for a print of a web page, to have the
"do not copy" security setting in PDF enabled. The file can be
virtually untouchable, if all the security flags are enabled.
Check the Distiller settings, and see if something in there has
broken loose. Check while viewing the document in Reader, and
see what security settings in there are properties of the
document.
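
If you'd rather not eyeball it in Reader, a rough way to tell a picture-only PDF from one with real text is to check whether the file declares any fonts. Below is a minimal Python sketch, standard library only. The function name `declares_fonts` is my own, and the check is only a heuristic: no `/Font` entry strongly suggests the pages are pure images, while a font being present does not by itself prove the text is searchable.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def declares_fonts(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the PDF data mentions a /Font resource anywhere."""
    # Resource dictionaries are usually stored uncompressed.
    if b"/Font" in pdf_bytes:
        return True
    # Newer PDFs may pack dictionaries into Flate-compressed object streams,
    # so try to inflate every stream and look inside it as well.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (e.g. a JPEG image), skip it
        if b"/Font" in data:
            return True
    return False
```

To try it on a real file: `declares_fonts(open("out.pdf", "rb").read())`, where `out.pdf` is a placeholder for whatever name you gave the print job.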

Paul
 
Adam

Thanks (Guru Paul), but I am still not able to search for text in
the PDF generated from the PS file in Distiller. I get the "crosshair"
cursor when the cursor is positioned over the generated PDF.

Here's the Adobe PDF Document Properties (from Print Properties) ...
http://img38.imageshack.us/img38/2650/adobepdfdocumentpropert.gif

Here's the Adobe PDF Settings (from Distiller) ...
http://img838.imageshack.us/img838/2882/adobepdfsettings.gif

Here's the Adobe PDF Security (from Distiller) ...
http://img820.imageshack.us/img820/7695/adobepdfsecurity.gif

Where do you see "do not copy" security setting in PDF enabled?

Also, I wonder if any of the following has to do with my troubles ...
=====================================================================
The ANSI University outreach program is now being distributed through the
ANSI Site License Portal.
The following components are required to access your documents:
1. Internet Access - http://slportal.ansi.org/
2. Adobe Reader 5.0 or newer - http://get.adobe.com/reader/
3. FileOpen DRM Plug-in for Acrobat - http://plugin.fileopen.com/
=====================================================================
 
David H. Lipman

Bill in Co said:
But WHY would one want to?
Why would someone want to change something that is already fine as it is?? To change,
for change's sake? Thanks, but no thanks. :)

Sounds like the same logic as going to the newer versions of Office. More bloat, and
more useless features.

Because when you print from FF v11 to PDFCreator or Adobe Printer the PDF is searchable as
Adam requires.
 
Paul

David said:
To add to Adam's question, I will state my findings previously provided
in his initial query in the Mozilla Firefox news group.

If printed to Adobe Professional v9.5.0 or PDFCreator or to a PostScript
file and distilled to a PDF from Firefox v3.6.28 the PDF is rendered as
a graphic and is not searchable.

If printed to PDFCreator from Firefox v11 the PDF is searchable.

I believe this to be a FF v3.6.28 rendering issue.

After investigating a bit, it seems Firefox v3 switched to cairographics.
When Cairo runs into situations it cannot handle with simple primitives
(letter uses letter primitive, line uses line primitive, a straight mapping),
it uses bitmap rendering as a fallback. If you get a solid image going through
a PDF printer output, it could be something like that. Purely a guess, as
I can't really see in this situation, how Cairo would help. You'd be
doing something like HTML ---> Cairo ---> GDI??? ---> AdobePDFprinter ---> PDF.
I don't see how Cairo really helps in a major way. Must be missing the point.
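
To make the guess above concrete, here is a toy model in Python of that kind of fallback. It is not Cairo's API: `Op`, `NATIVE_KINDS`, `render_page` and `is_searchable` are invented names, and the real library may well fall back for only part of a page rather than all of it.

```python
from dataclasses import dataclass


@dataclass
class Op:
    kind: str      # e.g. "text", "line", or something exotic like "blend"
    payload: str   # the string to draw, or a description of the shape


# Hypothetical set of operations this toy backend can emit as primitives.
NATIVE_KINDS = {"text", "line"}


def render_page(ops):
    """Emit native primitives if every op is supported, else one bitmap."""
    if all(op.kind in NATIVE_KINDS for op in ops):
        # Straight mapping: letter uses letter primitive, line uses line.
        return [("native", op.kind, op.payload) for op in ops]
    # Fallback: the page is flattened to pixels, so text stops being text.
    return [("bitmap", "image", "<rasterized page>")]


def is_searchable(output):
    """A page is searchable only if some text survived as a text primitive."""
    return any(kind == "text" for _, kind, _ in output)
```

With this model, a page made of `Op("text", ...)` and `Op("line", ...)` stays searchable, while adding a single `Op("blend", ...)` turns the whole output into one image, which matches the symptom of a PDF that shows only a crosshair cursor.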

The workaround is to try the "PrintPDF" add-on, which did yield a searchable
PDF for the ucsc-extension.edu web page. After installation and a restart
of Firefox, it adds File : Print To PDF to the Firefox menu.

https://addons.mozilla.org/en-US/firefox/addon/printpdf/?src=api

"PrintPDF 0.76 by Pavlov"

I actually built up (compiled from source) v3.6.28 in Visual C++ 2005 Express
on a Win2K virtual machine. (I was using that for debugging.) When I
added "PrintPDF 0.76 by Pavlov", I was seeing an Assert failure ("float
manager state") from v3.6.28. I don't expect that affects the real Firefox,
but it was curious nonetheless. I think that add-on is still worth a try. My
debug build emits output into an MSDOS-like window as it runs, and that
plus the debugger in the IDE is what I was using to watch how it works.
Too damn complicated to figure out how it works with a debugger though
(like, how the print architecture actually works, instead of my guess).

If anyone else wants to make a debug build of Firefox, this is the
"mozconfig" file I used in the mozilla-1.9.2 folder. The disable-ipc
was added to stop the build from breaking, in some code hooks for
connecting a debugger and collecting a stack trace of some sort.
The parallel compilation is set to "j1" since my virtual environment
only has one computing core.

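# Put build output in a per-platform object directory
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-@CONFIG_GUESS@
# Single make job, since the VM has only one core
mk_add_options MOZ_MAKE_FLAGS="-j1"
ac_add_options --with-windows-version=502
# Debug build of the browser application
ac_add_options --enable-debug
ac_add_options --enable-application=browser
# Added to stop the build from breaking in the debugger/stack-trace hooks
ac_add_options --disable-ipc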

http://img16.imageshack.us/img16/1456/v3628running.gif

Have fun,
Paul
 
