Does anyone know of a PDF-to-something else utility ?

  • Thread starter: Al Dykes

Al Dykes

I want to dump a 7MB PDF file to plain ASCII text or MS Word format or
some other common format that I can convert.

This is a file created with a very recent version of Acrobat, and
it's protected against selecting any or all of the text, so I can't
copy and paste. The Copy menu item is always greyed-out.

I've downloaded a couple of shareware PDF converters but they say
"invalid format", which I guess means they haven't been updated to the
current spec.

OT: If anyone knows any Linux software to do this I can use that too.
 
If you have a scanner, you could scan it and use the OCR
software to get text. Or spend $600 and buy the full
Acrobat program.


 
If you have a scanner, you could scan it and use the OCR
software to get text. Or spend $600 and buy the full
Acrobat program.


Did you notice I said it's a 7MB pdf, all text. It's thousands of
pages. :-( Thanks
 
Al said:
Did you notice I said it's a 7MB pdf, all text. It's thousands of
pages. :-( Thanks

Yes, and it's copy-protected. Why not contact the people who wrote the
document and ask for permission to copy it (or whatever your end goal
with the document is)? I'm sorry, but I don't know of any way to break
the copy protection on the file.

Malke
 
Linux! Gore! and now pdf.... you do need help.<G>

I've got PDF Tools which may be able to extract the text. 'Don't know if it
can get by password protection but I'm willing to try.

You may send me the file and I'll give it a try. You may want to ZIP the
file if you can.

My real e-mail address is:

dschmidtATpacifierDOTcom <== change the obvious
 
Open the PDF file in Acrobat Reader and choose File... Save as Text. Or...
click on Edit...Copy file to clipboard. Now paste it in Word, Wordpad, etc.
(sometimes you lose the clipboard copy when you later open Word, etc so open
the program you wish to paste to before you Copy file to clipboard in
Acrobat.)
 
Linux! Gore! and now pdf.... you do need help.<G>

I've got PDF Tools which may be able to extract the text. 'Don't know if it
can get by password protection but I'm willing to try.


It's copy/paste protected, not password protected.

It's a couple thousand pages of text and I just want to clip the
occasional sentence out for a citation. And it's text, not Acrobat'd
images.



 
Open the PDF file in Acrobat Reader and choose File... Save as Text. Or...
click on Edit...Copy file to clipboard. Now paste it in Word, Wordpad, etc.
(sometimes you lose the clipboard copy when you later open Word, etc so open
the program you wish to paste to before you Copy file to clipboard in
Acrobat.)



Copy is greyed out in all cases. So is save as text.

It's NOT password protected.

If any of you want to try it, d/l the 9-11 Commission Report (7MB)
from here. This is what I want to clip citations from.

http://www.9-11commission.gov/report/911Report.pdf

Their concern is legit. If I could edit it and make a new PDF it
would allow bogus versions to circulate. If I copy it to flat ascii
there is no way I could recreate the pagination, etc in a new PDF.
 
Including Google's cache of the PDF (ie View as HTML as it's called for PDFs)?

Are you allowed to print? If so print to the Text Only printer and send to File.
 
I found it at Google and can View As HTML. But IE struggles with it. It also just deleted everything I typed.

BAN FU*KING PDFS (I've had this problem before).

The collapse of the World Trade Towers (WTC), in the aftermath of the 9-11 terrorist attacks, is one of the most catastrophic political, economic, and social disasters to ever occur in America. Ramifications from the collapse are both far reaching, and long lasting, and are expected to have an impact on the lives of the citizens of the United States, and indeed of the world, for the foreseeable future. Sadly, the death of 2654 [1] innocent individuals in Manhattan was celebrated as a great victory by supporters of Al Qaeda around the world. There is little doubt that further innocent deaths related to 9-11 would add to the prestige and influence of Al Qaeda, increasing their ability to recruit new martyrs and to continue and even expand their attacks on the West. It is imperative that we prevent further deaths related to 9-11 from happening. But are future deaths and injuries likely as a result of the terrorist attack of 9-11? Can we establish that contamination from the dust of the collapse of the towers might be the cause of future health problems among residents and rescue workers?

We know that the gray cloud which covered lower Manhattan on 9-11, and the immediate aftermath, contained a variety of substances that are usually injurious to human health. These included traces of glass shards, asbestos, fiberglass, pulverized concrete, lead, mercury, cadmium, dioxins, PCB’s, polycyclic aromatic hydrocarbons as well as benzene. [2] Significant media attention has already been directed at the short term health impact of exposure to this concoction. It now appears that many first responders, cleanup crews and local residents have fallen ill with a variety of

[1] “The World Trade Center”, Sad News Dot Net Weekly
[2] “Fallout”, Jennifer Senior, New Yorker Magazine, September 20, 2004, P 4

Figure 2: The 9-11 dust cloud advances – Photo Bill Biggart © 2001
 
Including Google's cache of the PDF (ie View as HTML as it's called for
PDFs)?

Are you allowed to print? If so print to the Text Only printer and send
to File.

Yea, I'll just install an ASCII printer. I've done that before.

Thanks
 
My computer does the work, not me. :-)
After the conversion you could find what you want with a find.
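
For what it's worth, once the text is in a plain file, the "find" step could be scripted. This is only a rough sketch in Python, not anything tied to a particular converter: the function name `find_phrase` and the file name in the usage comment are made up for illustration, and it assumes the converted text is an ordinary text file with one line of the page per line.

```python
import sys
from pathlib import Path


def find_phrase(text, phrase, context=1):
    """Return (line_number, snippet) pairs for lines containing phrase.

    Matching is case-insensitive. `context` is the number of neighbouring
    lines included on each side of a hit, so the snippet can be clipped
    for a citation.
    """
    lines = text.splitlines()
    needle = phrase.lower()
    hits = []
    for i, line in enumerate(lines):
        if needle in line.lower():
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            hits.append((i + 1, "\n".join(lines[start:end])))
    return hits


# Hypothetical usage: python find_phrase.py report.txt "some phrase"
if __name__ == "__main__" and len(sys.argv) == 3:
    path, phrase = sys.argv[1], sys.argv[2]
    # errors="replace" keeps going if the converter left odd bytes behind
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    for number, snippet in find_phrase(text, phrase):
        print(f"--- line {number} ---")
        print(snippet)
```

One known limit of this sketch: it matches line by line, so a phrase that the converter split across two lines will not be found unless you search for a shorter piece of it.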

It would be a good test for the pdf program I brag about, PDF-XChange.

'Still willing to give it a try.


--
Don
Vancouver, USA


 

If you just want to copy, not modify, and an image copy would work then you
can:

Use Alt-Print Screen to copy the open window to the clipboard, paste it
into an image editing program, and crop the image to what you want.

Use the Microsoft OneNote 2003 Screen Clipping feature. You just start
screen clipping and use your mouse to highlight the text you want. I checked
this function on your link and it did work on this document. Note: the screen
clip results in a GIF image.

I suspect {but I am not sure} that the problem you are having in converting
the 911.pdf file to Word is a limitation of the converter you are using or
perhaps of the version of Word you are using. I just converted 585 pages
of my {not the one you linked to} 911.pdf file to Word 2003 using Scansoft
PDF Converter 3 with no problem. A quick check {not all pages} indicates an
accurate copy.

Don
 
Linux! Gore! and now pdf.... you do need help.<G>

I've got PDF Tools which may be able to extract the text. 'Don't know if it
can get by password protection but I'm willing to try.

You may send me the file and I'll give it a try. You may want to ZIP the
file if you can.

My real e-mail address is:

dschmidtATpacifierDOTcom <== change the obvious

Look for an email from me. Thanks
 
My computer does the work, not me. :-)
After the conversion you could find what you want with a find.

It would be a good test for the pdf program I brag about, PDF-XChange.

'Still willing to give it a try.


I can report that Don's software was able to extract the ASCII in my
PDF when no other tool I demo'd could. It also extracted the embedded
images into separate files, which I have not looked at yet.

Thanks for the test drive. His web site describes his software.

http://www.docu-track.com/

I have no association with Don other than taking him up
on his offer to test it against my pdf file.
 
