Utility Harvest http links

Ardent

X-No-Archive: yes

Hi all

Thanks to acf I chanced on Textbomb.exe that does a wonderful job of
making a list of all the e-mail ids in a file.

I have several files, each containing a number of http links along with
other information, and I have to cut and paste each link into a separate
file to get at all the links. It is quite laborious. :-(

Is there a utility similar to Textbomb that can collect the http
links?

Thanks for your time and attention.
 
Susan Bugher

Ardent said:
I have several files, each containing a number of http links along with
other information, and I have to cut and paste each link into a separate
file to get at all the links. It is quite laborious. :-(

Is there a utility similar to Textbomb that can collect the http
links?

I use JetLinks to extract links.

Program: JetLinks
Author: Dahlhoff IT-Solutions
Ware: (Freeware)
http://www.manfred-dahlhoff.de/
http://www.dahlhoff-it.com/resources/jlsetup-1.2.exe
Version 1.2.0.5 of JetLinks. Includes English and German program languages
(2555 KB)

more apps here:
http://www.pricelesswarehome.org/acf/P_INTERNET.php#4.91URL:Extract

Susan
--
Posted to alt.comp.freeware
Search alt.comp.freeware (or read it online):
http://www.google.com/advanced_group_search?q=+group:alt.comp.freeware
Pricelessware & ACF: http://www.pricelesswarehome.org
Pricelessware: http://www.pricelessware.org (not maintained)
 
Ardent

X-No-Archive: yes

I use JetLinks to extract links.

Thanks Susan

Unfortunately, all these programs extract links only from web pages and
bookmark files. What I need is to get them out of a file that has the
links as well as a lot of binary data.
 
Susan Bugher

Unfortunately, all these programs extract links only from web pages and
bookmark files. What I need is to get them out of a file that has the
links as well as a lot of binary data.

Try saving a copy of your file as plain text or copy the file to the
clipboard. JetLinks can import URLs from text files or monitor the
clipboard for URLs. From the JetLinks help file:

<q>
You can monitor the clipboard for URLs by setting the Monitor clipboard
for URLs option in the Options dialog. JetLinks will automatically
detect if there is any text on the clipboard that looks like a URL and
show a dialog containing the URLs found.
</q>

<q>
Import URLs from text file
This function can import URLs from any text file. Suppose you have a
text file containing the following (e.g. from a mail message):

<SNIP>

Now, there are several URLs somewhere in this text. They are even in
different format, some having the http:// (or ftp://) prefix, some not.
Using this function on the text file will show the following results dialog:

As you can see, all the different formats of URLs in the text have been
recognized and even the mailto-link in the message's footer is listed.
Now you can bookmark the selected URLs. You'll have to change the titles
manually, though, because the import function can't find any titles in a
usual text file, of course. The import action will be preselected
according to several criteria, please read the detailed explanation in
the Grab URLs topic for more information on this.
</q>
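
For anyone who wants to try the same idea without JetLinks, here is a rough Python sketch of pulling URLs out of plain text, including ones without an http:// prefix and mailto links. It is only an illustration: the patterns and the function name `import_urls` are my own assumptions, not how JetLinks actually works, and the sample message is made up.

```python
import re

# Hypothetical patterns; JetLinks' real matching rules are not described here.
URL = re.compile(
    r'(?:(?:https?|ftp)://|www\.)[^\s<>"]+'   # with a scheme, or a bare www. host
    r'|mailto:[^\s<>"]+'                      # mailto links
)

def import_urls(text):
    """Collect URL-like strings from plain text, adding http:// to bare www. hosts."""
    urls = []
    for found in URL.findall(text):
        found = found.rstrip('.,;:)')         # drop trailing sentence punctuation
        if found.startswith('www.'):
            found = 'http://' + found
        urls.append(found)
    return urls

# Made-up sample text, similar in spirit to a mail message.
message = (
    'See http://example.com/page, or www.example.org.\n'
    'Files: ftp://example.net/pub/ -- contact mailto:someone@example.com'
)
print(import_urls(message))
```

As with JetLinks, a plain text file carries no titles, so a sketch like this can only return the addresses themselves.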

Susan
 
w4tch3r

... What I need is to get them out of a file that has the
links as well as a lot of binary data.

This works for me every time with text files. It may also work for binary
files, depending on how well you can define the character sets used in
the links.

1. Get Regex Power! from http://www.ware4u.de/regexpower/download.html
2. Using Regex Power! open your file
3. Use this regex search: http[^ "]+
   Within the search dialog, select "create new file".
   Use the "List Matchings" button instead of the "Search" button.

This will open a new window with all your links inside. I have assumed
that the links begin with http and end with either a space or a ".
Using regular expressions is complicated but powerful. Here's an
alternate regex search that I use all the time on files. It picks out
http/ftp links for all exe/zip/rar/gz.

regex search:
[fht]#tp:.#{\.exe}|{\.zip}|{\.rar}|{\.gz}
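
For readers without Regex Power!, the intent of this pattern (http/ftp links ending in .exe, .zip, .rar or .gz) can be approximated with Python's re module. This is a sketch rather than a verified translation: it assumes that `#` means "one or more, shortest match" and that `{...}` is grouping in Regex Power!'s syntax. The name `find_archive_links` and the sample URLs are made up.

```python
import re

# Approximate Python equivalent of the Regex Power! pattern above.
# Assumption: links contain no whitespace or double quotes.
ARCHIVE_LINK = re.compile(r'(?:https?|ftp)://[^\s"]+?\.(?:exe|zip|rar|gz)\b')

def find_archive_links(text):
    """Return all http/ftp links that end in .exe, .zip, .rar or .gz."""
    return ARCHIVE_LINK.findall(text)

# Made-up sample text with two download links and one ordinary page link.
sample = (
    'Get it at http://example.com/tools/setup.exe or '
    'ftp://example.org/pub/archive.tar.gz (docs: http://example.com/readme.html)'
)
links = find_archive_links(sample)
print(links)
```

The ordinary .html link is skipped because its extension is not in the list.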

This one's a bit more generic:
[fht]#tp://[a-zA-Z0-9_\.\/\-]+
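
Since the original question was about files that mix links with binary data, here is a minimal Python sketch of the same approach applied to raw bytes. It is only an illustration: the function name `extract_links` and the file name `mixed.dat` are hypothetical, and the character class is taken from the generic pattern above, so links containing other characters (such as `?` or `=`) will be cut short.

```python
import re

# Byte-string version of the generic pattern: http:// or ftp:// followed by
# letters, digits, underscore, dot, slash or hyphen.
LINK = re.compile(rb'(?:http|ftp)://[a-zA-Z0-9_./-]+')

def extract_links(data):
    """Return the links found in a bytes object, in order, without duplicates."""
    seen = []
    for match in LINK.findall(data):
        url = match.decode('ascii')
        if url not in seen:
            seen.append(url)
    return seen

# Made-up sample: links surrounded by non-text bytes, one of them repeated.
blob = (b'\x00\x01http://example.com/a.zip\xff\xfe'
        b'ftp://example.org/pub/\x00http://example.com/a.zip')
print(extract_links(blob))

# For a real file (hypothetical name), read it in binary mode:
# with open('mixed.dat', 'rb') as f:
#     print('\n'.join(extract_links(f.read())))
```

One caveat: if the byte right after a link happens to be a letter, digit or one of the allowed symbols, it will be glued onto the end of the link, so the output may need a quick manual check.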

Regex Power! has the best free regular expression engine I have seen in
freeware (I've tested a few others also). Yet I don't recall ever
seeing it mentioned in ACF! My only complaint is that I can't get the
"^" filter working (beginning of line). Maybe a CR/CRLF thing. Most of
the site is in German (but not the Regex Power! pages); the program is in
English.

W4tch3r
 
