REQ: CHM files

omega

Susan Bugher

omega said:
Load each of these files with Mozilla.

http://www.redshift.com/~omega/2004/tmp/file01.htm
http://www.redshift.com/~omega/2004/tmp/file02.htm

Then do a File > Save As. Then look at the change.

No cheating allowed - no doing a get link retrieve like you do for a zip.
We're talking about the standard way people save pages with their browser,
using the Save As dialog.

original:

<html><head></head>
<html><body>
<a href="file01.htm">file01.htm</a>
</body></html>

saved:

<html><head></head>
<html><body>
<a href="file01.htm">file01.htm</a>
</body></html>

Susan
 
omega

Susan Bugher said:
original:

<html><head></head>
<html><body>
<a href="file01.htm">file01.htm</a>
</body></html>

saved:

<html><head></head>
<html><body>
<a href="file01.htm">file01.htm</a>
</body></html>

You say that you:

1. loaded file02.htm, so that it was displayed in your browser
2. Clicked File > Save As

???

I have never seen a browser not write in absolute paths (Mozilla 1.6 works
normally here, as did the Netscape 2-4 series I used in old days). The
browser has good reason to do the writes, too. So that when you click on
links on saved pages that happen to be from the same domain, those links
are then functional.
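
For illustration, the rewrite described here amounts to resolving each relative href against the address the page was loaded from. A minimal Python sketch of that idea, not what any particular browser actually runs; the function name `absolutize` is made up for the example, and the page URL is one of the test files mentioned earlier in the thread:

```python
from urllib.parse import urljoin

# Address the page was loaded from (one of the test files in this thread).
PAGE_URL = "http://www.redshift.com/~omega/2004/tmp/file02.htm"

def absolutize(href: str, base: str = PAGE_URL) -> str:
    """Resolve an href against the page's own URL (hypothetical helper)."""
    return urljoin(base, href)

# A relative link picks up the directory of the page it came from.
print(absolutize("file01.htm"))
# -> http://www.redshift.com/~omega/2004/tmp/file01.htm
```

A link that is already absolute comes back unchanged, so only the relative ones are affected.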
 
omega

omega said:
You say that you:

1. loaded file02.htm, so that it was displayed in your browser
2. Clicked File > Save As

???

I have never seen a browser not write in absolute paths (Mozilla 1.6 works
normally here, as did the Netscape 2-4 series I used in old days). The
browser has good reason to do the writes, too. So that when you click on
links on saved pages that happen to be from the same domain, those links
are then functional.

I think I see what you're doing: not a general save as Web Page, but a
textual option instead. Most people are going to choose the save as Web Page.
The various other types of save, such as straight HTML, vary tremendously
from browser to browser, even within a family: K-Meleon differs slightly
from Moz. IExplorer uses the one dialog, and the other browsers that host
MSIE, for instance Maxthon, sometimes add in other types of saves.

It is the Save As Web Page that is universal, as it is -normally- the
most appropriate.

If you really want to keep that advice in the readme, then there would
be some amount of work to do in getting browser-specific instructions
for saves that work the way you would plan. And not all browsers will
provide that type of save. I'd have to check, but under my immediate
suspicion in that regard -would be iexplorer, ob1, 1x browser.
 
Susan Bugher

omega said:
You say that you:

1. loaded file02.htm, so that it was displayed in your browser
2. Clicked File > Save As

???

I have never seen a browser not write in absolute paths (Mozilla 1.6 works
normally here, as did the Netscape 2-4 series I used in old days). The
browser has good reason to do the writes, too. So that when you click on
links on saved pages that happen to be from the same domain, those links
are then functional.

There must be an *option* somewhere in preferences. When I save there is
*no* change.

You sure are hard to convince. When I made the PL2003 pages way back
when, I told everyone they could save a local copy and the links would
work locally. To date you are the only person to claim that isn't so. . .

Susan
 
omega

Susan Bugher said:
There must be an *option* somewhere in preferences. When I save there is
*no* change.

Mozilla 1.6 behavior with that dialog is to default to the type of save
last selected. So if the user's last save was as "Web Page," that's
what will show up. And the other way around. IExplorer is hard-wired.
It always defaults to the Save As Web Page. (Unless somehow changed after
5.5. Seriously doubt.) That Save As dialog box, it is universal with the
whole MSIE family. Maxthon, and perhaps others hosting MSIE, add their
own additional choices elsewhere on the File menu, but nevertheless the
central one is the MSIE Save As Web Page. It is the most commonly used.

Susan Bugher said:
You sure are hard to convince. When I made the PL2003 pages way back
when, I told everyone they could save a local copy and the links would
work locally. To date you are the only person to claim that isn't so. . .

Well: How many people did the saves, and tested them, and thought about
them? Far from stats, there...
 
Susan Bugher

Look again. ;)

NS 4.78 has three options for save: HTML, text, all files. all three
options saved your file this way:

<html><head></head>
<body>
<a href="file02.htm">file02.htm</a>

omega said:
If you really want to keep that advice in the readme, then there would
be some amount of work to do in getting browser-specific instructions
for saves that work the way you would plan. And not all browsers will
provide that type of save. I'd have to check, but under my immediate
suspicion in that regard -would be iexplorer, ob1, 1x browser.

I think that's piffle. Saving a text file is not rocket science. FWIW I
normally use save as HTML - that works just fine for HTML and PHP pages.

Susan
 
omega

There is another problem related to instructions. When the MSIE browsers
offer a filename in the Save As dialog, they put in whatever they read
from the page's title tag. So for all the MSIE users there is an additional
step besides changing to the textual save type: going to the address bar
and copying the filename part of the URL, to paste into the Save dialog.
(And not doing so when there is the php?query=string thingy.)
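
As a rough illustration of "the filename part of the URL", here is a small Python sketch that takes the last path segment and drops any query string. The helper name `filename_from_url` is made up for the example, and the PriceLessWare address is the one that appears later in this thread:

```python
import posixpath
from urllib.parse import urlsplit

def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL, ignoring any ?query string."""
    return posixpath.basename(urlsplit(url).path)

print(filename_from_url(
    "http://www.pricelesswarehome.org/2005/PL2005ProgramIndex.php?sortby=Category"
))
# -> PL2005ProgramIndex.php
```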
 
omega

Susan Bugher said:
NS 4.78 has three options for save: HTML, text, all files. all three
options saved your file this way:
<a href="file02.htm">file02.htm</a>

All right. Proves me senile there. I have become used to the writing of the
paths, since it is quite frequently desired. Yet now you make me realize
that browsers probably didn't gain that sophistication until
somewhere in the past 3-4 years or so.

Susan Bugher said:
I think that's piffle. Saving a text file is not rocket science. FWIW I
normally use save as HTML - that works just fine for HTML and PHP pages.

Well, I think it's paffle, not piffle.
 
Susan Bugher

omega said:
Well, I think it's paffle, not piffle.

it's something. . . ;)

anyway, aha moment here. . .

Your *assumption* that everybody always saves all web pages as "Web
page, complete" explains why you couldn't see the advantage of using the
.php file extensions in the ZIP archives.

The ZIPs make it easier to save the site locally. People can *update*
their copies of the web pages whenever they want to.

Susan
 
omega

Susan Bugher said:
Your *assumption* that everybody always saves all web pages as "Web
page, complete" explains why you couldn't see the advantage of using the
.php file extensions in the ZIP archives.

I had no idea that was your reason for wanting the filenames.php etc.
Until late last night when seeing your readme about telling users to
save from browser to update their archives.

Whether the normal user, statistically an MSIE user, will be able to
figure out all the extra steps they'd need to do, about not using the
normal Save > OK click, that's our debate. We'll never have a survey
from these figurative users.

But we do gain one thing, you and I: The number of posts on the argument.
Jo ought be jealous, with his post in the FREEWARE thread, a count that
was left far in the dust compared to this one.

Susan Bugher said:
The ZIPs make it easier to save the site locally. People can *update*
their copies of the web pages whenever they want to.

I tried those settings, the ones that took me four posts to remember about,
on the 2005 directory, and it looked fine.

From the surface. It will take concentration, from there, to figure out
strategy next step for the remaining difference, where there are the set
of local html files offline, and the one filename.php?query for URL online.

On another item. I think you ought also consider using the httrack footer
string. The one that gives a link, when browsing an offline page, to load
up the correspondent current page online.
 
Susan Bugher

omega said:
From the surface. It will take concentration, from there, to figure out
strategy next step for the remaining difference, where there are the set
of local html files offline, and the one filename.php?query for URL online.

It would be nice to have them in the PL2005 CD. I think it's perhaps
better to leave the extra pages for sorting out of the *online* zips.

omega said:
On another item. I think you ought also consider using the httrack footer
string. The one that gives a link, when browsing an offline page, to load
up the correspondent current page online.

I'll take a look at that option. I had figured out the sequential nature
of the include/exclude strings. I'm also going to look again at your
info about:

[BUILD] =========================
User-Defined Structure: %p/%n.%t

That's beginning to make more sense to me now.

More later - after further experimentation. We *have* done a very good
job of hijacking this thread. ;)

Susan
 
omega

Susan Bugher said:
[BUILD] =========================
User-Defined Structure: %p/%n.%t

That's beginning to make more sense to me now.

Text Grabber pulls out the text of the definitions:

Build > User-Defined Structure > Options

%n Name of file without file type (ex: image)
%N Name of file including file type (ex: image.gif)
%t File type only (ex: gif)
%p Path [without ending /] (ex: /someimages)
%h Host name (ex: www.someweb.com)
%M MD5 URL (128 bits, 32 ascii bytes)
%Q MD5 query string (128 bits, 32 ascii bytes)
%q MD5 small query string (16 bits, 4 ascii bytes)

%s? Short name (ex: %sN)

Example: %h%p/%n%q.%t
-> c:\mirror\www.someweb.com\someimages\image.gif
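
To make the placeholders concrete, here is a small Python sketch that expands a subset of them (%h, %p, %n, %N, %t) for a given URL. It is only an illustration of how the definitions above read, not HTTrack's own code; the MD5 and short-name options are left out, and the function name `expand_structure` is made up for the example:

```python
import posixpath
import re
from urllib.parse import urlsplit

def expand_structure(template: str, url: str) -> str:
    """Expand a subset of the User-Defined Structure placeholders (sketch)."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    name, ext = posixpath.splitext(filename)
    values = {
        "%h": parts.netloc,           # host name
        "%p": directory.rstrip("/"),  # path without ending /
        "%N": filename,               # file name including type
        "%n": name,                   # file name without type
        "%t": ext.lstrip("."),        # file type only
    }
    # Single pass, so text that was substituted in is never expanded again.
    return re.sub(r"%[hpNnt]", lambda m: values[m.group(0)], template)

print(expand_structure("%h%p/%n.%t",
                       "http://www.someweb.com/someimages/image.gif"))
# -> www.someweb.com/someimages/image.gif
```

With the `%p/%n.%t` setting quoted above, the host name drops out and only the path, name and type remain.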

(For a long time, I'd had an MD5 string in there, which got there from some
kind of HTTrack defaults. I didn't know what was up. But it was one
of the reasons my earlier history with HTTrack was tainted by
constantly getting really ugly-lookin' filenames...)

Susan Bugher said:
More later - after further experimentation. We *have* done a very good
job of hijacking this thread. ;)

The OP got his whole two posts in. Before the landslide hit.
 
Susan Bugher

More later - after further experimentation.

??? All I could find was a *hidden* footer string.

I've concluded it's not possible to set rules that will create: same PHP
file name, relative links etc. etc. for sortable file "sets".

I did find a way to get HTTrack to give me the right table heading in
the *primary* sortable file (but not in the additional files). I had to
*download* the additional files to get the right string in the *primary*
file. :(

I could copy the table headings and paste them into files downloaded
with the "orig. URL/ orig. URL" option. ISTM that the game isn't worth
the candle. . .

I expect I'll be doing the final uploads, downloads and ZIP file for
Mark *just* before Mark creates the CD. Using the "orig. URL/ orig. URL"
option allows me to download all the web pages in one fell swoop - takes
around 10 minutes on my dial-up connection - ISTM that's going to be the
best option.

Susan
 
omega

Susan Bugher said:
??? All I could find was a *hidden* footer string.

I've concluded it's not possible to set rules that will create: same PHP
file name, relative links etc. etc. for sortable file "sets".

http://www.redshift.com/~omega/pw/pw2005-httrack.zip (1mb)

\pw2005-httrack.zip\2005\

The local files sort fine, and the links are relative....

Susan Bugher said:
I did find a way to get HTTrack to give me the right table heading in
the *primary* sortable file (but not in the additional files).

The "right" table heading? It is necessarily different on a local archive
in this situation. In the local, the created individual html files are being
pointed to. In the online, it's the query string, to interact with the PHP
script.

local
<TH><A HREF="PL2005ProgramIndex-3.php?sortby=Category">Category</A></TH>
<TH><A HREF="PL2005ProgramIndex-4.php?sortby=Author">Author</A></TH>

online
<TH><A HREF="PL2005ProgramIndex.php?sortby=Category">Category</A></TH>

Susan Bugher said:
I had to *download* the additional files to get the right string in the
*primary* file. :(

A local file with links all saying PL2005ProgramIndex.php[?sortby=xx], what
use would it serve? It would all be pointing to the same page, not represent
the unique content, the sorted order, that one expected.

To achieve what you'd want, you'd have to change the php script, so that
it gens like httrack does, making unique file names for requested pages.
I've zero idea if that's possible.
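
To show what "making unique file names for requested pages" could look like, here is a rough Python sketch that gives every distinct URL, query string included, its own local filename. The numbering scheme is an assumption made for the example; it is not HTTrack's actual algorithm, and the numbers it produces need not match the -3 and -4 in the listing above:

```python
import posixpath
from urllib.parse import urlsplit

def assign_local_names(urls):
    """Map each distinct URL to a unique local filename (hypothetical scheme)."""
    counts = {}  # base filename -> number of distinct URLs seen for it
    local = {}   # url -> local filename
    for url in urls:
        if url in local:
            continue
        filename = posixpath.basename(urlsplit(url).path)
        n = counts.get(filename, 0) + 1
        counts[filename] = n
        if n == 1:
            local[url] = filename
        else:
            stem, ext = posixpath.splitext(filename)
            local[url] = f"{stem}-{n}{ext}"
    return local

base = "http://www.pricelesswarehome.org/2005/PL2005ProgramIndex.php"
names = assign_local_names([base,
                            base + "?sortby=Category",
                            base + "?sortby=Author"])
for url, name in names.items():
    print(name, "<-", url)
```

Each sort order ends up in its own file, which is why the local table headings have to point at different names than the online ones.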

I do have the opinion that it's not that important...

I should think that those who would download an offline archive, they'd not
modify it. Simpler to get a complete updated archive when interested. And,
for particular questions where they want to verify the very most recent
content, they could click the offline page's link to jump online to the
correspondent page on server.

. . .

By the way, you'd earlier expressed interest in making a brief archive, one
that only gets the initial page of a set, and not the extra ones, which have
various sort choices applied.

In the same zip referenced above, see this other download:

\pw2005-httrack.zip\2005-bref\

This has a brief set of pages. Then for the links to get those sorted
tables, those point online:

<TH><A
HREF="http://www.pricelesswarehome.org/2005/PL2005ProgramIndex.php?sortby=Category">Category</A></TH>
<TH><A
HREF="http://www.pricelesswarehome.org/2005/PL2005ProgramIndex.php?sortby=Author">Author</A></TH>



___________________________________________________________
PS. My laptop keyboard seems to have self-destructed. So I might be looking
at some down-time (little to no posting), until I get some shopping done.
 
omega

Susan Bugher said:
Using the "orig. URL/ orig. URL"
option allows me to download all the web pages in one fell swoop -

Problem: that gives you a lot of extra pages, ones that are not linked
together for sortable tables. Due to all those table links pointing at
the same file (eg all to PL2005ProgramIndex.php).

If you indeed want to take the direction of not providing the sorted tables,
then maybe your plan could be to distribute a commandline page retriever +
a batch file with the known targets.

: url2file http://www.pricelesswarehome.org/2005/PL2005ProgramIndex.php >PL2005ProgramIndex.php
(etc)

Url2file.exe, it's about 450k. I have another exec, seems to work about
the same, named graburl.exe, and it's 34k. Besides those two, I've other,
similar commandline remote-page retrievers hanging around. Not really
evaluated differences, one from the next. (There is also the famous WGET;
but I have the impression that one's much more complex -- appears to need
several supporting dlls.)
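
The batch file with the known targets could be generated rather than typed. A minimal Python sketch, assuming the `url2file URL >file` form shown above is all that is needed; the function name `batch_lines` is made up for the example, and the target list here holds only the one address given in this thread:

```python
import posixpath
from urllib.parse import urlsplit

def batch_lines(urls):
    """Build one 'url2file URL >filename' line per target URL (sketch)."""
    lines = []
    for url in urls:
        # Save each page under the filename part of its URL.
        filename = posixpath.basename(urlsplit(url).path)
        lines.append(f"url2file {url} >{filename}")
    return lines

targets = [
    "http://www.pricelesswarehome.org/2005/PL2005ProgramIndex.php",
]
print("\n".join(batch_lines(targets)))
# -> url2file http://www.pricelesswarehome.org/2005/PL2005ProgramIndex.php >PL2005ProgramIndex.php
```

URLs containing characters such as % or & would need escaping before going into a real batch file; the sketch does not handle that.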
 
Susan Bugher

omega said:
http://www.redshift.com/~omega/pw/pw2005-httrack.zip (1mb)

\pw2005-httrack.zip\2005\

The local files sort fine, and the links are relative....

I just downloaded the zip, looked at a couple of pages and see this:

<A HREF="http://www.pricelesswarehome.org/acf/Index.php">acf
Information</A></b> </TD> <TD width="20%" align="center"><b>

should be <A HREF="../acf/Index.php">

<A HREF="http://www.pricelesswarehome.org/acf/Members.php">acf Members
Sites</A>

should be <A HREF="../acf/Members.php"

omega said:
The "right" table heading?

This table heading (for PL2003AlphabeticalList.php):

<TH><A HREF="PL2003AlphabeticalList-2.php">Program</A></TH>
<TH><A HREF="PL2003AlphabeticalList-3.php">Category</A></TH>
<TH><A HREF="PL2003AlphabeticalList-4.php">Author</A></TH>
<TH><A HREF="PL2003AlphabeticalList-5.php">Ware_type</A></TH>
<TH><A HREF="PL2003AlphabeticalList-6.php">DescRev</A></TH>

But the URLs that aren't in the 2003 subdirectory are *not* relative.
That result is *not* the result I want.

I'm back to mulling it over. I do have an advantage over most people. I
can change HTTRack settings *and/or* I can revise the web pages. ;)

Susan
 
omega

Susan Bugher said:
I just downloaded the zip, looked at a couple of pages and see this:

<A HREF="http://www.pricelesswarehome.org/acf/Index.php">acf
Information</A></b> </TD> <TD width="20%" align="center"><b>

should be <A HREF="../acf/Index.php">

<A HREF="http://www.pricelesswarehome.org/acf/Members.php">acf Members
Sites</A>

should be <A HREF="../acf/Members.php"

Add filters to make those pages part of the local archive. Eg,

+http://www.pricelesswarehome.org/acf/Members.php
+http://www.pricelesswarehome.org/acf/members/*
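
A rough Python sketch of how such +/- filter lists can be read. It assumes the rules are checked in order and the last one that matches decides, which is my understanding of the sequential behaviour but should be checked against the HTTrack documentation. The wildcard matching is approximated with Python's `fnmatchcase`, and the function name `is_allowed` is made up for the example:

```python
from fnmatch import fnmatchcase

def is_allowed(url: str, rules: list[str], default: bool = False) -> bool:
    """Apply +pattern / -pattern rules in order; the last match decides."""
    allowed = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {rule!r}")
        if fnmatchcase(url, pattern):
            allowed = (sign == "+")
    return allowed

rules = [
    "+http://www.pricelesswarehome.org/acf/Members.php",
    "+http://www.pricelesswarehome.org/acf/members/*",
]
print(is_allowed("http://www.pricelesswarehome.org/acf/Members.php", rules))
# -> True
print(is_allowed("http://www.pricelesswarehome.org/acf/Index.php", rules))
# -> False
```

Under that assumption, a later exclude rule would override an earlier include for the same URL, which is why the order of the strings matters.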
 
omega

Susan Bugher said:
But the URLs that aren't in the 2003 subdirectory are *not* relative.
That result is *not* the result I want.

I don't really follow, but anyway, will mention one item. You can start
from a full mirror of the PW site, thus all your links will be relative
in that mirror. And then you can copy the individual directories from
there, for whichever purpose(s) you envision.
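
For reference, the relative form asked for earlier (../acf/Index.php) can be worked out from where the linking page sits. A minimal Python sketch; the function name `relative_href` is made up for the example, and the /2003/ page address is an assumption by analogy with the /2005/ address quoted in this thread:

```python
import posixpath
from urllib.parse import urlsplit

def relative_href(page_url: str, target_url: str) -> str:
    """Express target_url relative to the directory of page_url (sketch)."""
    page = urlsplit(page_url)
    target = urlsplit(target_url)
    if page.netloc != target.netloc:
        return target_url  # different site: leave the link absolute
    page_dir = posixpath.dirname(page.path)
    return posixpath.relpath(target.path, start=page_dir)

print(relative_href(
    "http://www.pricelesswarehome.org/2003/PL2003AlphabeticalList.php",  # assumed location
    "http://www.pricelesswarehome.org/acf/Index.php",
))
# -> ../acf/Index.php
```

The relative link only works offline if the target file is also part of the local copy, which is what the full mirror provides.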
 
omega

Susan Bugher said:
I'm back to mulling it over. I do have an advantage over most people. I
can change HTTRack settings *and/or* I can revise the web pages. ;)

An advantage, that you can approach it from both ends.
Or a disadvantage: more available work.
 
