NoteTab : Joining lines - bug or feature ?

John Fitzsimons

Hi,

Following on from a recent post here :

When I go Modify - lines - join lines the results seem inconsistent.
For example, suppose I start with ;

tom
dick
harry

Doing the above results in ;

tom dick harry

Okay, the line returns have been replaced with a space.

Now suppose I start with

http://www.eclipse.org/
http://www.eclipse.org/downloads/index.php

Doing the above results in ;

http://www.eclipse.org/http://www.eclipse.org/downloads/index.php

Okay, the line return has NOT been replaced with a space.

Is that a bug or a feature ?

When are spaces inserted and when aren't they ?

Comments anyone please ?

Regards, John.

--
****************************************************
,-._|\ (A.C.F FAQ) http://clients.net2000.com.au/~johnf/faq.html
/ Oz \ John Fitzsimons - Melbourne, Australia.
\_,--.x/ http://www.aspects.org.au/index.htm
v http://clients.net2000.com.au/~johnf/
 
Christopher Jahn

And said:
Hi,

Following on from a recent post here :

When I go Modify - lines - join lines the results seem
inconsistent. For example, suppose I start with ;

tom
dick
harry

Doing the above results in ;

tom dick harry

Okay, the line returns have been replaced with a space.

Now suppose I start with

http://www.eclipse.org/
http://www.eclipse.org/downloads/index.php

Doing the above results in ;

http://www.eclipse.org/http://www.eclipse.org/downloads/inde
x.php

Okay, the line return has NOT been replaced with a space.

Is that a bug or a feature ?

A feature.

Long URLs often get broken up into separate lines by some text
readers. NoteTab is restoring the "broken URL".
 
Frank Bohan

John Fitzsimons said:
Hi,

Following on from a recent post here :

When I go Modify - lines - join lines the results seem inconsistent.
For example, suppose I start with ;

tom
dick
harry

Doing the above results in ;

tom dick harry

Okay, the line returns have been replaced with a space.

Now suppose I start with

http://www.eclipse.org/
http://www.eclipse.org/downloads/index.php

Doing the above results in ;

http://www.eclipse.org/http://www.eclipse.org/downloads/index.php

Okay, the line return has NOT been replaced with a space.

Is that a bug or a feature ?

When are spaces inserted and when aren't they ?

Comments anyone please ?

John: I think the "feature" of not inserting a space in the case of URLs is
included so that URLs that have been spread over two (or more) lines will
not be incorrectly joined with a space included. Admittedly this goes a
little awry when there are two separate URLs but generally speaking it seems
a good idea.

===

Frank Bohan
¶ Do not take life too seriously; you will never get out of it alive.
 
John Fitzsimons

And it came to pass that Christopher Jahn wrote:
A feature.
Long URLs often get broken up into separate lines by some text
readers. NoteTab is restoring the "broken URL".

Well, I saw that as a possibility but that still doesn't quite answer
my question "When are spaces inserted and when aren't they ?"

Would "long URLs" only be a http address ? Or include ftp... gopher...
www.... etc.

I guess the main question I would like an answer to is: when, in
what circumstances, do most text editors and most newsreaders split
a line at a position other than a blank space ?

The reason this issue is important is that if one wants to "grep" web
addresses from, e.g., concatenated lists of newsgroup posts, the
splitting of lines would interfere if done incorrectly.
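
To make the grep concern concrete, here is a minimal Python sketch. It is not what NoteTab or grep does internally; the pattern and the name find_urls are illustrative assumptions, and it only looks for addresses that start with a scheme such as http:// or ftp://.

```python
import re

# Assumed pattern: a scheme, "://", then everything up to the next
# whitespace, quote, or angle bracket. Bare "www." addresses are ignored.
URL_RE = re.compile(r"(?:https?|ftp|gopher)://[^\s<>\"]+")


def find_urls(text):
    """Return every scheme-prefixed address found in text, in order."""
    return URL_RE.findall(text)


# Two addresses separated by whitespace are found separately.
spaced = "see http://www.eclipse.org/ and http://www.pspad.com/index_en.html today"
print(find_urls(spaced))

# Two addresses joined without a space come back as a single match.
joined = "http://www.eclipse.org/http://www.eclipse.org/downloads/index.php"
print(find_urls(joined))
```

With this pattern the joined example yields one long match rather than two addresses, which is the kind of interference a space-less join can cause when extracting links afterwards.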

Regards, John.
 
REMbranded

John Fitzsimons <[email protected]> wrote:
Well, I saw that as a possibility but that still doesn't quite answer
my question "When are spaces inserted and when aren't they ?"
Would "long URLs" only be a http address ? Or include ftp... gopher...
www.... etc.

ftp:// has the space, where http:// does not.

www has the space, where http://www does not.

You might be able to replace http:// with nothing and get
satisfaction.
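
The behaviour observed in this thread can be sketched in Python as follows. This is only a guess at a rule that fits the examples above, not NoteTab's actual implementation: it assumes the space is dropped when the preceding line starts with http://, and the name join_lines and the url_aware switch are made up for illustration.

```python
def join_lines(lines, url_aware=True):
    """Join lines into one line, separating them with a single space.

    Assumed rule (a guess that matches the examples in this thread):
    when url_aware is True and the previous line starts with "http://",
    the next line is appended with no space, as if a broken URL were
    being restored.
    """
    if not lines:
        return ""
    previous = lines[0].strip()
    joined = previous
    for line in lines[1:]:
        line = line.strip()
        glue = "" if url_aware and previous.lower().startswith("http://") else " "
        joined += glue + line
        previous = line
    return joined


print(join_lines(["tom", "dick", "harry"]))
print(join_lines(["http://www.eclipse.org/",
                  "http://www.eclipse.org/downloads/index.php"]))
# Passing url_aware=False always inserts the space, which keeps
# separate addresses separate for later searching.
print(join_lines(["http://www.eclipse.org/",
                  "http://www.eclipse.org/downloads/index.php"],
                 url_aware=False))
```

Under that assumption, lines starting with ftp:// or a bare www still get a space, which agrees with the two observations above.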
I guess the main question I would like an answer to is: when, in
what circumstances, do most text editors and most newsreaders split
a line at a position other than a blank space ?

If the editor does not recognize links it can wrap at any point. You
might find PsPad useful also.

http://www.pspad.com/index_en.html

If you save your text file with a .html extension you can use the
tools in the HTML tab of the program. One is to compress. It wraps
lines at 1024 characters and spaces the URLs. This should make a file
you can grep away with? The wrap length is configurable under
Tools/Program Settings. The compressed file can be saved and exported
as RTF for use with NotePad or other editors with syntax highlighting.

The more I look at this program the more I like it. Very nice!
The reason this issue is important is that if one wants to "grep" web
addresses from, e.g., concatenated lists of newsgroup posts, the
splitting of lines would interfere if done incorrectly.

BTW, as long as you're playing with NoteTab Light, I found a setting
where you can set the maximum file size. It was set at the default max
of 2 gigs.

However, the whole file does not load for 50 and 100 meg files. I
searched for "end," which is at the end of each file and it does not
find them. I scroll down and it just stops at some point without
reloading the rest of the file from the swap file or from disk.

Can you give it a go if you still have some large files and see just
how much you can load and work with?

As it stands, it looks like 10 megs is the max file size NoteTab Light
can work with (here), even though there is a specific setting that
goes to 2 gigs.
 
John Fitzsimons

ftp:// has the space, where http:// does not.
www has the space, where http://www does not.

If I had thought this through I might have been able to work out some
tests myself. Thanks for the feedback.
You might be able to replace http:// with nothing and get
satisfaction.
If the editor does not recognize links it can wrap at any point.

Ah ! VERY good point. :)

Though I hope things wouldn't wrap mid word. If that happened I
suppose a "-" would be inserted ? Can a "-" be part of a web address ?
If so then removing all dashes in a re-flowed text could muck them up.
You might find PsPad useful also.

Already one of the frighteningly large number of text editors I now
have. It's hard enough remembering their names let alone what each
one does !
If you save your text file with a .html extension you can use the
tools in the HTML tab of the program. One is to compress. It wraps
lines at 1024 characters and spaces the URLs. This should make a file
you can grep away with? The wrap length is configurable under
Tools/Program Settings. The compressed file can be saved and exported
as RTF for use with NotePad or other editors with syntax highlighting.

Thanks. Very handy to know. I was looking at a similar approach with
Barry's Emacs but it doesn't seem to have that as a "default" setting.
The more I look at this program the more I like it. Very nice!
BTW, as long as you're playing with NoteTab Light, I found a setting
where you can set the maximum file size. It was set at the default max
of 2 gigs.

I am using the pro version. I don't know where that setting is.
However, the whole file does not load for 50 and 100 meg files. I
searched for "end," which is at the end of each file and it does not
find them. I scroll down and it just stops at some point without
reloading the rest of the file from the swap file or from disk.
Can you give it a go if you still have some large files and see just
how much you can load and work with?

I did try a 20MB+ file with the above a while ago and I found it so
unsatisfactory I decided to try other editors instead. IIRC it gave me
an "out of memory" error which some of the other editors did NOT
give me.
As it stands, it looks like 10 megs is the max file size NoteTab Light
can work with (here), even though there is a specific setting that
goes to 2 gigs.

I didn't work out what the maximum was here. Only that it wasn't
enough.

Regards, John.

 
REMbranded

Ah ! VERY good point. :)
Though I hope things wouldn't wrap mid word. If that happened I
suppose a "-" would be inserted ? Can a "-" be part of a web address ?
If so then removing all dashes in a re-flowed text could muck them up.

I'm not sure about the dash. The link may just wrap, as long links do
here when the reader is set to a width less than the link length.
Already one of the frighteningly large number of text editors I now
have. It's hard enough remembering their names let alone what each
one does !

Thanks. Very handy to know. I was looking at a similar approach with
Barry's Emacs but it doesn't seem to have that as a "default" setting.

Give PsPad a shot: one click to compress and wrap, and another to
return to the original form. Both are in the HTML tab. It doesn't
distinguish between http and non-http lines the way NoteTab does. It
spaces everything quickly and neatly.

I found a couple that do a nice job of stripping >, >> from text files
too if you're still in the market.
I am using the pro version. I don't know where that setting is.

Way down deep in the tabbed options.
I did try a 20MB+ file with the above a while ago and I found it so
unsatisfactory I decided to try other editors instead. IIRC it gave me
an "out of memory" error which some of the other editors did NOT
give me.
I didn't work out what the maximum was here. Only that it wasn't
enough.

I think that is enough to confirm 10 megs is about as large as you can
open and comfortably work with as a general rule. Thanks.
 
John Fitzsimons

I found a couple that do a nice job of stripping >, >> from text files
too if you're still in the market.

< snip >

Thanks, but a simple search and replace can strip all ">". I presume
you were meaning something different ? As others here might be
interested as well maybe you could explain what you meant a little
more ?

By the way, you are just the sort of person who might know the answer
to this question. Which of the following would one find in a web
address ?

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

I thought that getting rid of punctuation in a file, before text
manipulation of newsgroup posts, could make things "neater". But
I didn't want to destroy any web addresses.

The ones that I see as a "no-no" are the full stop, the @ sign and the
tilde. Did I miss anything ? Anyone ?


Regards, John.
 
REMbranded

Thanks, but a simple search and replace can strip all ">". I presume
you were meaning something different ? As others here might be
interested as well maybe you could explain what you meant a little
more ?

Yes, but removing all will remove the end of other things like message
IDs and email addresses:
John Fitzsimons <[email protected]> wrote:
John Fitzsimons <[email protected] wrote:

as above. That might not affect anything for your purposes though.

If you mark the text, Ctrl/A, and "Modify/Email/Unquote Text" in
NoteTab Light it will only remove the >'s that start each line and
leave the text beginning neatly on column 1.
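
For anyone who wants the same effect outside NoteTab, here is a minimal Python sketch of removing only the quote markers at the start of each line. It is an illustration, not NoteTab's Unquote code; the name unquote, the pattern, and the sample address are assumptions.

```python
import re

# One or more ">" at the start of a line, each optionally preceded by
# spaces or tabs, followed by at most one more space or tab.
QUOTE_PREFIX = re.compile(r"^(?:[ \t]*>)+[ \t]?", re.MULTILINE)


def unquote(text):
    """Strip leading quote markers; leave every other ">" alone."""
    return QUOTE_PREFIX.sub("", text)


# The address below is a made-up placeholder.
sample = (
    "> > Hello\n"
    "> John <john@example.invalid> wrote:\n"
    "plain > text"
)
print(unquote(sample))
```

Because the pattern is anchored to the start of a line, the ">" that closes an email address or message ID is left in place.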
By the way, you are just the sort of person who might know the answer
to this question. Which of the following would one find in a web
address ?
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

I'm pretty sure I've seen URLs with #, %, &, /, :, <, =, >, ?, or ~
in them.

I thought that getting rid of punctuation in a file, before text
manipulation of newsgroup posts, could make things "neater". But
I didn't want to destroy any web addresses.
The ones that I see as a "no-no" are the full stop, the @ sign and the
tilde. Did I miss anything ? Anyone ?

It might be risky to remove these if you want unadulterated URLs.
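
As a rough check on which of those characters are safe to strip, the following Python sketch sorts the punctuation list by the character classes in RFC 3986, the generic URI syntax. It is a sketch under that assumption only: addresses found in real posts do not always follow the RFC, and the variable names are illustrative.

```python
import string

# Character classes from RFC 3986 (generic URI syntax).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
# "%" is included because it introduces percent-encoded bytes (e.g. %20).
ALLOWED = UNRESERVED | GEN_DELIMS | SUB_DELIMS | {"%"}

# string.punctuation is the same 32-character list quoted above.
may_appear = "".join(c for c in string.punctuation if c in ALLOWED)
never_raw = "".join(c for c in string.punctuation if c not in ALLOWED)

print("may appear in a URL:", may_appear)
print("should not appear unencoded:", never_raw)
```

By this reckoning only the double quote, angle brackets, backslash, caret, backtick, braces, and vertical bar are outside the allowed set, so the dash, full stop, @ sign, and tilde all need to be preserved. Characters such as < and > would normally appear percent-encoded (%3C, %3E) rather than raw.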
 
John Fitzsimons

Yes, but removing all will remove the end of other things like message
IDs and email addresses:

Only if one removes > from places other than the beginning of lines. A
search and replace can remove ONLY those. Or one can use something
like Shalom Txt, which has that built in.
John Fitzsimons <[email protected] wrote:
as above. That might not affect anything for your purposes though.
If you mark the text, Ctrl/A, and "Modify/Email/Unquote Text" in
NoteTab Light it will only remove the >'s that start each line and
leave the text beginning neatly on column 1.

It would be handy if it also handled lines starting with, e.g., ":".
By the way, you are just the sort of person who might know the answer
to this question. Which of the following would one find in a web
address ?
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
I'm pretty sure I've seen URLs with #, %, &, /, :, <, =, >, ?, or ~
in them.

:-(

< snip >

Regards, John.
 
