Unicode text and carriage returns

Greg Lovern · Aug 26, 2009

If I enter carriage returns in a cell (Alt + Enter), then save the
sheet from Excel as "Unicode Text" file type, then open the file, the
carriage returns are interpreted as delimiters.

In the Text Import Wizard, I've made sure that the only delimter is
Tab, which is the default when opening that file.

If I look at the file in Notepad, the Alt + Enter characters do not
look like tabs at all. I have other tabs in the file and they look
like tabs, but the Alt + Enter characters look different.

In Step 1 of Excel's Text Import Wizard, I've tried all the variations
I can think off -- UTF-7, UTF-8, 1252, US-ASCII.

In Step 2 of the Wizard, the text qualifier is """ (double-quote, the
default), and in Notepad I see that the whole cell contents, including
Alt+Enter carriage returns, are enclosed in doublequotes.

Any ideas? How can I save carriage returns in cells, save the sheet as
Unicode text, and import the text file without the carriage returns
being interpreted as delimiters?

Thanks,

Greg

Gary''s Student · Aug 26, 2009

You can cheat.

Before saving the file as Unicode Text, change all the embedded hard returns
( CHAR(10) ) into another character that you are not using (perhaps tilda).

When you re-import the file, the tildas will not be considered cell
delimiters. You can change them back into hard returns after the import.

Greg Lovern · Aug 26, 2009

If I enter carriage returns in a cell (Alt + Enter), then save the
sheet from Excel as "Unicode Text" file type, then open the file, the
carriage returns are interpreted as delimiters.

Correction -- they are interpreted as carriage returns, not
delimiters. So the next character goes to column A of the next row
down, not to the next cell to the right.

Still seems like an Excel bug, since the whole text in the cell is
enclosed in doublequotes when saving as Unicode Text.

For example, this is how it looks in the cell:

========================
First text row

Next text row
========================

This is how that part looks in Notepad after saving from Excel as
Unicode Text:

========================
"First text rowNext text row"
========================

Then when it is imported to Excel, the first part, "First text row",
is imported to the correct cell, then the next row of cells is blank,
then the "Next text row" part is in column A of the next row down.

Then, everything else in that row is in the wrong column, due to
starting the row over in column A.

Greg

Greg Lovern · Aug 26, 2009

You can cheat.

Before saving the file as Unicode Text, change all the embedded hard returns
( CHAR(10) ) into another character that you are not using (perhaps tilda).

When you re-import the file, the tildas will not be considered cell
delimiters. You can change them back into hard returns after the import.

Thanks, but that would break other apps that read the files, and would
require other apps that write the files to make the same change.

For the time being, I've just removed the carriage returns from the
one cell that had them. But I expect that the users will want more of
them in the future, so I don't see that as a long-term solution.

Since the files have an expected number of columns for all rows, I
think what I might do is, after importing the file, check the last
column of data for any blanks, and assume the blank was caused by a
carriage return. Then move the data back to the correct positions
piece by piece, inserting the Chr(10) carriage return characters as
needed, until I get to a row where the expected last column is
occupied. (This would all be in VBA, which is already doing the
import.)

Another option would be to do the import with the Input Statement.
That would be safer but possibly slow with these very large files.

It would certainly be less painful to find a way to get it to import
correctly. Any ideas?

By the way, it turns out to be weirder than I thought. Excel's help on
the Text Import Wizard says that if a delimeter (Tab) is found within
two text qualifier (doublequote) characters, then the the text
qualifier characters are removed. But if no delimiter is found between
two text qualifier characters, then the text qualifier characters are
preserved:

===============================================
Text qualifier
Select the character that encloses values in your text file. When
Excel encounters the text qualifier character, all of the text that
follows that character and precedes the next occurrence of that
character is imported as one value, even if the text contains a
delimiter character. For example, if the delimiter is a comma (,) and
the text qualifier is a quotation mark ("), "Dallas, Texas" would be
imported into one cell as Dallas, Texas. If no qualifier or if the
apostrophe (') qualifier were specified, "Dallas, Texas" would be
imported into two adjacent cells as "Dallas and Texas".

If the delimiter character occurs between text qualifiers, Excel omits
the qualifiers in the imported value. If no delimiter character occurs
between text qualifiers, Excel includes the qualifier character in the
imported value. Hence, "Dallas Texas" (using the quotation mark text
qualifier) would be imported into one cell as "Dallas Texas".
===============================================

When Excel saves it out as Unicode Text, it encloses the cell contents
containing carriage returns in doublequote characters:

========================
"First text rowNext text row"
========================

But when Excel imports that text file, look what it does:

========================
(in the correct cell)
First text row

(in column A of the next row down)
Next text row"
========================

The doublequote is removed from the first part, as if it had found a
delimiter enclosed by doublequotes, while the doublequote is preserved
on the last part, as if it did NOT find a delimiter enclosed by
doublequotes. Weird.

Thanks,

Greg

Greg Lovern · Aug 26, 2009

The doublequote is removed from the first part, as if it had found a
delimiter enclosed by doublequotes, while the doublequote is preserved
on the last part, as if it did NOT find a delimiter enclosed by
doublequotes. Weird.

Sorry about so many posts, but I just had to add:

It's even more weird saving as CSV. When the CSV file is imported, all
text in each cell after the cell's first in-cell carriage returns is
just deleted.

For example, if I save this cell out as CSV:

========================
First text row

Next text row
========================

It imports like this:

========================
First text row
========================

If I look at the CSV in Notepad, it looks the same as when saved out
as Unicode Text:

========================
"First text rowNext text row"
========================

BTW, if I arrow across the characters in Notepad, it pauses between
the first "w" and the "N", apparently indicating the presence of a
character that has a width of zero -- the carriage return of course.

Also BTW - same problem with plain text (tab delim) as with Unicode
text).

Greg

Excel - carriage returns and formating of text within a cell	2	Dec 4, 2009
How do I get carriage returns to not be row delimiters in text fi.	1	Dec 16, 2004
Importing Text with Carriage Returns inside cell values	2	May 14, 2009
Carriage return displays as a small box	8	Sep 14, 2009
carriage return in text cell in excell	2	May 23, 2008
Getting rid of carriage return in cell	4	Sep 5, 2008
excel 2007 text box - carriage return	1	Oct 30, 2009
How can I find and delete tabs and carriage returns ?	1	Nov 11, 2005

Unicode text and carriage returns

Greg Lovern

Gary''s Student

Greg Lovern

Greg Lovern

Greg Lovern

Ask a Question

Similar Threads