Export CSV dash is \226 (unicode)

wjr

I received an Excel file that needs to be exported to a plain ASCII CSV
file. It appears that the originator had smart quoting or some kind of
Unicode setup for this spreadsheet. The system that requires the CSV
file cannot use Unicode characters. The - characters must be a simple
hex 2d type of dash, as in:

| 00 nul| 01 soh| 02 stx| 03 etx| 04 eot| 05 enq| 06 ack| 07 bel|
| 08 bs | 09 ht | 0a nl | 0b vt | 0c np | 0d cr | 0e so | 0f si |
| 10 dle| 11 dc1| 12 dc2| 13 dc3| 14 dc4| 15 nak| 16 syn| 17 etb|
| 18 can| 19 em | 1a sub| 1b esc| 1c fs | 1d gs | 1e rs | 1f us |
| 20 sp | 21 ! | 22 " | 23 # | 24 $ | 25 % | 26 & | 27 ' |
| 28 ( | 29 ) | 2a * | 2b + | 2c , | 2d - | 2e . | 2f / |
| 30 0 | 31 1 | 32 2 | 33 3 | 34 4 | 35 5 | 36 6 | 37 7 |
| 38 8 | 39 9 | 3a : | 3b ; | 3c < | 3d = | 3e > | 3f ? |
| 40 @ | 41 A | 42 B | 43 C | 44 D | 45 E | 46 F | 47 G |
| 48 H | 49 I | 4a J | 4b K | 4c L | 4d M | 4e N | 4f O |
| 50 P | 51 Q | 52 R | 53 S | 54 T | 55 U | 56 V | 57 W |
| 58 X | 59 Y | 5a Z | 5b [ | 5c \ | 5d ] | 5e ^ | 5f _ |
| 60 ` | 61 a | 62 b | 63 c | 64 d | 65 e | 66 f | 67 g |
| 68 h | 69 i | 6a j | 6b k | 6c l | 6d m | 6e n | 6f o |
| 70 p | 71 q | 72 r | 73 s | 74 t | 75 u | 76 v | 77 w |
| 78 x | 79 y | 7a z | 7b { | 7c | | 7d } | 7e ~ | 7f del|

How to strip this out?
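
Before stripping anything, it can help to find out exactly which bytes fall outside the 7-bit table above. The following is a minimal Python sketch of that check, not a finished tool; the function name `find_non_ascii` and the sample bytes are assumptions made up for illustration.

```python
def find_non_ascii(data: bytes):
    """Return (offset, byte value) pairs for every byte above hex 7f."""
    return [(i, b) for i, b in enumerate(data) if b > 0x7F]


# Hypothetical sample: one field containing a single non-ASCII byte.
sample = b"Animal Rights \x96 General\n"

for offset, value in find_non_ascii(sample):
    # Report each offender in both hex and octal so it can be matched
    # against a hex table or an od dump.
    print(f"offset {offset}: hex {value:02x}, octal {value:03o}")
```

For a real file, the same function could be applied to the result of reading the file in binary mode.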
 
Gord Dibben

You want to get rid of all the "pipes" that are delimiting the text?

The pipe character is found above the Enter key: Shift + \

Edit>Replace

What: Alt + 0124 (enter digits via Numpad)

With: nothing


Gord Dibben MS Excel MVP
 
wjr

Not the problem at all. I was showing the ASCII table for those who
don't know it.

Start from here in case you are confused.
Here is a real piece of data saved from Excel as a CSV file. NOTE: it's
part of the first field, so don't get confused by the lack of a ','
character.
AS SEEN in EXCEL:
Animal Rights - General

AS SAVED TO CSV:
Animal Rights û General

Here is an octal dump of the saved part:
od -bc x
0000000 101 156 151 155 141 154 040 122 151 147 150 164 163 040 226 040
          A   n   i   m   a   l       R   i   g   h   t   s     226
0000020 107 145 156 145 162 141 154 040 040 040 012
          G   e   n   e   r   a   l              \n
0000033

As you can see from this, the dash character is octal 226 (hex 96),
not a proper ASCII hyphen (octal 055, hex 2d).
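
To make sense of that byte: octal 226 is hex 96, which is not a valid ASCII code at all. The sketch below decodes it under two code pages. Both code pages are assumptions, not something the dump itself proves: Windows-1252 as the encoding Excel wrote, and code page 437 as the one the viewer used.

```python
import unicodedata

raw = bytes([0o226])  # the byte od reported as octal 226

# Assumption: the CSV was written as Windows-1252 and viewed as code page 437.
as_cp1252 = raw.decode("cp1252")
as_cp437 = raw.decode("cp437")

print(hex(raw[0]))                  # 0x96
print(unicodedata.name(as_cp1252))  # EN DASH
print(unicodedata.name(as_cp437))   # LATIN SMALL LETTER U WITH CIRCUMFLEX
```

If those assumptions hold, the cell contains an en dash rather than a hyphen, and the û is just the same byte shown through a different code page.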

Here is what I should be getting:
0000000   A   n   i   m   a   l       R   i   g   h   t   s       -
        101 156 151 155 141 154 040 122 151 147 150 164 163 040 055 040
0000020   G   e   n   e   r   a   l  \n
        107 145 156 145 162 141 154 012
0000030
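
One possible way to get from the first dump to the second is to post-process the saved CSV as raw bytes. This is only a sketch under stated assumptions: the helper name `to_ascii_hyphens` is made up here, and treating hex 97 (the Windows-1252 em dash) the same way is an extra assumption beyond what the dump shows.

```python
# Translation table: map the dash bytes to a plain ASCII hyphen (hex 2d).
# Hex 96 is the byte seen in the dump; hex 97 is included as an assumption.
DASH_TABLE = bytes.maketrans(b"\x96\x97", b"--")


def to_ascii_hyphens(data: bytes) -> bytes:
    """Return a copy of data with the mapped dash bytes replaced by '-'."""
    return data.translate(DASH_TABLE)


before = b"Animal Rights \x96 General\n"
after = to_ascii_hyphens(before)
print(after)  # b'Animal Rights - General\n'
```

Working on bytes avoids guessing the file's encoding; the file would be read with mode "rb" and written back with mode "wb". Other bytes, including any trailing spaces, are left untouched.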
 
Gord Dibben

I sure missed this one!

I don't speak Octal<g>


Gord

 
