import multiple text files into single sheet separate column

J

Jootje

Hi,

I have a folder with over 200 txt files and I want to put those into a
single worksheet. Each file should be placed in a different column. That is,
file 1 into A2 and file 2 in B2 etc.

The data I want from the txt files concerns only one column. I recorded a
macro for it.

Workbooks.OpenText Filename:= _
"D:\report files\report files txt\1 totaal.txt" _
, Origin:=437, StartRow:=1, DataType:=xlDelimited, TextQualifier:= _
xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=False,
Semicolon:=False, _
Comma:=False, Space:=True, Other:=False, FieldInfo:=Array(Array(1,
9), _
Array(2, 1), Array(3, 9), Array(4, 9), Array(5, 9), Array(6, 9),
Array(7, 1)), _
TrailingMinusNumbers:=True
ActiveWindow.SmallScroll Down:=105

I think the point is, with all the macro's found on this newsgroup I don't
know how to use them or to alter them in my case. Probably change something
in the merge section? Or maybe not. Maybe a totally different approach? I
think the solution can be simple, but I don't see it.

Any ideas?
Thanks
 
R

Rick Rothstein \(MVP - VB\)

A couple of questions...

1. Do your text files all have a common name with a numerical identifier in
them. For example, the text file you show in your example is named "1
totaal.txt"... aside from the possible misspelling of the word 'total', do
the other 199 files have a number in front, followed by a space followed by
the exact text "totaal.txt"? If not, is there a way to identify these files
in a preferred ordering.

2. Is the content of each text file a series of rows of data consisting of a
single value?

Rick
 
J

Joel

How is the data delimited. Is it fixed column data or some other delimiter?
Post a sample of the data. the code you posted looks like it is fixed column
data, but I want to make sure.
 
J

Jootje

The text in the files all look like this. Delimiter is with 'space'.


N Relevance Result name Size Words count
1 D:\CODAS\alles\ascii\d2001-0263.txt 23 KB 2 060
2 D:\CODAS\alles\ascii\d2000-1267.txt 28 KB 2 376
3 D:\CODAS\alles\ascii\d2001-0156.txt 14 KB 1 292
4 D:\CODAS\alles\ascii\d2000-1824.txt 18 KB 1 658
5 D:\CODAS\alles\ascii\d2000-0410.txt 17 KB 1 596
6 D:\CODAS\alles\ascii\d2000-0192.txt 8 KB 779
 
J

Jootje

1.
The majority of the files start with a number because they represent the
factor I tried to identify from a legal decision. So yes, '1 totaal.txt', and
the file representing factor 2 that gives the total of all identified cases
with the factor present in it is named: 2 totaal.txt or sometimes the name of
the factor is in it like: '8 complainant businessman.txt'.

Totaal is dutch for total btw...my research data is in english however

I picked a few files randomly:

1 totaal.txt
2 totaal.txt
7 totaal author complainant + compl authored.txt
70 geographical identifiers.txt
76 TOT indications of source and geographical indications.txt
177 multiple respondents II respondent 2.txt
187 complaint deficient POS.txt
195a geen verweer respondent.txt
196 supplemental filings.txt
210 TOTAAL TRUE 3-panel 435.txt
231 legal considerations.txt
235 concurrent court proceedings.txt

2.
My original output files are in .html but I converted them to .txt.
The data in the html files are in a table like this (with URLs and tabs):

N Relevance Result name Size Words count
1 D:\CODAS\alles\ascii\d2001-0263.txt 23 KB 2 085
2 D:\CODAS\alles\ascii\d2000-1267.txt 28 KB 2 424
3 D:\CODAS\alles\ascii\d2001-0156.txt 14 KB 1 309
4 D:\CODAS\alles\ascii\d2000-1824.txt 18 KB 1 686
5 D:\CODAS\alles\ascii\d2000-0410.txt 17 KB 1 618
6 D:\CODAS\alles\ascii\d2000-1378.txt 21 KB 2 047


But the converted files are like this


SoftInform: SearchInform Desktop Enterprise search results


N Relevance Result name Size Words count
1 D:\CODAS\alles\ascii\d2001-0263.txt 23 KB 2 085
2 D:\CODAS\alles\ascii\d2000-1267.txt 28 KB 2 424
3 D:\CODAS\alles\ascii\d2001-0156.txt 14 KB 1 309
4 D:\CODAS\alles\ascii\d2000-1824.txt 18 KB 1 686
5 D:\CODAS\alles\ascii\d2000-0410.txt 17 KB 1 618
6 D:\CODAS\alles\ascii\d2000-1378.txt 21 KB 2 047
7 D:\CODAS\alles\ascii\d2000-0192.txt 8 KB 792
8 D:\CODAS\alles\ascii\d2000-0295.txt 31 KB 2 743
9 D:\CODAS\alles\ascii\d2000-1426.txt 17 KB 1 641
10 D:\CODAS\alles\ascii\d2001-0077.txt 14 KB 1 322
11 D:\CODAS\alles\ascii\d2002-0168.txt 15 KB 1 510

So without the tabs and URl's.
 
J

Jootje

Hmm

Maybe it's an idea to put the txt files into multiple sheets in one workbook
and then merge all the worksheets into a single worksheet?
 
R

Rick Rothstein \(MVP - VB\)

I am afraid I am still not sure exactly what you have or want. Let's see if
I can get you to help me understand exactly what you have/need...

You have 200 or so text files in a single, known directory, right? Each of
those files contains several lines of data... can you list the first, say, 5
lines of data from a typical file? Then, can you indicate what part (or is
it all) of, say, the 3rd line from this listing that you want to place in a
column of your spreadsheet?

Rick
 
J

Jootje

Yes, 200 or so text files in a single known directory. The names of the files
all start with a number. The number corresponds with the number of a factor
out of a list of 1-235, but only 173 factors are identified so in the
directory are 173 files to be exact. The data is arranged by name. I've put
the numbers of the 173 identified factors in the firs row of the worksheet.
The data I want should go in each of the 173 columns.


oke, the first 5 lines out of a typical file called '5 totaal region':

N Relevance Result name
Size Words count
1 D:\CODAS\alles\ascii\d2000-1377.txt 13 KB 1 213
2 D:\CODAS\alles\ascii\d2002-0643.txt 15 KB 1 388
3 D:\CODAS\alles\ascii\d2002-0003.txt 10 KB 934
4 D:\CODAS\alles\ascii\d2003-0076.txt 14 KB 1 289
5 D:\CODAS\alles\ascii\d2000-0491.txt 6 KB 657


It's the data in the second column I want, which I can extract with De
Bruin's macro, but that macro puts all the data in separate workbooks or
multiple sheets.

This is what I need:
D:\CODAS\alles\ascii\d2000-1377.txt

And then I will use 'find and replace' on it to get only the name of the case:
d2000-1377.

Hope this helps.
 
R

Rick Rothstein \(MVP - VB\)

Sorry to hit you up with a couple more questions...

1. a) Will any of the data in the 2nd field of each of any of those
text files **ever** have an embedded space character
within it?
b) Are the multiple spaces (perhaps tab characters) separating
your fields of data as you showed in your response to me or
is there only one as you indicated in a response to Joel
elsewhere in this thread?

I need to know the above in order to always be able to find the 2nd field in
each piece of data for each text file.

2. The "numbers of the 173 identified factors" that you placed in Row 1...

a) Are there only 173 of them to correspond with your known 173
text files (or do you have all 235 of them listed for possible
future filling in)?

b) Are they actual numbers or are they text (with possibly more
descriptive text around them)?

My plan is to give you a macro that puts exactly what you are looking for
(you won't need to use search/replace to fix it up later on) into each
labeled columns, but I need to know exactly how you set up the Row 1
labeling so I can find the right column for the right text file's data.

If you can think of anything else about how you have your information laid
out, don't hesitate to tell us... the more information we have, the better
able we are to craft a solution for you. For example, the sample data you
showed us for the 2nd field was this...

D:\CODAS\alles\ascii\d2000-1377.txt

from which you want the d2000-1377.txt... is the d2000-1377 part that you
want to place in the columns always 10 characters long... for **every**
piece of data in **every** text file, or could it vary in length?

Rick
 
J

Jootje

1a) I think so. I am using converted html files that are now .txt files, when
I tried to import a txt file into excel with the import wizard I used
delimitered txt, and I used 'space' for delimiter. Btw I made a mistake, it's
the 3rd field/column I need, with no embedded spaces in it that I know of.

1b) With the converted files only one space

2a) The list I had is from 1-235; and the sum of all the found factors are
173; no future fill ins.

2b) I think the format is 'general' ; just numbers. Maybe you're thinking
about putting the name of the file there instead? The whole name that
is...This might be more accurate...??


d2000-1377)

yes, always 10 characters


Johanna
 
R

Rick Rothstein \(MVP - VB\)

Clarification please...

2a - Do you have all 235 numbers listed in Row 1?

2b - What do you mean by "the whole name", the path also?

Rick
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top