Help with sorting and conpacting data

  • Thread starter Thread starter Dan
  • Start date Start date
D

Dan

Hi guys (and maybe gals?)

I've written a couple routines that allow me to import text data from a file
and place each individual row from the text file into cells(row,1) till the
file ends
i end up with about 30,000+ entries. I've written an algorythm that then
scans the cells and removes the ones that don't fall into a specific
criteria. This alone takes about a minute and a half. From this point i have
to remove all duplicate entries and keep a counter in the adjacent cell.of
the original item.
for example

red apples 1 ===> red apples 2
red apples 1 green apples 1
green apples 1

The way that i have done this is by sorting the list after all the list is
created from the original file and then running throught the list on row at
a time using a while loop when an item is found to sheck for dupes. if the
next item in the list is the same as the current item, it deletes the next
item, ups the surrent items value by one, and checks the next.... and so on.

While this does exactly what i need, it takes it a few mins to complete.
Since i need to do this entire process 7 times in one setting, it's already
going to take at least 10 mins just to clean the junk out the text files. it
would be nice to be able to speed up t least one of these processes.

Anyone have any info on alternatives to this? The information doen't have to
be on a worksheet so I don't mind working with arrays or collections.

Any help would be great. Thanks alot!

Dan Butler
(e-mail address removed)
 
I checked out Chip's site. His method does basically the same as mine. I
think ideally i owuld like to skip the sorting and duplicate compacting
process all together if possible.
When my macro removes the bogus junk from the text file ite sets cell1 with
a name, such as red apples, and cell 2 with a value of 1. In the end, there
may be 50 "red apples" all with a value of one ehich then is checked with a
for loop and ompacted to give | red apples | 50 |
I know with collections you can't have duplicates would it be possible to
maybe add the name into a collection not allowing duplicates. is there a
fast way of keeping track? I dunno i'm searching for anything now :)

thanks
Dan Butler
(e-mail address removed)
 
Would it be feasible to create a pivot table from this data then copy
and paste values to leave just the summarised data?
 
or using subtotals & selecting visible cells only & copying and pasting
these?

i'm really not sure how this solution will work on 30000 rows timewise
though

JulieD
 
The dictionary object from the scripting library offers some
functionality that looks suited to this problem. Here's some untested
code that may help:


Public Sub LoadTextFile(ByVal Filename As String)
Dim dicLoad As Scripting.Dictionary
Dim intFileNum As Integer
Dim strLine As String

Set dicLoad = New Scripting.Dictionary
intFileNum = FreeFile()
Open Filename For Input As #intFileNum

Do
Line Input #intFileNum, strLine
If dicLoad.Exists(strLine) Then
dicLoad.Item(strLine) = dicLoad.Item(strLine) + 1
Else
dicLoad.Add strLine, 1
End If
Loop Until EOF(intFileNum)
Close #intFileNum

Sheet1.Cells(1, 1).Resize(, dicLoad.Count).Value = dicLoad.Keys
Sheet1.Cells(2, 1).Resize(, dicLoad.Count).Value = dicLoad.Items
Set dicLoad = Nothing
End Sub


HTH, Thomas.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top