Bulk Row Deletion - Fastest method

Guest

I have tried several different methods of deleting a lot of rows quickly.

I use test data in a single Excel sheet containing 54,102 rows by 22
columns. This data contained 24,266 rows targeted for deletion that were
interspersed within the data (all based on actual events routinely
encountered).

The fastest method I have found in benchmarking tests is to:

1. Use a "For...Each" loop to mark the rows for deletion in the sheet
2. Sort the data so that all the target rows are together
3. Delete all the target rows at the same time
4. If necessary, re-sort the data

This ran fastest (6.5 minutes - which still seems like a long time when you
have to do it repeatedly). Does anyone know of a faster method? If so, could
you please post your method? FYI, other methods I tried:

1. Copy target row numbers into an array, then step backward through the
array and delete the rows.

2. Filter the target rows, then delete the visible range.

3. Step backward through all rows in the sheet and immediately delete any
target rows encountered.

These are all the methods I could think of. Anyone else have any ideas to
shave some time off of this process?

Thanks much in advance.
 
Hi
The following may be faster:
Assuming data starting in column B, row 2:
0. Turn ScreenUpdating off.
1. In column A, put a formula testing whether the row needs to be deleted.
Return a number if Yes (e.g. 1), a string if No (e.g. "A").
(Returning a number for Yes and a string for No makes them easier to find
later.)
2. From column A, using SpecialCells, get the cells that contain a formula
AND return a number. They are the ones you want to delete.
3. Extend to the whole rows and delete them in one shot.

Code:
--------------------------------------------------------------------
Sub Macro()
Dim rgTest As Range 'where the formula will go
Dim rgToDelete As Range 'cells of rgTest to be deleted
Dim strF As String 'the test formula

Set rgTest = Application.Intersect(Range("A2:A65536"), _
    ActiveSheet.UsedRange.EntireRow)
strF = "=IF(AND(B2=""A"",C2=1),1,""A"")" 'condition: B2="A" AND C2=1

Application.ScreenUpdating = False
rgTest.Formula = strF 'Excel adjusts the relative references for each row
Set rgToDelete = rgTest.SpecialCells(xlCellTypeFormulas, 1) 'formulas whose result is a number
Set rgToDelete = rgToDelete.EntireRow 'extend to whole rows
rgToDelete.Delete
Application.ScreenUpdating = True
End Sub
'----------------------------------------------------------

Just change the first 2 lines:
Set rgTest = ...
strF = ...
to the right column to place the formula (if column A is not available) and
to the proper formula.

I hope this helps
Regards,
Sebastien
 
Another option is to use AutoFilter and delete visible cells only via code (or
even manually).
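
A minimal sketch of that AutoFilter route (the data block starting at A1 with a
header row, and the "column C equals 1" criterion, are placeholders for your
real layout and deletion test):

Code:
--------------------------------------------------------------------
Sub DeleteViaAutoFilter()
    'Sketch only: assumes headers in row 1 and that rows where column C
    'shows 1 are the ones to remove - adjust Field and Criteria1 to suit.
    Dim rngData As Range
    Set rngData = ActiveSheet.Range("A1").CurrentRegion

    Application.ScreenUpdating = False
    rngData.AutoFilter Field:=3, Criteria1:="1"  'show only the doomed rows

    On Error Resume Next  'SpecialCells raises 1004 if nothing is visible
    rngData.Offset(1).Resize(rngData.Rows.Count - 1) _
        .SpecialCells(xlCellTypeVisible).EntireRow.Delete
    On Error GoTo 0

    ActiveSheet.AutoFilterMode = False  'drop the filter
    Application.ScreenUpdating = True
End Sub
--------------------------------------------------------------------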
 
Maybe it would be quicker to create a range object containing all the
targeted rows and then delete them in one hit?
 
OP is talking 8000 rows plus..
Unions get mighty slow if the area count goes above 400 or so..

although he could test for area count and "flush" the union
when the count goes over the threshold.
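
A rough sketch of that flushing idea (the 400-area threshold, the column C
test, and the bottom-up loop are illustrative assumptions, not benchmarked
values):

Code:
--------------------------------------------------------------------
Sub DeleteViaFlushedUnion()
    'Sketch only: builds a Union of target rows bottom-up, but deletes and
    'restarts it whenever the area count passes a threshold so Union never
    'bogs down.
    Const MAX_AREAS As Long = 400
    Dim lastRow As Long, r As Long
    Dim rngKill As Range

    Application.ScreenUpdating = False
    lastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "B").End(xlUp).Row

    For r = lastRow To 2 Step -1  'bottom-up, so rows above keep their numbers
        If ActiveSheet.Cells(r, "C").Value = 1 Then  'placeholder delete test
            If rngKill Is Nothing Then
                Set rngKill = ActiveSheet.Rows(r)
            Else
                Set rngKill = Union(rngKill, ActiveSheet.Rows(r))
            End If
            If rngKill.Areas.Count >= MAX_AREAS Then
                rngKill.Delete        'flush this batch...
                Set rngKill = Nothing '...and start a fresh Union
            End If
        End If
    Next r

    If Not rngKill Is Nothing Then rngKill.Delete
    Application.ScreenUpdating = True
End Sub
--------------------------------------------------------------------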


--
keepITcool
| www.XLsupport.com | keepITcool chello nl | amsterdam


 
Hi Quartz,
The fastest method I have found in benchmarking tests is to:

1. Use a "For...Each" loop to mark the rows for deletion in the sheet
2. Sort the data so that all the target rows are together
3. Delete all the target rows at the same time
4. If necessary, re-sort the data

I would do the following:

1. Put a formula in all the cells to identify which ones to delete,
0=keep, 1=delete
2. Sort the data by that column, leaving the '1's at the bottom
3. Count the '1's column, which gives the number to delete
4. Delete all the '1' rows in one go.
5. Clear the column of formulas

There's no need for a re-sort, as all the 'kept' data will be in its
original order.
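
A rough sketch of those five steps (column W as the helper column, a 22-column
data block in A:V with headers in row 1, and the B/C condition are all
assumptions to adapt):

Code:
--------------------------------------------------------------------
Sub DeleteViaSortAndChop()
    'Sketch only: column W is the helper column (0 = keep, 1 = delete),
    'and the B/C condition is a placeholder for the real deletion test.
    Dim ws As Worksheet, lastRow As Long, nDelete As Long
    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    Application.ScreenUpdating = False

    '1. flag each row (optionally convert to values before sorting)
    ws.Range("W2:W" & lastRow).Formula = "=IF(AND(B2=""A"",C2=1),1,0)"

    '2. sort by the flag column so the 1s sink to the bottom
    ws.Range("A1:W" & lastRow).Sort Key1:=ws.Range("W2"), _
        Order1:=xlAscending, Header:=xlYes

    '3. count the flagged rows
    nDelete = Application.WorksheetFunction.Sum(ws.Range("W2:W" & lastRow))

    '4. delete them all in one go from the bottom of the block
    If nDelete > 0 Then
        ws.Rows((lastRow - nDelete + 1) & ":" & lastRow).Delete
    End If

    '5. clear the helper column
    ws.Range("W2:W" & lastRow).ClearContents

    Application.ScreenUpdating = True
End Sub
--------------------------------------------------------------------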

Regards

Stephen Bullen
Microsoft MVP - Excel
www.oaltd.co.uk
 
quartz said:
I have tried several different methods of deleting a lot of rows quickly.

I use test data in a single Excel sheet containing 54,102 rows by 22
columns. This data contained 24,266 rows targeted for deletion

The fastest method I have found in benchmarking tests is to... use
a "For...Each" loop to mark the rows for deletion in the sheet <snip>

This ran fastest (6.5 minutes - which still seems like a long time when you
have to do it repeatedly). Does anyone know of a faster method?

Try this approach:

- copy the sheet to a new blank workbook;
- save and close this temp workbook;
- use a SQL query on the closed temp workbook to create an in-memory
ADO recordset of the data you want to *keep* (if you don't specify an
ORDER BY clause then the existing sort order will be retained);
- delete all the data in the original sheet;
- use CopyFromRecordset to populate the original sheet using the
retained data.

This should easily run in under 6.5 seconds... I'm sorry, did you say
*minutes*?!
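
A bare-bones, era-appropriate sketch of that sequence using late-bound ADO and
the Jet provider (the temp path, the [Sheet1$] table name, and the Status/Flag
column names in the WHERE clause are placeholders; newer Excel versions would
use the ACE provider and an .xlsx temp file instead):

Code:
--------------------------------------------------------------------
Sub DeleteViaADO()
    'Sketch only: copy the sheet out, query the closed copy for the rows
    'to KEEP, then repopulate the original sheet from the recordset.
    Dim ws As Worksheet, sTemp As String
    Dim cn As Object, rs As Object

    Set ws = ActiveSheet
    sTemp = Environ("TEMP") & "\RowDeleteTemp.xls"  'placeholder temp file

    'copy the sheet to a new workbook, save and close it
    ws.Copy
    Application.DisplayAlerts = False               'suppress overwrite prompt
    ActiveWorkbook.SaveAs sTemp
    ActiveWorkbook.Close SaveChanges:=False
    Application.DisplayAlerts = True

    'query the closed copy; no ORDER BY, so the existing order is kept
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & sTemp & _
            ";Extended Properties=""Excel 8.0;HDR=Yes"""
    Set rs = CreateObject("ADODB.Recordset")
    'placeholder WHERE clause: select everything you do NOT want deleted
    rs.Open "SELECT * FROM [Sheet1$] WHERE NOT (Status = 'A' AND Flag = 1)", cn

    'wipe the old data below the headers and pour the kept rows back in
    ws.Range("A2", ws.Cells(ws.Rows.Count, ws.Columns.Count)).ClearContents
    ws.Range("A2").CopyFromRecordset rs

    rs.Close
    cn.Close
    Kill sTemp                                      'tidy up the temp workbook
End Sub
--------------------------------------------------------------------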

Jamie.

 
Greetings quartz

quartz wrote on 09.11.2004
I have tried several different methods of deleting a lot of rows quickly.

I use test data in a single Excel sheet containing 54,102 rows by 22
columns. This data contained 24,266 rows targeted for deletion that were
interspersed within the data (all based on actual events routinely
encountered).

The fastest method I have found in benchmarking tests is to:

1. Use a "For...Each" loop to mark the rows for deletion in the sheet
2. Sort the data so that all the target rows are together
3. Delete all the target rows at the same time
4. If necessary, re-sort the data

How about an advanced filter to another worksheet and delete the old one?

In the criteria fields you define the values you want to exclude.
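
A minimal sketch of that advanced-filter route (the Z1:Z2 criteria range, the
computed B/C exclusion formula, and the header row in row 1 are assumptions):

Code:
--------------------------------------------------------------------
Sub DeleteViaAdvancedFilter()
    'Sketch only: a computed criterion keeps every row that does NOT meet
    'the deletion test, the survivors are copied to a new sheet and the old
    'sheet is dropped.
    Dim wsSrc As Worksheet, wsOut As Worksheet

    Set wsSrc = ActiveSheet
    Set wsOut = Worksheets.Add(After:=wsSrc)

    'computed criterion: blank header cell plus a formula written against row 2
    wsSrc.Range("Z1").ClearContents
    wsSrc.Range("Z2").Formula = "=NOT(AND(B2=""A"",C2=1))"

    wsSrc.Range("A1").CurrentRegion.AdvancedFilter _
        Action:=xlFilterCopy, _
        CriteriaRange:=wsSrc.Range("Z1:Z2"), _
        CopyToRange:=wsOut.Range("A1"), _
        Unique:=False

    Application.DisplayAlerts = False  'suppress the "delete sheet?" prompt
    wsSrc.Delete
    Application.DisplayAlerts = True
End Sub
--------------------------------------------------------------------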

--
Regards

Thomas Ramel
- MVP for Microsoft-Excel -

[Win XP Pro SP-1 / xl2000 SP-3]
 
I think you have since received solutions you are more likely to adopt, but
a quick comment about SpecialCells (SC), particularly in a sheet as large as
yours.

If SC is expected to return a little over 8,000 non-contiguous areas, it
won't return any, nor will it alert you with an error. With your 50k
rows that could be a scenario. Also, but not serious, SC is relatively slow
to return more than a few thousand areas, albeit significantly faster than
looping and building up a range object.

If you are going to use SC in your sheet for data in a single column, do it
in chunks of at most 16k rows.
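
A rough sketch of that chunking, bolted onto the flag-column idea above (the
16,000-row block size, the flag formulas in column A, and the bottom-up order
are assumptions):

Code:
--------------------------------------------------------------------
Sub DeleteViaChunkedSpecialCells()
    'Sketch only: walks the flag column (formulas in column A, numeric
    'result = delete) from the bottom up in 16k-row blocks, so each
    'SpecialCells call stays well below the area limit.
    Const CHUNK As Long = 16000
    Dim lastRow As Long, startRow As Long, endRow As Long
    Dim rngHit As Range

    Application.ScreenUpdating = False
    lastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row

    endRow = lastRow
    Do While endRow >= 2
        startRow = endRow - CHUNK + 1
        If startRow < 2 Then startRow = 2

        Set rngHit = Nothing
        On Error Resume Next  'SpecialCells raises 1004 when a block has no hits
        Set rngHit = ActiveSheet.Range("A" & startRow & ":A" & endRow) _
            .SpecialCells(xlCellTypeFormulas, xlNumbers)
        On Error GoTo 0

        If Not rngHit Is Nothing Then rngHit.EntireRow.Delete

        endRow = startRow - 1  'deleting below never shifts the rows above
    Loop

    Application.ScreenUpdating = True
End Sub
--------------------------------------------------------------------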

Regards,
Peter
 
Peter,

it's far worse than NOT returning a range, or returning NOTHING,
or AT LEAST returning/raising an error!

IT WILL return a range...of the entire area!

btw the exact limit is 8192 areas
 
Hi keepITcool,
it's far worse than NOT returning a range, or returning NOTHING,
or AT LEAST returning/raising an error!

IT WILL return a range...of the entire area!

Indeed, I phrased this point very poorly, thanks for clarifying.
btw the exact limit is 8192 areas

I recall Norman Jones demonstrating the limit could be 8191 and perhaps
less. Hence my vague "a little over 8,000".

Regards,
Peter
 