debug repeat entries with a random function and erase the rest

D

drurjen

Good morning. I have a txt file with 4 comma delimited fields: Date,
ID, Time and I/O. The fields are generated by a biometric time
attendance system. The main problem I have is debugging repeat entries.
For example,

If an employee punches in twice, I'd have the following:

11/27/2006 16 16:31:27 2
11/27/2006 16 16:32:21 2

What I want is to select a random entry of these within a 5 minute
interval and erase the rest. Any ideas on how I might achieve this? Any
better ideas on how to accomplish this?

Cheers and thx.
 
D

drurjen

Alok,

Thank you for the reply. I tried the code, but it seems to erase all
non-unique files and not just the ones who are the same within a 5
minute interval.

The following file:
http://hacktegus.com/test.csv

results in:

11/27/2006 16 1/0/1900 2
11/27/2006 27 1/0/1900 1
12/4/2006 31 1/0/1900 1
12/4/2006 36 1/0/1900 2
11/27/2006 45 1/0/1900 2
11/27/2006 64 1/0/1900 1
12/4/2006 73 1/0/1900 1
11/27/2006 95 1/0/1900 1
11/27/2006 97 1/0/1900 1
11/27/2006 149 1/0/1900 1
12/4/2006 152 1/0/1900 2
12/4/2006 182 1/0/1900 1
12/4/2006 193 1/0/1900 1
12/4/2006 194 1/0/1900 2
12/4/2006 262 1/0/1900 1
11/27/2006 366 1/0/1900 1
11/27/2006 384 1/0/1900 1
12/4/2006 466 1/0/1900 1
11/27/2006 472 1/0/1900 1
12/4/2006 494 1/0/1900 1
11/27/2006 586 1/0/1900 1
11/27/2006 636 1/0/1900 1
11/27/2006 696 1/0/1900 1
11/27/2006 699 1/0/1900 1
12/4/2006 754 1/0/1900 1
12/4/2006 971 1/0/1900 1
12/4/2006 984 1/0/1900 1
11/27/2006 996 1/0/1900 1
12/4/2006 1001 1/0/1900 2
11/27/2006 1006 1/0/1900 2
11/27/2006 1023 1/0/1900 1
12/4/2006 1033 1/0/1900 1

*I am using msado15.dll

Cheers mate.
 
D

drurjen

Here's how to delete rows with empty cells:
http://www.excelforum.com/showthread.php?threadid=543108

Thx Alok! Code worked awesomely!!!
Hi drurjen
I misunderstood the problem.
Given what you want to do, we cannot get the desired result using a query.
One has to step through individual records.
Here is the code that does so.
Just make sure that all records have all the items of data. I noticed some
have missing IO values

Sub Test()
Dim cn As ADODB.Connection
Dim rs As ADODB.Recordset
Dim strSQL$
Dim vFileAndFolder
Dim sFolder$, sFile$

Dim dOldDate As Date
Dim wOldID As Integer
Dim dOldTime As Date
Dim wOldIO As Integer
Dim bWriteOut As Boolean
Dim lRow As Long
Dim dTimeDiff As Double

'Convert the time difference in a double
dTimeDiff = 5 / (24 * 60)

vFileAndFolder = Application.GetOpenFilename()
If TypeName(vFileAndFolder) = "String" Then
sFolder = Mid(vFileAndFolder, 1, InStrRev(vFileAndFolder, "\"))
sFile = Mid(vFileAndFolder, InStrRev(vFileAndFolder, "\") + 1)

Set cn = New ADODB.Connection
On Error Resume Next
cn.Open "Driver={Microsoft Text Driver (*.txt; *.csv)};" & _
"Dbq=" & sFolder & ";" & _
"Extensions=asc,csv,tab,txt;"
On Error GoTo 0
If cn.State = ADODB.adStateOpen Then
Set rs = New ADODB.Recordset
strSQL = "Select [Date],ID,[Time],[IO] from " & sFile
rs.Open strSQL, cn, adOpenForwardOnly, adLockReadOnly, adCmdText


'Sheet1.Cells(1, 1).CopyFromRecordset rs
lRow = 1
Do While Not rs.EOF
bWriteOut = False
If wOldID = rs("ID").Value And wOldIO = rs("IO").Value Then
If dOldDate = rs("Date").Value Then
If rs("Time").Value - dOldTime > dTimeDiff Then
bWriteOut = True
End If
ElseIf dOldDate < rs("Date").Value Then
'One full day is equivalent to 1.00
If rs("Time").Value + 1 - dOldTime > dTimeDiff Then
bWriteOut = True
End If
End If
Else
bWriteOut = True
End If
If bWriteOut Then
With Sheet1.Range(Sheet1.Cells(lRow, 1),
Sheet1.Cells(lRow, 4))
.Value = Array(rs("Date").Value, rs("ID").Value,
rs("Time").Value, rs("IO").Value)
End With
lRow = lRow + 1
End If
On Error Resume Next
wOldID = rs("ID").Value
wOldIO = rs("IO").Value
dOldDate = rs("Date").Value
dOldTime = rs("Time").Value
On Error GoTo 0
rs.MoveNext
Loop
rs.Close
Set rs = Nothing
cn.Close
Set cn = Nothing
End If
End If
End Sub

Alok


drurjen said:
Alok,

Thank you for the reply. I tried the code, but it seems to erase all
non-unique files and not just the ones who are the same within a 5
minute interval.

The following file:
http://hacktegus.com/test.csv

results in:

11/27/2006 16 1/0/1900 2
11/27/2006 27 1/0/1900 1
12/4/2006 31 1/0/1900 1
12/4/2006 36 1/0/1900 2
11/27/2006 45 1/0/1900 2
11/27/2006 64 1/0/1900 1
12/4/2006 73 1/0/1900 1
11/27/2006 95 1/0/1900 1
11/27/2006 97 1/0/1900 1
11/27/2006 149 1/0/1900 1
12/4/2006 152 1/0/1900 2
12/4/2006 182 1/0/1900 1
12/4/2006 193 1/0/1900 1
12/4/2006 194 1/0/1900 2
12/4/2006 262 1/0/1900 1
11/27/2006 366 1/0/1900 1
11/27/2006 384 1/0/1900 1
12/4/2006 466 1/0/1900 1
11/27/2006 472 1/0/1900 1
12/4/2006 494 1/0/1900 1
11/27/2006 586 1/0/1900 1
11/27/2006 636 1/0/1900 1
11/27/2006 696 1/0/1900 1
11/27/2006 699 1/0/1900 1
12/4/2006 754 1/0/1900 1
12/4/2006 971 1/0/1900 1
12/4/2006 984 1/0/1900 1
11/27/2006 996 1/0/1900 1
12/4/2006 1001 1/0/1900 2
11/27/2006 1006 1/0/1900 2
11/27/2006 1023 1/0/1900 1
12/4/2006 1033 1/0/1900 1

*I am using msado15.dll

Cheers mate.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top