Splitting a Data-Pool

  • Thread starter Thread starter n3mesis
  • Start date Start date
N

n3mesis

Hi,

i'm from Germany, so i hope, i can explain my prob understandly.

I habe a Table with about 50 Entries (sometimes more or less) with Customer
and Orderqauntity.
Now i'll try to split this table into 2 tables with the half count of
Customers where in the summary having the best possible half of the summary
of the complete Pool.

I mean i'll get 50 Percent of the Count of Customers
The summary of their Qty must have by nearly the half of the complete Pool.

Mike
 
Nobody an Idea?
Hi,

i'm from Germany, so i hope, i can explain my prob understandly.

I habe a Table with about 50 Entries (sometimes more or less) with Customer
and Orderqauntity.
Now i'll try to split this table into 2 tables with the half count of
Customers where in the summary having the best possible half of the summary
of the complete Pool.

I mean i'll get 50 Percent of the Count of Customers
The summary of their Qty must have by nearly the half of the complete Pool.

Mike
 
n3mesis said:
i'm from Germany, so i hope, i can explain my prob understandly.

I habe a Table with about 50 Entries (sometimes more or less) with
Customer
and Orderqauntity.
Now i'll try to split this table into 2 tables with the half count of
Customers where in the summary having the best possible half of the
summary
of the complete Pool.

I mean i'll get 50 Percent of the Count of Customers
The summary of their Qty must have by nearly the half of the complete
Pool.
Hi Mike,

I think you want to add the Count
and SumOfQty together for each Customer,
sort by this "Weight" Desc,
and select TOP 50%.

divide-and-conquer

qryCntSumQty:

Select
Customer,
Count(*) As Cnt,
Sum(Qty) As SumQty,
Max(Cnt + SumQty) As Wght
FROM
yurtable
GROUP BY Customer;

then using above query:

Select TOP 50%
Customer,
Cnt,
SumQty
From
qryCntSumQty
ORDER BY Wght;
 
At time id do somethink like this, but its not flexible enought.

I have a limited number of work areas, and must the quantity of orders
reasonably divide.

Accepted, I have 2 work areas, would like I the quantity of orders to halve,
or for example 65%:35% divide. That's simple, but I must consider, which
can divide the number of branches also for example 50:50 or 60:40.

We assume, I have 45 branches, who get together 30,000 articles.

I would like now 20 branches, which get together about 17,000 articles, and
25 branches, which determine together 13,000 articles gotten.
The number of branches and hums those articles must be variable.
And i can't split der orders of a single branch.

The best one compilation of the branches is to determine ACCESS, so that the
result is as close as possible at the default.

That's what i need

Gary said:
i'm from Germany, so i hope, i can explain my prob understandly.
[quoted text clipped - 9 lines]
The summary of their Qty must have by nearly the half of the complete
Pool.

Hi Mike,

I think you want to add the Count
and SumOfQty together for each Customer,
sort by this "Weight" Desc,
and select TOP 50%.

divide-and-conquer

qryCntSumQty:

Select
Customer,
Count(*) As Cnt,
Sum(Qty) As SumQty,
Max(Cnt + SumQty) As Wght
FROM
yurtable
GROUP BY Customer;

then using above query:

Select TOP 50%
Customer,
Cnt,
SumQty
From
qryCntSumQty
ORDER BY Wght;
 
At time id do somethink like this, but its not flexible enought.

I have a limited number of work areas, and must the quantity of orders
reasonably divide.

Accepted, I have 2 work areas, would like I the quantity of orders to halve,
or for example 65%:35% divide. That's simple, but I must consider, which
can divide the number of branches also for example 50:50 or 60:40.

We assume, I have 45 branches, who get together 30,000 articles.

I would like now 20 branches, which get together about 17,000 articles, and
25 branches, which determine together 13,000 articles gotten.
The number of branches and hums those articles must be variable.
And i can't split der orders of a single branch.

The best one compilation of the branches is to determine ACCESS, so that the
result is as close as possible at the default.

That's what i need
[quoted text clipped - 31 lines]
qryCntSumQty
ORDER BY Wght;
 
Hi Mike,

Create 2 queries that are the same
(doesn't matter what they are) and save
as

qryTestAsc
qryTestDsc

Then create union query "qryTest"

SELECT * FROM qryTestAsc
UNION
SELECT * FROM qryTestDsc

start a new module, then copy and paste
following code into that module

'*** start of code ****
Option Explicit

Public Function fGetHalf(pTable As String, _
pNumField As String, _
pWithIn As Long) As Boolean
On Error GoTo Err_fGetHalf
Dim lngHalfSet As Long
Dim lngHalfTotals As Long
Dim lngTopAsc As Long
Dim lngTopDsc As Long
Dim strSQL As String
Dim lngTestTotal As Long
Dim lngTestCnt As Long
Dim lngCnt As Long

lngHalfSet = DCount("*", pTable) \ 2
lngHalfTotals = DSum(pNumField, pTable) \ 2

lngTopAsc = lngHalfSet \ 2 'small nums
lngTopDsc = lngHalfSet - lngTopAsc 'big nums
lngCnt = 0

Do While Abs(lngTestTotal - lngHalfTotals) > pWithIn And lngCnt < 20
'get big nums
strSQL = "SELECT TOP " & lngTopAsc & " " & pTable & ".* FROM " _
& pTable & " ORDER BY " & pTable & "." & pNumField
CurrentDb.QueryDefs("qryTestAsc").SQL = strSQL
'get small nums
strSQL = "SELECT TOP " & lngTopDsc & " " & pTable & ".* FROM " _
& pTable & " ORDER BY " & pTable & "." & pNumField & " DESC"
CurrentDb.QueryDefs("qryTestDsc").SQL = strSQL

'qryTest.SQL = "SELECT * FROM qryTestAsc " _
' UNION SELECT * FROM qryTestDsc"
lngTestTotal = DSum(pNumField, "qryTest")
lngTestCnt = DCount("*", "qryTest")

Debug.Print "TopAsc: " & lngTopAsc & ";TopDsc: " & lngTopDsc _
& ";SumTotal: " & lngTestTotal & ";TestCount: " & lngTestCnt

If lngTestTotal > lngHalfTotals Then
If lngTestCnt > lngHalfSet Then
'sum too big and cnt too big
lngTopAsc = lngTopAsc - 1
lngTopDsc = lngTopDsc - 1
Else
'sum too big and cnt too small
'--> more small nums, less big nums
lngTopAsc = lngTopAsc + 1
lngTopDsc = lngTopDsc - 1
End If
Else
If lngTestCnt > lngHalfSet Then
'sum too small and cnt too big
'--> more big nums, less small nums
lngTopAsc = lngTopAsc - 1
lngTopDsc = lngTopDsc + 1
Else
'sum too small and cnt too small
'--> more big and small nums
lngTopAsc = lngTopAsc + 1
lngTopDsc = lngTopDsc + 1
End If
End If

If lngTopAsc = 0 Then lngTopAsc = 1
If lngTopDsc = 0 Then lngTopDsc = 1

lngCnt = lngCnt + 1
Loop

'save result to tblGroup1
If TableExists("tblGroup1") Then
CurrentDb.Execute "DROP table tblGroup1", dbFailOnError
Else

End If

strSQL = "SELECT * INTO tblGroup1 FROM qryTest "
CurrentDb.Execute strSQL, dbFailOnError

fGetHalf = True

Exit_fGetHalf:
Exit Function
Err_fGetHalf:
MsgBox Err.Description
Resume Exit_fGetHalf
End Function

Public Function TableExists(strTableName As String) As Boolean
'from Joe Fallon
On Error Resume Next
TableExists = IsObject(CurrentDb.TableDefs(strTableName))
End Function
'***** code end ***********

save your module and run Debug/Compile

then in Immediate window, type something like:

?fgethalf("tblBranch","NumArticles", 3000)

Where I had a "tblBranch" with 26 records and the
sum of the field "NumArticles" was 80,000.

Here be results in Immediate Window

TopAsc: 6;TopDsc: 7;SumTotal: 53690;TestCount: 13
TopAsc: 7;TopDsc: 6;SumTotal: 48640;TestCount: 13
TopAsc: 8;TopDsc: 5;SumTotal: 43500;TestCount: 13
TopAsc: 9;TopDsc: 4;SumTotal: 37100;TestCount: 13
True

So, in "tblGroup1" I ended up with 13 records
and the sum of articles = 37,100.

Is that close enough?

To get "tblGroup2," just use query to weed out
all records that are in tblGroup1 from original
table.

I didn't give this a lot of thought, so you may see
some tweaks to the algorithm.

good luck,

gary





"n3mesis via AccessMonster.com"wrote:
At time id do somethink like this, but its not flexible enought.

I have a limited number of work areas, and must the quantity of orders
reasonably divide.

Accepted, I have 2 work areas, would like I the quantity of orders to
halve,
or for example 65%:35% divide. That's simple, but I must consider,
which
can divide the number of branches also for example 50:50 or 60:40.

We assume, I have 45 branches, who get together 30,000 articles.

I would like now 20 branches, which get together about 17,000 articles,
and
25 branches, which determine together 13,000 articles gotten.
The number of branches and hums those articles must be variable.
And i can't split der orders of a single branch.

The best one compilation of the branches is to determine ACCESS, so that
the
result is as close as possible at the default.

That's what i need

Gary said:
i'm from Germany, so i hope, i can explain my prob understandly.
[quoted text clipped - 9 lines]
The summary of their Qty must have by nearly the half of the complete
Pool.

Hi Mike,

I think you want to add the Count
and SumOfQty together for each Customer,
sort by this "Weight" Desc,
and select TOP 50%.

divide-and-conquer

qryCntSumQty:

Select
Customer,
Count(*) As Cnt,
Sum(Qty) As SumQty,
Max(Cnt + SumQty) As Wght
FROM
yurtable
GROUP BY Customer;

then using above query:

Select TOP 50%
Customer,
Cnt,
SumQty
From
qryCntSumQty
ORDER BY Wght;
 
was playing around...

may not work depending on distribution of your data...

Public Function fGetRndHalf(pTable As String, _
pGroupField As String, _
pSumField As String, _
pLoopIterations As Long, _
pWithIn As Long) As Boolean
On Error GoTo Err_fGetRndHalf
'Given a set of data, going to assume can randomly
'choose half of the set over many cycles, and come
'up with a sum of the number field within that set
'that comes close to half the sum of the entire set.
'
'will first need to create:
'1) tblRandHalf (3 fields)
' Cycle - long = where at in loop (lngCnt)
' GroupField - text = pGroupField
' SumField - long = pSumField
'
'2) qryRndHalfSum
' can be any query since it gets redefined in function
'call example from Immediate Window for a "tblBranch":
'?fgetrndhalf("tblBranch","BranchID","NumArticles",100, 1000)

Dim lngFullSetCnt As Long
Dim lngFullSetSum As Long
Dim lngHalfSetCnt As Long
Dim lngHalfSetSum As Long
Dim lngTestSum As Long
Dim strSQL As String
Dim lngCnt As Long

DoCmd.Hourglass True

lngFullSetCnt = DCount("*", pTable)
lngFullSetSum = DSum(pSumField, pTable)

lngHalfSetCnt = lngFullSetCnt \ 2
lngHalfSetSum = lngFullSetSum \ 2

Randomize


CurrentDb.Execute "DELETE * FROM tblRndHalf", dbFailOnError

For lngCnt = 1 To pLoopIterations
'randomly get lngHalfSetCnt number of records from pTable
' and save them in tblRndHalf w/ associated loop count
strSQL = "INSERT INTO tblRndHalf (Cycle, GroupField, SumField) " _
& "SELECT TOP " & lngHalfSetCnt & " " & lngCnt _
& ", " & pGroupField & ", " & pSumField & " " _
& "FROM " & pTable & " ORDER BY RND([" & pSumField & "])"
CurrentDb.Execute strSQL, dbFailOnError

lngTestSum = DSum("SumField", "tblRndHalf", "[Cycle]=" & lngCnt)

'did we randomly select a set that comes within goal
If Abs(lngTestSum - lngHalfSetSum) <= pWithIn Then
'YES! so save group to tblGroup1
If TableExists("tblGroup1") Then
CurrentDb.Execute "DROP table tblGroup1", dbFailOnError
End If

strSQL = "SELECT " & pGroupField & ", " & pSumField _
& " INTO tblGroup1 FROM " & pTable _
& " WHERE CStr(" & pGroupField & " & '') IN " _
& "(SELECT GroupField FROM tblRndHalf WHERE [Cycle]= " &
lngCnt & ");"
'Debug.Print strSQL
CurrentDb.Execute strSQL, dbFailOnError

fGetRndHalf = True

MsgBox "Successfully found set after " & lngCnt & " loops whose
sum" _
& vbCrLf & "was within " & pWithIn & " of half the full
set's sum" _
& vbCrLf & "and saved the set in tblGroup1."

'exit function
GoTo Exit_fGetRndHalf
End If
Next lngCnt

'if did not meet goal after pLoopIterations,
'choose set that comes closest to half a set.
'Example: if lngHalfSetSum were equal to 40000,
'we can identify the Cycle that came closest by:
'SELECT TOP 1 tblRndHalf.Cycle
'FROM tblRndHalf
'GROUP BY tblRndHalf.Cycle
'ORDER BY Abs(40000-Sum([tblRndHalf].[SumField]));

strSQL = "SELECT TOP 1 Cycle FROM tblRndHalf GROUP BY Cycle " _
& "ORDER BY Abs(" & lngHalfSetSum & " -
Sum([tblRndHalf].[SumField]));"
CurrentDb.QueryDefs("qryRndHalfSum").SQL = strSQL

'save set for this cycle in tblGroup1
If TableExists("tblGroup1") Then
CurrentDb.Execute "DROP table tblGroup1", dbFailOnError
End If

lngCnt = DLookup("Cycle", "qryRndHalfSum")
strSQL = "SELECT " & pGroupField & ", " & pSumField _
& " INTO tblGroup1 FROM " & pTable _
& " WHERE CStr(" & pGroupField & " & '') IN " _
& "(SELECT GroupField FROM tblRndHalf WHERE [Cycle]= " & lngCnt
& ");"
'Debug.Print strSQL
CurrentDb.Execute strSQL, dbFailOnError

lngTestSum = DSum("SumField", "tblRndHalf", "[Cycle]=" & lngCnt)

MsgBox "Could only find a random set whose sum " _
& vbCrLf & "was within " & Abs(lngTestSum - lngHalfSetSum) & "
of half the full set's sum" _
& vbCrLf & "but saved the set in tblGroup1."

fGetRndHalf = True

Exit_fGetRndHalf:
DoCmd.Hourglass False
Exit Function

Err_fGetRndHalf:
MsgBox Err.Description
Resume Exit_fGetRndHalf


End Function
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top