Create start and end number for 'blocks' of contiguous numbers

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I have a large table with one column containing numbers. After sorting the
column, I need to identify the start and end number for each set of
contiguous numbers. Can I do this in Access? If so, how? e.g.

1234 -
1235 - Block: StartNum=1234 EndNum=1236
1236 -
2001
3004
4995 - Block: StartNum=4995 EndNum=4996
4996 -
5320

I'd like to compare the number in one row to the number in the next row and
determine if the numbers are contiguous. For each group of two or more
contiguous numbers, a start and end number for that found 'block' would be
established.
 
Yes,
easiest/fastest way is thru sorted recordset(rs) call using VBA, where
you loop thru rs checking the the next record for contiguousness; if
true save previous value as start and then check for next noncontiguous
record and save as end value. Write or output value as needed.
if you prefer a query type result(probably slow and will req 4
queries):

a. create query that will have 2 field that uses 'dcount' function to
look for the +1 and -1 value of the number,therefore, if dcount of +1 =
1 and -1 = 0 then that is a start number while 0,1(respectively)
indicate end number,ex:
less: DCount("[Number]","tblNum",("[Number] = " & ([Number]-1)))
more: DCount("[Number]","tblNum",("[Number] = " & ([Number]+1)))

b. create 2 query, one that filter out the start and one to filter the
end value. Create a field in each query to count the number of
start/end values < the original number, ex:
assignnumber: DCount("[Number]","query created in a","[startend] =
'start' and [Number] < " & [number]); this will by default number the
query records

c. inner join the 2 queries in b using 'assignnumber' field
 
if you prefer a query type result(probably slow and will req 4
queries)

You are forgetting that MS Access/Jet can be wonderfully non-relational
at times. Such 'features' may be exploited as in the following *one*
query:

SELECT
T1.data_col,
(
SELECT COUNT(*) FROM Test WHERE data_col = T1.data_col - 1
) AS RowBefore,
(
SELECT COUNT(*) FROM Test WHERE data_col = T1.data_col + 1
) AS RowAfter,
Switch(
RowAfter AND NOT RowBefore,
'Block: StartNumber = ' & T1.data_col,
RowBefore AND RowAfter,
'Block',
RowBefore AND NOT RowAfter,
'Block: EndNumber = ' & T1.data_col
) AS description
FROM Test AS T1;

You wouldn't be able to get away with this in a 'real' relational
database <g>.

Jamie.

--
 
Sweet!
I like how you wrote it. Elegant way of doing my section a. Can you
extend it to pair the startnumber and endnumber in the same record (and
in one query)?(thus section b & c for me) I'm incline to say yes, but
not incline to try to do it.

Deb:
I looked at your question again and realized neither of us actually
produced what you are looking for. modification to section c or another
query will be needed to produce the results you are looking for.
 
Deb:
I looked at your question again and realized neither of us actually
produced what you are looking for. modification to section c or another
query will be needed to produce the results you are looking for.

For me, the OP's spec wasn't that clear e.g. annotation in the sample
data posted is a bit ambiguous: for 4995-4996 the annotation appears
beside the first number whereas for 1234-1236 the annotation appears
next to the second/middle value. I decided they essentially want to
identify blocks and their start and end numbers hence my more 'general'
solution.
my section a. Can you
extend it to pair the startnumber and endnumber in the same record (and
in one query)?
I'm incline to say yes, but
not incline to try to do it.

I like a challenge <g>. If you are interested in specifics, let's
assume for *every* number in the table they want to know the block's
start and end values, with nulls if the value is not in a block. In one
query? OK then, and this time a more 'relational' approach:

SELECT T2.data_col, (
SELECT MAX(StartValues.start_value)
FROM (
SELECT T1.data_col AS start_value
FROM Test AS T1
WHERE NOT EXISTS (
SELECT *
FROM Test
WHERE data_col = T1.data_col - 1
) AND EXISTS (
SELECT *
FROM Test
WHERE data_col = T1.data_col + 1
)
) AS StartValues
WHERE StartValues.start_value <= T2.data_col
AND (
EXISTS (
SELECT *
FROM Test
WHERE data_col = T2.data_col - 1
) OR EXISTS (
SELECT *
FROM Test
WHERE data_col = T2.data_col + 1
)
)
) AS start_value, (
SELECT MIN(EndValues.end_value)
FROM (
SELECT T1.data_col AS end_value
FROM Test AS T1
WHERE EXISTS (
SELECT *
FROM Test
WHERE data_col = T1.data_col - 1
) AND NOT EXISTS (
SELECT *
FROM Test
WHERE data_col = T1.data_col + 1
)
) AS EndValues
WHERE EndValues.end_value >= T2.data_col
AND (
EXISTS (
SELECT *
FROM Test
WHERE data_col = T2.data_col - 1
) OR EXISTS (
SELECT *
FROM Test
WHERE data_col = T2.data_col + 1
)
)
) AS end_value
FROM Test AS T2;

Jamie.

--
 
Thanks so much for your feedback. Haven't had a chance to look at the dialog
and suggestions provided.

Jamie - the 'general solution' you provided...I'd like to figure out the
sequel as I think it'll provide me with what I need. However, while I use
Access all the time, I'm not used to creating queries this involved and have
not used visual basic. So, I kind of need help with the basics.

1. It looks like three tables are used; Test, T1, T2. But, am confused as to
what these represent. I'm starting out with one table only - with one column
of numbers (Test).
2. Is this meant to be entered as SQL in one query?

I'm unsure of how to get started. So, if there's some feedback you can
provide that would help me utilize the SQL provided, I'd really appreciate it.
 
1. It looks like three tables are used; Test, T1, T2. But, am confused as to
what these represent. I'm starting out with one table only - with one column
of numbers (Test).

These are three *aliases* for the same table. He's including the Test
table three times in the query, calling one instance Test, the second
T1, the third T2. This is called a "Self Join" query, since the table
is being joined to itself; to do this, you need to assign a different
name to each instance of the table so you can refer to the fields
unambiguously.
2. Is this meant to be entered as SQL in one query?

Yes.

John W. Vinson[MVP]
 
Got it! Thanks so much!

John Vinson said:
These are three *aliases* for the same table. He's including the Test
table three times in the query, calling one instance Test, the second
T1, the third T2. This is called a "Self Join" query, since the table
is being joined to itself; to do this, you need to assign a different
name to each instance of the table so you can refer to the fields
unambiguously.


Yes.

John W. Vinson[MVP]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top