Query of Unique Folders

A

AGP

I have a database with a field that contains a string with folder paths like
so:

[fldPath]
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderB\

I'm trying to formulate a query (based against an MS Access database) which
will give me the number of unique folders. In my example above I should get
back 6 unique folders or nodes.

folderA\
folderB\
folderC\
folder1\
folderA\
folderB\

I'm going to do a bit of research but if anyone has tackled something like
this I would appreciate any hints.

AGP
 
A

AGP

Thanks but this would only give me the unique strings and not the actual
count of the unioque nodes. I suspect that yes i will have to use DISTINCT
but just using it on the filed will yield
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderB\

A count of 4 distinct strings. however what I need is the count of the
actual unique nodes which is 6. Remeber that the strings are paths made up
of folder nodes.

AGP

Roger Carlson said:
SELECT DISTINCT fldPath from tblFolders;

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L

AGP said:
I have a database with a field that contains a string with folder paths
like so:

[fldPath]
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderB\

I'm trying to formulate a query (based against an MS Access database)
which will give me the number of unique folders. In my example above I
should get back 6 unique folders or nodes.

folderA\
folderB\
folderC\
folder1\
folderA\
folderB\

I'm going to do a bit of research but if anyone has tackled something
like this I would appreciate any hints.

AGP
 
R

Roger Carlson

Sorry, I didn't read carefully enough.

You cannot QUERY this information from your database. The reason is you are
storing multiple values in each field and so your data is not normalized.
You would have to write code to parse each string, keep track of each node
(perhaps in an array), but it can't be done in SQL, as far as I can see.

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L


AGP said:
Thanks but this would only give me the unique strings and not the actual
count of the unioque nodes. I suspect that yes i will have to use DISTINCT
but just using it on the filed will yield
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderB\

A count of 4 distinct strings. however what I need is the count of the
actual unique nodes which is 6. Remeber that the strings are paths made up
of folder nodes.

AGP

Roger Carlson said:
SELECT DISTINCT fldPath from tblFolders;

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L

AGP said:
I have a database with a field that contains a string with folder paths
like so:

[fldPath]
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderB\

I'm trying to formulate a query (based against an MS Access database)
which will give me the number of unique folders. In my example above I
should get back 6 unique folders or nodes.

folderA\
folderB\
folderC\
folder1\
folderA\
folderB\

I'm going to do a bit of research but if anyone has tackled something
like this I would appreciate any hints.

AGP
 
A

AGP

Hmm, yeah i was struggling how I could this in one SQL pass but it seems
almost impossible. Like you said I think my best bet would be to iterate
through the table first by using DISTINCT to narrow down the data and then
code a routine to cycle through the data and calculate the folder nodes.

Thanx
AGP


Roger Carlson said:
Sorry, I didn't read carefully enough.

You cannot QUERY this information from your database. The reason is you
are storing multiple values in each field and so your data is not
normalized. You would have to write code to parse each string, keep track
of each node (perhaps in an array), but it can't be done in SQL, as far as
I can see.

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L


AGP said:
Thanks but this would only give me the unique strings and not the actual
count of the unioque nodes. I suspect that yes i will have to use
DISTINCT but just using it on the filed will yield
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderB\

A count of 4 distinct strings. however what I need is the count of the
actual unique nodes which is 6. Remeber that the strings are paths made
up of folder nodes.

AGP

Roger Carlson said:
SELECT DISTINCT fldPath from tblFolders;

--
--Roger Carlson
MS Access MVP
Access Database Samples: www.rogersaccesslibrary.com
Want answers to your Access questions in your Email?
Free subscription:
http://peach.ease.lsoft.com/scripts/wa.exe?SUBED1=ACCESS-L

I have a database with a field that contains a string with folder paths
like so:

[fldPath]
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderA\
c:\folderA\folderB\folderC\folder1\folderB\

I'm trying to formulate a query (based against an MS Access database)
which will give me the number of unique folders. In my example above I
should get back 6 unique folders or nodes.

folderA\
folderB\
folderC\
folder1\
folderA\
folderB\

I'm going to do a bit of research but if anyone has tackled something
like this I would appreciate any hints.

AGP
 
J

James A. Fortune

AGP said:
Hmm, yeah i was struggling how I could this in one SQL pass but it seems
almost impossible. Like you said I think my best bet would be to iterate
through the table first by using DISTINCT to narrow down the data and then
code a routine to cycle through the data and calculate the folder nodes.

Thanx
AGP

That's a fun problem. I tried a way to do the folder node parsing. I
used the following recursive module code:

Public Function GetNthParentDirectory(strPath As String, intGenerations
As Integer) As String
GetNthParentDirectory = ""
If strPath = "" Then Exit Function
If intGenerations = 0 Then
GetNthParentDirectory = strPath
ElseIf intGenerations = 1 Then
GetNthParentDirectory = Get1stParentDirectory(strPath)
Else
GetNthParentDirectory =
GetNthParentDirectory(Get1stParentDirectory(strPath), intGenerations - 1)
End If
End Function

Get1stParentDirectory is a separate function that gets the direct parent
unless it's "C:\"; in that case it returns "" since your example doesn't
count it (the function is left as an exercise or to Google).

along with the following tables:

tblIntegers
ID Autonumber
theInt Long
ID theInt
1 1
2 2
3 3
4 4
5 5
6 6
.... up to at least one more than the maximum number of levels of proper
subdirectories

tblUniquePaths
USID AutoNumber
DirectoryPath Text
USID DirectoryPath
1 C:\folderA\folderB\folderC\
2 C:\folderA\folderB\folderC\folder1\
3 C:\folderA\folderB\folderC\folder1\folderA\
4 C:\folderA\folderB\folderC\folder1\folderB\

and the following SQL:

qryFamilyHistory:
SELECT tblUniquePaths.DirectoryPath, tblIntegers.theInt-1 AS
nGenerations, GetNthParentDirectory(DirectoryPath,nGenerations) AS
TheFamilyHistory FROM tblUniquePaths, tblIntegers WHERE
tblIntegers.theInt<=6;

qryDistinctFamilyHistory:
SELECT DISTINCT TheFamilyHistory AS DistinctFamilyHistory FROM
qryFamilyHistory WHERE TheFamilyHistory <> "";

qryCountDistinctNodes:
SELECT Count(DistinctFamilyHistory) AS NumberOfDistinctNodes FROM
qryDistinctFamilyHistory;

!qryFamilyHistory:
DirectoryPath nGenerations TheFamilyHistory
C:\folderA\folderB\folderC\ 0 C:\folderA\folderB\folderC\
C:\folderA\folderB\folderC\ 1 C:\folderA\folderB\
C:\folderA\folderB\folderC\ 2 C:\folderA\
C:\folderA\folderB\folderC\ 3 ""
C:\folderA\folderB\folderC\ 4 ""
C:\folderA\folderB\folderC\ 5 ""
C:\folderA\folderB\folderC\folder1\ 0
C:\folderA\folderB\folderC\folder1\ 1 C:\folderA\folderB\folderC\
C:\folderA\folderB\folderC\folder1\ 2 C:\folderA\folderB\
....

!qryDistinctFamilyHistory:
DistinctFamilyHistory
C:\folderA\
C:\folderA\folderB\
C:\folderA\folderB\folderC\
C:\folderA\folderB\folder1\
C:\folderA\folderB\folder1\folderA\
C:\folderA\folderB\folder1\folderB\

!qryCountDistinctNodes:
NumberOfDistinctNodes
6

I might try later to cut down the number of queries (i.e., attempt the
almost impossible), but I agree that a single SQL statement here looks
pretty tough. On top of that, A97, which I use for most query
development, has an even harder time with subqueries containing the
DISTINCT keyword than later versions of Access. Besides, if performance
becomes a factor, queries based on other queries can be sped up much
more easily than queries containing subqueries by using indexed
temporary tables. Sometimes, though, you can't beat the convenience of
a single SQL statement.

James A. Fortune
(e-mail address removed)
 
A

AGP

holy smokes. thanks for the routine. let me take some time to digest this
and study the way it works.
ill report back my findings.

AGP
 
J

Jamie Collins

I might try later to cut down the number of queries (i.e., attempt the
almost impossible), but I agree that a single SQL statement here looks
pretty tough.

Sometimes, though, you can't beat the convenience of
a single SQL statement.

Actually, you're going to kick yourself when I tell you, but it is
easy to do in a single statement (and no subqueries!) if you use the
standard trick of a table of unique integers named Sequence (limited
as appropriate) and Jet's MID() expression to find the folder-
delimiting characters:

SELECT DISTINCT MID(T1.fldPath, 1, S1.seq) AS node
FROM Folders AS T1, Sequence AS S1
WHERE MID(T1.fldPath, S1.seq, 1) = '\';

Here's some SQL to recreate the OP's scenario and demo the above
suggestion:

Sub ParseOnNodes()
Kill Environ$("temp") & "\DropMe.mdb"
Dim cat
Set cat = CreateObject("ADOX.Catalog")
With cat
..Create _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
With .ActiveConnection
Dim Sql As String
Sql = _
"CREATE TABLE Folders (fldPath" & _
" VARCHAR(255) NOT NULL)"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\')"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\')"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\')"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\')"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\')"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\folderA\')"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\folderA\')"
..Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\folderB\')"
..Execute Sql
Sql = _
"CREATE TABLE Sequence (seq INTEGER" & _
" NOT NULL UNIQUE)"
..Execute Sql
Sql = _
"INSERT INTO Sequence (seq)" & vbCr & "SELECT" & _
" Units.nbr + Tens.nbr + Hundreds.nbr" & _
" AS seq " & vbCr & "FROM (SELECT DISTINCT" & _
" nbr FROM (SELECT DISTINCT 0 AS" & _
" nbr FROM Folders UNION ALL SELECT" & _
" DISTINCT 1 FROM Folders UNION" & _
" ALL SELECT DISTINCT 2 FROM Folders" & _
" UNION ALL SELECT DISTINCT 3 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 4 FROM Folders UNION ALL SELECT" & _
" DISTINCT 5 FROM Folders UNION" & _
" ALL SELECT DISTINCT 6 FROM Folders" & _
" UNION ALL SELECT DISTINCT 7 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 8 FROM Folders UNION ALL SELECT" & _
" DISTINCT 9 FROM Folders) AS Digits)" & _
" AS Units, (SELECT DISTINCT nbr" & _
" * 10 AS nbr FROM (SELECT DISTINCT" & _
" 0 AS nbr FROM Folders UNION ALL" & _
" SELECT DISTINCT 1 FROM Folders"
Sql = Sql & _
" UNION ALL SELECT DISTINCT 2 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 3 FROM Folders UNION ALL SELECT" & _
" DISTINCT 4 FROM Folders UNION" & _
" ALL SELECT DISTINCT 5 FROM Folders" & _
" UNION ALL SELECT DISTINCT 6 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 7 FROM Folders UNION ALL SELECT" & _
" DISTINCT 8 FROM Folders UNION" & _
" ALL SELECT DISTINCT 9 FROM Folders)" & _
" AS Digits) AS Tens, (SELECT DISTINCT" & _
" nbr * 100 AS nbr FROM (SELECT" & _
" DISTINCT 0 AS nbr FROM Folders" & _
" UNION ALL SELECT DISTINCT 1 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 2 FROM Folders UNION ALL SELECT" & _
" DISTINCT 3 FROM Folders UNION" & _
" ALL SELECT DISTINCT 4 FROM Folders" & _
" UNION ALL SELECT DISTINCT 5 FROM" & _
" Folders UNION ALL SELECT DISTINCT"
Sql = Sql & _
" 6 FROM Folders UNION ALL SELECT" & _
" DISTINCT 7 FROM Folders UNION" & _
" ALL SELECT DISTINCT 8 FROM Folders" & _
" UNION ALL SELECT DISTINCT 9 FROM" & _
" Folders) AS Digits) AS Hundreds" & vbCr & "WHERE" & _
" Units.nbr + Tens.nbr + Hundreds.nbr" & _
" BETWEEN 1 AND 255"
..Execute Sql
Sql = _
"SELECT DISTINCT MID(T1.fldPath," & _
" 1, S1.seq) AS node" & vbCr & "FROM Folders" & _
" AS T1" & vbCr & "INNER JOIN Sequence AS" & _
" S1" & vbCr & "ON (S1.seq BETWEEN 1 AND LEN(T1.fldPath))" & vbCr &
"WHERE" & _
" MID(T1.fldPath, S1.seq, 1) =" & _
" '\'"
Dim rs
Set rs = .Execute(Sql)
MsgBox rs.GetString
End With
Set .ActiveConnection = Nothing
End With
End Sub

Jamie.

--
 
J

James A. Fortune

Jamie said:
Actually, you're going to kick yourself when I tell you, but it is
easy to do in a single statement (and no subqueries!) if you use the
standard trick of a table of unique integers named Sequence (limited
as appropriate) and Jet's MID() expression to find the folder-
delimiting characters:

SELECT DISTINCT MID(T1.fldPath, 1, S1.seq) AS node
FROM Folders AS T1, Sequence AS S1
WHERE MID(T1.fldPath, S1.seq, 1) = '\';

Here's some SQL to recreate the OP's scenario and demo the above
suggestion:

Sub ParseOnNodes()
Kill Environ$("temp") & "\DropMe.mdb"
Dim cat
Set cat = CreateObject("ADOX.Catalog")
With cat
.Create _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & _
Environ$("temp") & "\DropMe.mdb"
With .ActiveConnection
Dim Sql As String
Sql = _
"CREATE TABLE Folders (fldPath" & _
" VARCHAR(255) NOT NULL)"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\')"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\')"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\')"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\')"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\')"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\folderA\')"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\folderA\')"
.Execute Sql
Sql = _
"INSERT INTO Folders (fldPath)" & _
" VALUES ('c:\folderA\folderB\folderC\folder1\folderB\')"
.Execute Sql
Sql = _
"CREATE TABLE Sequence (seq INTEGER" & _
" NOT NULL UNIQUE)"
.Execute Sql
Sql = _
"INSERT INTO Sequence (seq)" & vbCr & "SELECT" & _
" Units.nbr + Tens.nbr + Hundreds.nbr" & _
" AS seq " & vbCr & "FROM (SELECT DISTINCT" & _
" nbr FROM (SELECT DISTINCT 0 AS" & _
" nbr FROM Folders UNION ALL SELECT" & _
" DISTINCT 1 FROM Folders UNION" & _
" ALL SELECT DISTINCT 2 FROM Folders" & _
" UNION ALL SELECT DISTINCT 3 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 4 FROM Folders UNION ALL SELECT" & _
" DISTINCT 5 FROM Folders UNION" & _
" ALL SELECT DISTINCT 6 FROM Folders" & _
" UNION ALL SELECT DISTINCT 7 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 8 FROM Folders UNION ALL SELECT" & _
" DISTINCT 9 FROM Folders) AS Digits)" & _
" AS Units, (SELECT DISTINCT nbr" & _
" * 10 AS nbr FROM (SELECT DISTINCT" & _
" 0 AS nbr FROM Folders UNION ALL" & _
" SELECT DISTINCT 1 FROM Folders"
Sql = Sql & _
" UNION ALL SELECT DISTINCT 2 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 3 FROM Folders UNION ALL SELECT" & _
" DISTINCT 4 FROM Folders UNION" & _
" ALL SELECT DISTINCT 5 FROM Folders" & _
" UNION ALL SELECT DISTINCT 6 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 7 FROM Folders UNION ALL SELECT" & _
" DISTINCT 8 FROM Folders UNION" & _
" ALL SELECT DISTINCT 9 FROM Folders)" & _
" AS Digits) AS Tens, (SELECT DISTINCT" & _
" nbr * 100 AS nbr FROM (SELECT" & _
" DISTINCT 0 AS nbr FROM Folders" & _
" UNION ALL SELECT DISTINCT 1 FROM" & _
" Folders UNION ALL SELECT DISTINCT" & _
" 2 FROM Folders UNION ALL SELECT" & _
" DISTINCT 3 FROM Folders UNION" & _
" ALL SELECT DISTINCT 4 FROM Folders" & _
" UNION ALL SELECT DISTINCT 5 FROM" & _
" Folders UNION ALL SELECT DISTINCT"
Sql = Sql & _
" 6 FROM Folders UNION ALL SELECT" & _
" DISTINCT 7 FROM Folders UNION" & _
" ALL SELECT DISTINCT 8 FROM Folders" & _
" UNION ALL SELECT DISTINCT 9 FROM" & _
" Folders) AS Digits) AS Hundreds" & vbCr & "WHERE" & _
" Units.nbr + Tens.nbr + Hundreds.nbr" & _
" BETWEEN 1 AND 255"
.Execute Sql
Sql = _
"SELECT DISTINCT MID(T1.fldPath," & _
" 1, S1.seq) AS node" & vbCr & "FROM Folders" & _
" AS T1" & vbCr & "INNER JOIN Sequence AS" & _
" S1" & vbCr & "ON (S1.seq BETWEEN 1 AND LEN(T1.fldPath))" & vbCr &
"WHERE" & _
" MID(T1.fldPath, S1.seq, 1) =" & _
" '\'"
Dim rs
Set rs = .Execute(Sql)
MsgBox rs.GetString
End With
Set .ActiveConnection = Nothing
End With
End Sub

Jamie.

I think I understand what you're saying. Whereas my integer table was
for the number of levels of subdirectory, your integer table is used to
scan each character of the entire path and can be limited to the number
of characters in the longest directory name. The second MID function
limits the candidates only to those (sub)strings ending in '\' for each
Folder and the first MID function uses the position of the final '\' to
determine the (sub)string itself. Nicely done. I won't kick myself
because it only looks obvious in hindsight. I'll remember your idea
when similar situations present themselves.

James A. Fortune
(e-mail address removed)
 
J

Jamie Collins

your integer table is used to
scan each character of the entire path and can be limited to the number
of characters in the longest directory name. The second MID function
limits the candidates only to those (sub)strings ending in '\' for each
Folder and the first MID function uses the position of the final '\' to
determine the (sub)string itself.

Remember, SQL is a set-based language e.g. it does not "scan each
character" -- you are thinking in terms of procedural code :)

My take is that the SQL engine considers every truncated (to the
right) instance of the folder name, which is potentially a very large
set being the sum of the character length of every folder name in the
set, and reduces the resultset by appling the DISTINCT keyword to
remove duplicate rows and applying the search condition to remove rows
where the last character is not '\'.

Jamie.

--
 
J

James A. Fortune

Jamie said:
Remember, SQL is a set-based language e.g. it does not "scan each
character" -- you are thinking in terms of procedural code :)

My take is that the SQL engine considers every truncated (to the
right) instance of the folder name, which is potentially a very large
set being the sum of the character length of every folder name in the
set, and reduces the resultset by appling the DISTINCT keyword to
remove duplicate rows and applying the search condition to remove rows
where the last character is not '\'.

Jamie.

Jamie,

I'm still having fun with this thread. I've had some time to think over
the problem and I have a few more observations. First (Firstly for the
British), my recursive function could just as easily have been a simple
For loop with intGenerations calls to Get1stParentDirectory. The
parsing problem was not of sufficient complexity to justify the need for
a recursive solution. Second, a single table with a single subquery
seems to be almost the computational equivalent to what's called a
cross-product query with WHERE criteria. Third, I think you knew what I
meant by "scan each character" but it was good that you elaborated the
steps from SQL's perspective. I still think that "scan each character"
is the best way to think of WHERE MID(T1.fldPath, S1.seq, 1) = '\'.
Other problems where your technique might prove helpful include
something similar to the Split() function that produces ordinary query
output rather than an array and a way to count substrings in SQL without
using a UDF.

For example:

(same integer table as before)

tblStringToParse
SID AutoNumber
StringToParse Text
SID StringToParse
1 aa;bbb;cccc;dddd
2 bbb;eee

qrySplit:
SELECT DISTINCT MID(";" & StringToParse & ";", tblIntegers.theInt + 1,
tblIntegers_1.theInt - tblIntegers.theInt - 1) AS theSubString FROM
tblIntegers, tblIntegers AS tblIntegers_1, tblStringToParse WHERE
MID(";" & StringToParse & ";", tblIntegers.theInt, 1) = ";" AND MID(";"
& StringToParse & ";", tblIntegers_1.theInt, 1) = ";" AND
tblIntegers.theInt < tblIntegers_1.theInt AND InStr(1, MID(";" &
StringToParse & ";", tblIntegers.theInt + 1, tblIntegers_1.theInt -
tblIntegers.theInt - 1), ";") = 0;

!qrySplit:
aa
bbb
cccc
dddd
eee

The criteria for qrySplit are that the integers are restricted to
positions where a ';' occurs, that the first position is before the
second position and that the substring does not contain the delimiter.

For the substrings problem:

tblString
SID AutoNumber
theString a;sdlkjf;aslkdjfa;slkjf

qrySubstrings:
SELECT tblString.theString, tblIntegers.theInt, tblIntegers_1.theInt
FROM tblString, tblIntegers, tblIntegers AS tblIntegers_1 WHERE
tblIntegers.theInt < tblIntegers_1.theInt AND MID(tblString.theString,
tblIntegers.theInt, tblIntegers_1.theInt - tblIntegers.theInt + 1) = "a;s";

!qrySubstrings:
theString tblIntegers.theInt tblIntegers_1.theInt
a;sdlkjf;aslkdjfa;slkjf 1 3
a;sdlkjf;aslkdjfa;slkjf 17 19

The second example might not match up to other methods as far as
efficiency, but the elegant method for counting substrings using the
Replace() function may not be efficient either. Besides, what if you
have "abcabcabcabcabc" and want to find how many "abca" substrings you
have? Do you count the four overlapping "abca" strings or just the two
nonoverlapping ones? The Replace() function method can only count the
two nonoverlapping ones. The SQL method can do either by including or
excluding an additional criterion to prevent overlapping theInt ranges.

James A. Fortune
(e-mail address removed)
 
J

Jamie Collins

First (Firstly for the
British),

In "Plain English" Britain we drink Yorkshire Tea nowadays and tend to
say "First" rather than "Firstly" [just waiting for the firstly/
thirsty John Vinson pun], "while" rather than "whilst", "potato"
rather than "patata" etc.
my recursive function could just as easily have been a simple
For loop

Actually, I was impressed you'd thought to use a recursive function
because it's something I don't always pick upon myself :)
Second, a single table with a single subquery
seems to be almost the computational equivalent to what's called a
cross-product query with WHERE criteria

I call it an "old style [inner] join" because it is how things were
done before the days the SQL-89 Standard (FWIW the term 'Cartesian
product' is considered deprecated since SQL-92 introduced the CROSS
JOIN syntax). Did you notice that my SQL in my VBA equivalent used the
INNER JOIN syntax? i.e.

Folders AS T1
INNER JOIN Sequence AS S1
ON (S1.seq BETWEEN 1 AND LEN(T1.fldPath))
I think you knew what I
meant by "scan each character" but it was good that you elaborated the
steps from SQL's perspective.

I think you are correct :)
Other problems where your technique might prove helpful include
something similar to the Split() function that produces ordinary query
output rather than an array and a way to count substrings in SQL without
using a UDF.

Yes, I have a ready-rolled generic Jet SQL stored proc for this, using
a remarkably similar syntax to yours so I guess we both plagiarised
the same source <vbg>

PLEASE READ THE BELOW WARNING BEFORE EXECUTING THIS PROC:

CREATE PROCEDURE ListToTable
(
delimted_text MEMO,
delimiter VARCHAR(4) = ','
)
AS
SELECT MID(I1.input_string, S1.seq, MIN(S2.seq) - S1.seq -
LEN(delimiter)) AS param
FROM
(
SELECT DISTINCT delimted_text AS input_string
FROM Sequence AS S3
WHERE S3.seq BETWEEN 1 AND LEN(delimted_text)
) AS I1, Sequence AS S1, Sequence AS S2
WHERE MID(delimiter & I1.input_string & delimiter, S1.seq,
LEN(delimiter)) = delimiter
AND MID(delimiter & I1.input_string & delimiter, S2.seq,
LEN(delimiter)) = delimiter
AND S1.seq < S2.seq
AND S1.seq BETWEEN 1 AND LEN(delimiter) + LEN(delimted_text) +
LEN(delimiter)
AND S2.seq BETWEEN 1 AND LEN(delimiter) + LEN(delimted_text) +
LEN(delimiter)
GROUP BY I1.input_string, S1.seq
HAVING LEN(MID(I1.input_string, S1.seq, MAX(S2.seq) - S1.seq -
LEN(delimiter))) > 0;

However, I've since abandoned it because of the usual story i.e. Jet
limitations. For example, this works find (note the length of the
first parameter):

EXECUTE ListToTable '12345678A;12345678B;12345678C;12345678D;12345678E;
12345678F;12345678G;12345678H;12345678I;12345678J;12345678K;12345678L;
12345678M;12345678N;12345678O;12345678P;12345678Q;12345678R;12345678S;
12345678T;12345678U;12345678V;12345678W;12345678X;12345678Y;12345',
';'

However, any more characters (i.e. more than 255) and the whole thing
blows up i.e. the host application crashes with "The object invoked
has disconnected from its clients" -- ouch! Maybe I coded something
wrong (as ever, corrections welcomed) but it is one of those things
that make me wonder whether Jet is fit for purpose...

Jamie.

--
 
J

James A. Fortune

Jamie said:
First (Firstly for the
British),


In "Plain English" Britain we drink Yorkshire Tea nowadays and tend to
say "First" rather than "Firstly" [just waiting for the firstly/
thirsty John Vinson pun], "while" rather than "whilst", "potato"
rather than "patata" etc.

Although what you consider proper English has the capacity for "Plain
English," many still do not avail themselves of that capability.
Actually, I was impressed you'd thought to use a recursive function
because it's something I don't always pick upon myself :)

A recursive function came to mind first because directory paths
correspond to trees and tree traversal often lends itself :) to
recursive functions.
Second, a single table with a single subquery
seems to be almost the computational equivalent to what's called a
cross-product query with WHERE criteria


I call it an "old style [inner] join" because it is how things were
done before the days the SQL-89 Standard (FWIW the term 'Cartesian
product' is considered deprecated since SQL-92 introduced the CROSS
JOIN syntax). Did you notice that my SQL in my VBA equivalent used the
INNER JOIN syntax? i.e.

Folders AS T1
INNER JOIN Sequence AS S1
ON (S1.seq BETWEEN 1 AND LEN(T1.fldPath))

You're correct. Mea culpa there. My eyes glazed over a bit at the
seventh UNION ALL making me miss the INNER JOIN. Using BETWEEN and
LEN() in the ON expression to relate an integer index (not the table
kind) with its corresponding string is a nice idea.
Yes, I have a ready-rolled generic Jet SQL stored proc for this, using
a remarkably similar syntax to yours so I guess we both plagiarised
the same source <vbg>

Thanks, Jamie. I have tremendous respect for your SQL abilities.
However, any more characters (i.e. more than 255) and the whole thing
blows up i.e. the host application crashes with "The object invoked
has disconnected from its clients" -- ouch! Maybe I coded something
wrong (as ever, corrections welcomed) but it is one of those things
that make me wonder whether Jet is fit for purpose...

Ouch indeed.

James A. Fortune
(e-mail address removed)
 
J

James A. Fortune

Jamie said:
Google the keywords parmlist inputstrings celko.

I just did. The search shows that neither of us plagiarized :).
Although the germ of what you did was posted previously, your stored
procedure took the best of multiple ideas and put them together in a way
that was not there before -- at least not in the Google links. I think
our discussion was fruitful.

James A. Fortune
(e-mail address removed)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top