Databases, Typed Datasets, and Flat Files oh my!

B

Ben

So, at my place of employment, we use a national standard to transmit
data between certain applications. This standard consists of a fixed
width, flat file 4500-some-odd chars wide that contain 375-some-odd
columns (Not delimited). This standard is not changing anytime soon, so
I attempting to write a library of some sort to read and write these
files within our SQL databases (which use 160 of those columns, but
completely ignore the rest).

Now, the way I would attempt to understand this is: Flat file gets
sucked into dataset, dataset gets shoved into database (or vicea
versa). So... All of the stuff I see for typed datasets is "XML! XML!
XML!" but I'm just looking at a way of specifying a dataset in which
there is a table that says "Field1 is chars 1-10, Field2 is chars
10-14..." etc etc etc.

So, people infinitely more talented then me, how would you attempt to
do this?
 
M

Marina

As far as I know, there is no way to do this other then to manually read the
file, get the right characters, and then manually add the rows to the
dataset or database. There is no built in mechanism I am aware of us to
process these kinds of files.
 
C

Cor Ligthert [MVP]

Ben,

I don't think that it will give you more effort than placing it directly in
your sqlsever with a sql insert command and a sqlnonquery.

However you can get the schema from your datatabase with

http://msdn.microsoft.com/library/d...datacommondataadapterclassfillschematopic.asp

Than you can use the normal mid or substring commands as you now probably
already use to to fill the dataitems in the new created rows.

If you have less than 100 columns, than you can use the commandbuilder to
create the commands and update.

I hope this helps,

Cor
 
A

AMDRIT

I would do something like this: (Not in anyway tested.! And could be
seriously optimized and featured.)

Public Class FlatFile

Event ReadProgress(ByVal PercentComplete As Integer)
Event ReadComplete()
Event ReadFailed()
Event ReadStarted()

'Reads a flat file based on a template
'Template file looks like
'Just one line with the column size
'15,30,10,1,3,4,5,6,9

Private m_DBFileName As String
Private m_SFileName As String
Private m_DataRows As ArrayList
Private m_StreamReader As System.IO.StreamReader

Private m_HasRead As Boolean

Public Sub New(ByVal DBFileName As String, ByVal SchemaFileName As
String, ByVal Create As Boolean)

'Validate DB file exits
If System.IO.File.Exists(DBFileName) Then
Else
If Not Create Then Throw New ApplicationException("Bad DB File
Name")
End If

'remember the DB file location
m_DBFileName = DBFileName

'Validate Schema file exits
If System.IO.File.Exists(SchemaFileName) Then
'remember the Schema file location
m_SFileName = SchemaFileName
Else
Throw New ApplicationException("Bad Schema File Name")
End If

End Sub

Public Sub New(ByVal DBFileName As String, ByVal SchemaFileName As
String)

MyClass.New(DBFileName, SchemaFileName, False)

End Sub

Private ReadOnly Property DBReader() As System.IO.StreamReader
Get
If m_StreamReader Is Nothing Then
m_StreamReader = New System.IO.StreamReader(m_DBFileName)
End If
Return m_StreamReader
End Get
End Property

Private ReadOnly Property Schema() As Integer()
Get

Dim objTempReader As System.IO.StreamReader
Dim sLine As String
Dim asEntries() As String
Dim aiEntries() As Integer

objTempReader = New System.IO.StreamReader(m_SFileName)

sLine = objTempReader.ReadLine()

objTempReader.Close()

objTempReader = Nothing

If Not sLine Is Nothing Then
asEntries = sLine.Trim.Split(",")

ReDim aiEntries(asEntries.Length - 1)

For i As Integer = 0 To asEntries.Length - 1
aiEntries(i) = CType(asEntries(i), Integer)
Next

End If

End Get
End Property

Public Sub ReadData()

'Read it only once
If m_HasRead Then Exit Sub

Dim sLine As String

'File Related
Dim oFileInfo As System.IO.FileInfo
Dim iFileSize As Integer

'Db Related
Dim iRowSize As Integer
Dim iRowCount As Integer
Dim aCols() As Integer

'Get file size
oFileInfo = New System.IO.FileInfo(m_DBFileName)
iFileSize = oFileInfo.Length

'Load our schema
aCols = Me.Schema

'Determine row size
For Each i As Integer In aCols
iRowSize += i
Next

'Determine row count
iRowCount = Int(iFileSize / iRowCount)


RaiseEvent ReadStarted()

sLine = DBReader.ReadLine
ParseLine(sLine)

Do While Not sLine Is Nothing
sLine = DBReader.ReadLine
ParseLine(sLine)
Loop

RaiseEvent ReadComplete()

m_HasRead = True

End Sub

Private Sub ParseLine(ByVal sLine As String)

Dim Record() As String
Dim aCols() As Integer
Dim iPos As Integer

aCols = Me.Schema

ReDim Record(aCols.Length - 1)

For i As Integer = 0 To aCols.Length - 1
Record(i) = sLine.Substring(iPos, aCols(i))
iPos += aCols(i)
Next

m_DataRows.Add(Record)

End Sub

Public Function Add(ByVal Value() As String) As Integer

ValidateRow(Value)

Return m_DataRows.Add(Value)

End Function

Public Property Row(ByVal Index As Integer) As String()
Get
Return m_DataRows(Index)
End Get
Set(ByVal Value() As String)

ValidateRow(Value)

m_DataRows(Index) = Value

End Set
End Property

Private Sub ValidateRow(ByVal Value() As String)
Dim aCols() As Integer

aCols = Me.Schema

'Validate the updated values
If aCols.Length <> Value.Length Then
Throw New ApplicationException(String.Format("Invalid Row Length"))
End If

For i As Integer = 0 To aCols.Length - 1

If Value(i).Length > aCols(i) Then
Throw New ApplicationException(String.Format("Invalid Data Length:
Max Data Lenght for column{0} is {1}", i, aCols(i)))
ElseIf Value(i).Length > aCols(i) Then
'Pad the data with spaces
Value(i) = Value(i).PadRight(aCols(i), " ")
End If

Next
End Sub

Public Sub SaveDB()
Dim sw As System.IO.StreamWriter
Dim sLine As System.Text.StringBuilder

'Close our read connection
m_StreamReader.Close()
m_StreamReader = Nothing

'Delete our old file
System.IO.File.Delete(m_DBFileName)

sw = New System.IO.StreamWriter(m_DBFileName)

For Each s() As String In m_DataRows
sLine = New System.Text.StringBuilder
For i As Integer = 0 To s.Length - 1
sLine.Append(s(i))
Next
sw.WriteLine(sLine.ToString)
Next

sw.Close()

End Sub

Protected Overrides Sub Finalize()
MyBase.Finalize()

If Not m_StreamReader Is Nothing Then
m_StreamReader.Close()
End If

m_StreamReader = Nothing

End Sub

End Class
 
A

AMDRIT

I modified it a bit

Public Class FlatFile

Event ReadProgress(ByVal PercentComplete As Integer)
Event ReadComplete()
Event ReadFailed()
Event ReadStarted()
Event SaveComplete()

'Reads a flat file based on a template
' 1. A non full Template file looks like
' 15,30,10,1,3,4,5,6,9
' 2. A full Schema Template looks like
' ~ColumnName, PadDirection, DataLength
' OderID,2,9
' OrderDate,2,8

Private m_DBFileName As String
Private m_SFileName As String
Private m_FullSchema As Boolean
Private m_DataRows As ArrayList
Private m_HasRead As Boolean
Private m_aCols() As Column, m_ColCount As Integer

#Region " Column Descriptor"

Public Class Column

Public Enum PadDirections
pdLeft = 1
pdRight = 2
End Enum

Private m_Ordinal As Integer
Private m_ColumnName As String = String.Empty
Private m_PadDirection As PadDirections
Private m_Length As Integer

#Region " Constructors"

Public Sub New(ByVal Ordinal As Integer, ByVal ColumnName As String,
ByVal PadDirection As PadDirections, ByVal Length As Integer)
m_Ordinal = Ordinal
m_ColumnName = ColumnName
m_PadDirection = PadDirection
m_Length = Length
End Sub

Public Sub New(ByVal Ordinal As Integer, ByVal PadDirection As
PadDirections, ByVal Length As Integer)
MyClass.New(Ordinal, String.Format("Column{0}", Ordinal),
PadDirection, Length)
End Sub

Public Sub New(ByVal Ordinal As Integer, ByVal Length As Integer)
MyClass.New(Ordinal, String.Format("Column{0}", Ordinal),
PadDirections.pdRight, Length)
End Sub

#End Region

#Region " Fields"

Public ReadOnly Property Ordinal() As Integer
Get
Return m_Ordinal
End Get
End Property

Public ReadOnly Property ColumnName() As String
Get
Return m_ColumnName
End Get
End Property

Public ReadOnly Property PadDirection() As PadDirections
Get
Return m_PadDirection
End Get
End Property

Public ReadOnly Property Length() As Integer
Get
Return m_Length
End Get
End Property

#End Region

End Class

#End Region

#Region " Constructors"

Public Sub New(ByVal DBFileName As String, ByVal SchemaFileName As String,
ByVal FullSchema As Boolean, ByVal Create As Boolean)

'Validate DB file exits
If System.IO.File.Exists(DBFileName) Then
Else
If Not Create Then Throw New ApplicationException("Bad DB File Name")
End If

'remember the DB file location
m_DBFileName = DBFileName

'Validate Schema file exits
If System.IO.File.Exists(SchemaFileName) Then
'remember the Schema file location
m_SFileName = SchemaFileName
Else
Throw New ApplicationException("Bad Schema File Name")
End If

m_FullSchema = FullSchema

m_DataRows = New ArrayList

m_ColCount = -1

End Sub

Public Sub New(ByVal DBFileName As String, ByVal SchemaFileName As String,
ByVal FullSchema As Boolean)

MyClass.New(DBFileName, SchemaFileName, FullSchema, False)

End Sub

Public Sub New(ByVal DBFileName As String, ByVal SchemaFileName As String)

MyClass.New(DBFileName, SchemaFileName, False, False)

End Sub

#End Region

#Region " Data Row Implementation"

Public Function Add(ByVal Value() As String) As Integer

ValidateRow(Value)

Return m_DataRows.Add(Value)

End Function

Public Property Row(ByVal Index As Integer) As String()
Get
Return m_DataRows(Index)
End Get
Set(ByVal Value() As String)

ValidateRow(Value)

m_DataRows(Index) = Value

End Set
End Property

Public Function RowCount() As Integer
Return m_DataRows.Count
End Function

Public Sub Remove(ByVal Datarow() As String)
m_DataRows.Remove(Datarow)
End Sub

Public Sub RemoveAt(ByVal Index As Integer)
m_DataRows.RemoveAt(Index)
End Sub

Private Sub ValidateRow(ByVal Value() As String)
Dim aCols() As Column

aCols = Me.Schema

'Validate the updated values
If aCols.Length <> Value.Length Then
Throw New ApplicationException(String.Format("Invalid Row Length"))
End If

For i As Integer = 0 To aCols.Length - 1
If Value(i) = String.Empty Then
'Pad the data with spaces
If aCols(i).PadDirection = Column.PadDirections.pdLeft Then
Value(i) = Value(i).PadLeft(aCols(i).Length, " ")
Else
Value(i) = Value(i).PadRight(aCols(i).Length, " ")
End If
ElseIf Value(i).Length > aCols(i).Length Then
Throw New ApplicationException(String.Format("Invalid Data Length:
Max Data Lenght for column{0} is {1}", i, aCols(i)))
ElseIf Value(i).Length < aCols(i).Length Then
'Pad the data with spaces
If aCols(i).PadDirection = Column.PadDirections.pdLeft Then
Value(i) = Value(i).PadLeft(aCols(i).Length, " ")
Else
Value(i) = Value(i).PadRight(aCols(i).Length, " ")
End If
End If

Next
End Sub

#End Region

#Region " Column Implementation"

Public ReadOnly Property ColumnCount() As Integer
Get
Return m_ColCount
End Get
End Property

Public ReadOnly Property ColumnInfo(ByVal Index As Integer) As Column
Get
Return m_aCols(Index)
End Get
End Property

Private Sub AddColumnInfo(ByVal Column As Column)

m_ColCount += 1
ReDim Preserve m_aCols(m_ColCount)

m_aCols(m_ColCount) = Column

End Sub

#End Region

#Region " DB Operations"

Public Sub ReadData()

'Read it only once
If m_HasRead Then Exit Sub
If Not System.IO.File.Exists(m_DBFileName) Then Exit Sub

'
Dim sLine As String
Dim sr As System.IO.StreamReader
Dim iRowPos As Integer

'File Related
Dim oFileInfo As System.IO.FileInfo
Dim iFileSize As Integer

'Db Related
Dim iRowSize As Integer
Dim iRowCount As Integer
Dim aCols() As Column

'Get file size
oFileInfo = New System.IO.FileInfo(m_DBFileName)
iFileSize = oFileInfo.Length

'Load our schema
aCols = Me.Schema

'Determine row size
For Each i As Column In aCols
iRowSize += i.Length
Next

'Determine row count
iRowCount = Int(iFileSize / iRowSize)

RaiseEvent ReadStarted()

sr = New System.IO.StreamReader(m_DBFileName)

iRowPos = 0

sLine = sr.ReadLine
ParseLine(sLine)

Do While Not sLine Is Nothing

iRowPos += 1

RaiseEvent ReadProgress(Int((iRowPos / iRowCount) * 100))

sLine = sr.ReadLine
ParseLine(sLine)

Loop

RaiseEvent ReadComplete()

m_HasRead = True

End Sub

Public Sub SaveDB()

Dim sw As System.IO.StreamWriter
Dim sLine As System.Text.StringBuilder

'Delete our old file
System.IO.File.Delete(m_DBFileName)

sw = New System.IO.StreamWriter(m_DBFileName)

Try
For Each s() As String In m_DataRows
sLine = New System.Text.StringBuilder
For i As Integer = 0 To s.Length - 1
sLine.Append(s(i))
Next
sw.WriteLine(sLine.ToString)
Next
sw.Flush()
Catch ex As Exception
MsgBox(ex.ToString)
End Try

sw.Close()
sw = Nothing

RaiseEvent SaveComplete()

End Sub

Private Sub ParseLine(ByVal sLine As String)

'Don't attempt to read an empty line
If sLine Is Nothing Or sLine = String.Empty Then Exit Sub

Dim Record() As String
Dim aCols() As Column
Dim iPos As Integer

aCols = Me.Schema

ReDim Record(aCols.Length - 1)

For i As Integer = 0 To aCols.Length - 1
Record(i) = sLine.Substring(iPos, aCols(i).Length)
iPos += aCols(i).Length
Next

Add(Record)

End Sub

Private ReadOnly Property Schema() As Column()
Get

If m_aCols Is Nothing OrElse m_aCols.Length < 1 Then

Dim objTempReader As System.IO.StreamReader
Dim sLine As String

objTempReader = New System.IO.StreamReader(m_SFileName)

sLine = objTempReader.ReadLine()
ParseColumnInfo(sLine)

Do While Not sLine Is Nothing
sLine = objTempReader.ReadLine()
ParseColumnInfo(sLine)
Loop

objTempReader.Close()

objTempReader = Nothing

End If

Return m_aCols

End Get
End Property

Private Sub ParseColumnInfo(ByVal sLine As String)

Dim asEntries() As String

If Not sLine Is Nothing Then

asEntries = sLine.Trim.Split(",")

If Not m_FullSchema Then

asEntries = sLine.Trim.Split(",")

For i As Integer = 0 To asEntries.Length - 1
AddColumnInfo(New Column(i, CType(asEntries(i), Integer)))
Next

Else

AddColumnInfo(New Column(m_ColCount, CType(asEntries(0), String),
CType(asEntries(1), Integer), CType(asEntries(2), Integer)))

End If

End If

End Sub

#End Region

#Region " Destructors"
Protected Overrides Sub Finalize()
MyBase.Finalize()

If Not m_DataRows Is Nothing Then m_DataRows.Clear()
m_DataRows = Nothing

End Sub
#End Region

'Sample:

'Dim f As FlatFile

' f = New FlatFile("C:\Data.txt", "C:\Schema.txt", True, False)
' f.ReadData()

' For i As Integer = 0 To f.RowCount - 1
' console.writeline(String.Join("", f.Row(0)))
'
' f.Row(i)(0) = CType(Int((999999 - 1) * Rnd() + 1), String)
' f.Row(i)(1) = Now.ToShortDateString
' f.Row(i)(2) = "Some Customer"
' f.Row(i)(3) = "Some Street"
' f.Row(i)(4) = ""
' f.Row(i)(5) = "Some City"
' f.Row(i)(6) = "MN"
' f.Row(i)(7) = "00000"
'
' console.writeline(String.Join("", f.Row(0)))
' Next



' For i As Integer = 1 To 1000

' Dim astring(7) As String
' astring(0) = CType(Int((999999 - 1) * Rnd() + 1), String)
' astring(1) = Now.ToShortDateString
' astring(2) = "Some Customer"
' astring(3) = "Some Street"
' astring(4) = ""
' astring(5) = "Some City"
' astring(6) = "MN"
' astring(7) = "00000"

' f.Add(astring)

' Next

' f.SaveDB()

' f = Nothing

' MsgBox("Done!")



'Template File

'OrderID,2,9
'OrderDate,2,8
'CustomerName,2,35
'Address1,2,35
'Address2,2,35
'City,2,35
'State,2,2
'ZipCode,2,5

'Data File
'705547 9/8/2005 Some Customer Some Street
Some City MN00000
'533423 9/8/2005 Some Customer Some Street
Some City MN00000
'579518 9/8/2005 Some Customer Some Street
Some City MN00000
'289562 9/8/2005 Some Customer Some Street
Some City MN00000
'301948 9/8/2005 Some Customer Some Street
Some City MN00000
'774739 9/8/2005 Some Customer Some Street
Some City MN00000
'14018 9/8/2005 Some Customer Some Street
Some City MN00000
'760723 9/8/2005 Some Customer Some Street
Some City MN00000
'814489 9/8/2005 Some Customer Some Street
Some City MN00000
'709037 9/8/2005 Some Customer Some Street
Some City MN00000
'45353 9/8/2005 Some Customer Some Street
Some City MN00000
'414032 9/8/2005 Some Customer Some Street
Some City MN00000
'862618 9/8/2005 Some Customer Some Street
Some City MN00000
'790479 9/8/2005 Some Customer Some Street
Some City MN00000
'373536 9/8/2005 Some Customer Some Street
Some City MN00000
'961952 9/8/2005 Some Customer Some Street
Some City MN00000
'871445 9/8/2005 Some Customer Some Street
Some City MN00000
'56237 9/8/2005 Some Customer Some Street
Some City MN00000
'949555 9/8/2005 Some Customer Some Street
Some City MN00000
'364018 9/8/2005 Some Customer Some Street
Some City MN00000
'524868 9/8/2005 Some Customer Some Street
Some City MN00000
'767111 9/8/2005 Some Customer Some Street
Some City MN00000
'53505 9/8/2005 Some Customer Some Street
Some City MN00000
'592458 9/8/2005 Some Customer Some Street
Some City MN00000
'468700 9/8/2005 Some Customer Some Street
Some City MN00000
'298165 9/8/2005 Some Customer Some Street
Some City MN00000
'622696 9/8/2005 Some Customer Some Street
Some City MN00000
'647820 9/8/2005 Some Customer Some Street
Some City MN00000
'263793 9/8/2005 Some Customer Some Street
Some City MN00000
'279342 9/8/2005 Some Customer Some Street
Some City MN00000
'829800 9/8/2005 Some Customer Some Street
Some City MN00000
'824601 9/8/2005 Some Customer Some Street
Some City MN00000
'589162 9/8/2005 Some Customer Some Street
Some City MN00000
'986092 9/8/2005 Some Customer Some Street
Some City MN00000
'910963 9/8/2005 Some Customer Some Street
Some City MN00000
'226866 9/8/2005 Some Customer Some Street
Some City MN00000
'695115 9/8/2005 Some Customer Some Street
Some City MN00000
'980002 9/8/2005 Some Customer Some Street
Some City MN00000
'243931 9/8/2005 Some Customer Some Street
Some City MN00000
'533873 9/8/2005 Some Customer Some Street
Some City MN00000
'106370 9/8/2005 Some Customer Some Street
Some City MN00000

End Class
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top