loading tab-delimited files...

G

gloria

Hey kids,

I need to work with a tab-delimited file and load and validate the data
before inserting it into various tables. Is there a way I can load it
directly a temp table or load it into a DataSet? This would be much
easier than building a two-dimensional array. Any thoughts on how to
do this better?

Thanks.

--gloria
 
E

Elton Wang

Hi,

There is simple way to quickly insert file to database
table:

BULK INSERT Northwind.dbo.[Order Details]
FROM 'f:\orders\lineitem.tbl'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n')

HTH

Elton Wang
(e-mail address removed)
 
S

Sahil Malik

Mommy,

Bulk insert is an option, but that will insert first, ask questions later
i.e. I think you need to do validation first?
What kind of validation are you looking to do? Seriously, you can very
easily do System.IO.FileStream for something like this.

- Sahil Malik
http://codebetter.com/blogs/sahil.malik/
 
P

Paul Clement

¤ Hey kids,
¤
¤ I need to work with a tab-delimited file and load and validate the data
¤ before inserting it into various tables. Is there a way I can load it
¤ directly a temp table or load it into a DataSet? This would be much
¤ easier than building a two-dimensional array. Any thoughts on how to
¤ do this better?

Yes, you can load the data into a DataSet (DataTable):

Dim ConnectionString As String

ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & "E:\My Documents\TextFiles" & ";" & _
"Extended Properties=""Text;HDR=NO;"""

Dim TextConnection As New System.Data.OleDb.OleDbConnection(ConnectionString)
TextConnection.Open()

Dim da As New System.Data.OleDb.OleDbDataAdapter("SELECT * FROM TabDelimitedFile.txt",
TextConnection)

Dim ds As New DataSet("TextFiles")
da.Fill(ds, "TabDelimited")

TextConnection.Close()


The only other thing you need is a schema.ini file (located in the same folder as your text files)
to indicate that your text files are tab delimited. An example of the contents might look like the
following:

[TabDelimitedFile.txt]
ColNameHeader=False
Format=TabDelimited
CharacterSet=ANSI


Paul ~~~ (e-mail address removed)
Microsoft MVP (Visual Basic)
 
G

gloria

Paul, I think this is great. However, I'm not sure on the rules
governing the connect string. I have a SQL Server DB and the path, but
I'm not sure how to set up the connect string. Since it's not actually
on the DB, I don't need a user/passwd, but I'm not sure what to put for
the provider.

Could you, or someone else, please give me some advice on this one?

Thanks!

--gloria
 
P

Paul Clement

¤ Paul, I think this is great. However, I'm not sure on the rules
¤ governing the connect string. I have a SQL Server DB and the path, but
¤ I'm not sure how to set up the connect string. Since it's not actually
¤ on the DB, I don't need a user/passwd, but I'm not sure what to put for
¤ the provider.
¤
¤ Could you, or someone else, please give me some advice on this one?
¤

Not sure if I understand your question as it relates to SQL Server. Are you trying to export the
text file to SQL Server?

You can find connection string examples at the below link:

http://www.connectionstrings.com


Paul ~~~ (e-mail address removed)
Microsoft MVP (Visual Basic)
 
G

gloria

Paul,

I thought the issue was my connect string. Upon further investigation,
I don't think it's that. I can connect to the file, but my select
pulls 0 rows. I thought the select_string would just be:
select * from <filename>
Your thoughts would be most appreciated.

--gloria

-----------------------------
filename = "861D_NI_03082005001.txt";
string dirname = @"\\box\sdc\";
string fullfilename = dirname + filename;

if (File.Exists(fullfilename) )
porerrorClass.WriteLine(3,CurrentFunction,"File exists.");


string connectionStr = "Provider=MediaCatalogMergedDB OLE DB
Provider;Persist Security Info=False;User ID=\"\";Data Source=" +
dirname + ";Location=" + filename + ";Mode=Read";

string loaddtlStr = "select * from " + filename;
porerrorClass.WriteLine(3,CurrentFunction,loaddtlStr);

try
{
OleDbConnection loaddtlCN = new OleDbConnection(connectionStr);
loaddtlCN.Open();
porerrorClass.WriteLine(3,CurrentFunction,"Connection is open.");

OleDbDataAdapter loaddtlDA = new
OleDbDataAdapter(loaddtlStr,loaddtlCN);
DataSet loaddtlDS = new DataSet("LoadedDtl");
porerrorClass.WriteLine(3,CurrentFunction,"Filling DataSet...");
loaddtlDA.Fill(loaddtlDS);
porerrorClass.WriteLine(3,CurrentFunction,"DataSet Filled.");

int tblcnt = loaddtlDS.Tables.Count;
porerrorClass.WriteLine(3,CurrentFunction,"Tables loaded: " + tblcnt);

loaddtlCN.Close();
}
catch (Exception ex)
{
porerrorClass.WriteLine(3,CurrentFunction,"Couldn't connect: " +
ex.Message);
}
 
P

Paul Clement

¤ Paul,
¤
¤ I thought the issue was my connect string. Upon further investigation,
¤ I don't think it's that. I can connect to the file, but my select
¤ pulls 0 rows. I thought the select_string would just be:
¤ select * from <filename>
¤ Your thoughts would be most appreciated.
¤
¤ --gloria
¤
¤ -----------------------------
¤ filename = "861D_NI_03082005001.txt";
¤ string dirname = @"\\box\sdc\";
¤ string fullfilename = dirname + filename;
¤
¤ if (File.Exists(fullfilename) )
¤ porerrorClass.WriteLine(3,CurrentFunction,"File exists.");
¤
¤
¤ string connectionStr = "Provider=MediaCatalogMergedDB OLE DB
¤ Provider;Persist Security Info=False;User ID=\"\";Data Source=" +
¤ dirname + ";Location=" + filename + ";Mode=Read";
¤
¤ string loaddtlStr = "select * from " + filename;
¤ porerrorClass.WriteLine(3,CurrentFunction,loaddtlStr);
¤
¤ try
¤ {
¤ OleDbConnection loaddtlCN = new OleDbConnection(connectionStr);
¤ loaddtlCN.Open();
¤ porerrorClass.WriteLine(3,CurrentFunction,"Connection is open.");
¤
¤ OleDbDataAdapter loaddtlDA = new
¤ OleDbDataAdapter(loaddtlStr,loaddtlCN);
¤ DataSet loaddtlDS = new DataSet("LoadedDtl");
¤ porerrorClass.WriteLine(3,CurrentFunction,"Filling DataSet...");
¤ loaddtlDA.Fill(loaddtlDS);
¤ porerrorClass.WriteLine(3,CurrentFunction,"DataSet Filled.");
¤
¤ int tblcnt = loaddtlDS.Tables.Count;
¤ porerrorClass.WriteLine(3,CurrentFunction,"Tables loaded: " + tblcnt);
¤
¤ loaddtlCN.Close();
¤ }
¤ catch (Exception ex)
¤ {
¤ porerrorClass.WriteLine(3,CurrentFunction,"Couldn't connect: " +
¤ ex.Message);
¤ }

I'm not the slightest bit familiar with the provider being used in your
connection string. Since you're opening a text file, did you try the connection
string code I posted several days ago?


Paul
~~~~
Microsoft MVP (Visual Basic)
 
G

gloria

Yes, but I but it didn't work.
f.udl: Provider=Microsoft.Jet.OLEDB.4.0;Data
Source=C:\861D_NI_03082005001.txt;Persist Security Info=False
error msg: Test connection failed because of an error initializing
provider. Unregognized databse format 'C:\861D_NI_03082005001.txt'.

Sample data from file (tab-delimited):
2002152 1000 777158-01 gloria test 1 1.1 3/7/2005 18:15
2002150 1000 776572-24 gloria test 4 1.1 3/8/2005 09:53
2002150 1000 777145-01 gloria test 30 2.1 3/8/2005 10:41
2002147 1000 778274-01 gloria test 20 1.1 3/8/2005 10:42


Instead, I went through and looked at every provider which might allow
me to specify a file and attempted to connect to it. Treating things
like a black box is not necessarily the best idea. (Inherently a bad
idea to use things without understanding how they work.)

Suggestions? Thoughts? I really appreciate you helping me to work
through this.

--gloria
 
P

Paul Clement

¤ Yes, but I but it didn't work.
¤ f.udl: Provider=Microsoft.Jet.OLEDB.4.0;Data
¤ Source=C:\861D_NI_03082005001.txt;Persist Security Info=False
¤ error msg: Test connection failed because of an error initializing
¤ provider. Unregognized databse format 'C:\861D_NI_03082005001.txt'.
¤
¤ Sample data from file (tab-delimited):
¤ 2002152 1000 777158-01 gloria test 1 1.1 3/7/2005 18:15
¤ 2002150 1000 776572-24 gloria test 4 1.1 3/8/2005 09:53
¤ 2002150 1000 777145-01 gloria test 30 2.1 3/8/2005 10:41
¤ 2002147 1000 778274-01 gloria test 20 1.1 3/8/2005 10:42
¤
¤
¤ Instead, I went through and looked at every provider which might allow
¤ me to specify a file and attempted to connect to it. Treating things
¤ like a black box is not necessarily the best idea. (Inherently a bad
¤ idea to use things without understanding how they work.)
¤
¤ Suggestions? Thoughts? I really appreciate you helping me to work
¤ through this.

The filename should not be specified in the Data Source. You should only specify the path.

You need to specify the Jet provider, the path to the text file and the extended properties in the
connection string:

ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & "C:\" & ";" & _
"Extended Properties=""Text;HDR=NO;"""

The filename is specified in the SQL statement as in the example I posted.

Dim da As New System.Data.OleDb.OleDbDataAdapter("SELECT * FROM 861D_NI_03082005001.txt",
TextConnection)

And don't forget that tab delimited text files require an entry in a schema.ini file:

[861D_NI_03082005001.txt]
ColNameHeader=False
Format=TabDelimited
CharacterSet=ANSI


Paul ~~~ (e-mail address removed)
Microsoft MVP (Visual Basic)
 
G

gloria

More investigation shows that my problem with this is that I don't have
Microsoft.Jet.OLEDB.40 as a provider. Jet comes with Access which I
don't have. I will track down a license for this and try again. I
will let you know how this ends up working out.

Thanks for all your help on this!

--gloria
 
P

Paul Clement

¤ More investigation shows that my problem with this is that I don't have
¤ Microsoft.Jet.OLEDB.40 as a provider. Jet comes with Access which I
¤ don't have. I will track down a license for this and try again. I
¤ will let you know how this ends up working out.
¤

The Jet components can be downloaded from Microsoft:

http://support.microsoft.com/default.aspx?scid=kb;en-us;239114


Paul
~~~~
Microsoft MVP (Visual Basic)
 
G

gloria

Hey kids,

I found a cleaner way to load my data from a flatfile into a
datastructure for processing. I will then use this data structure to
validate and then process the data before inserting it into the table.

Thanks for all your help, everyone, especially Paul who really followed
up on this.

Posting code I used in case future searchers need it. :)

--gloria

-------------------------------------
#region Loads the PO Receipt data into OLiTE (861 Processing)
//--------------------------------------------------------------
public void LoadPOReceipt()
{
string CurrentFunction = "LoadPOReceipt";
porerrorClass.WriteLine(2,CurrentFunction,"Entering.");

//string filename = FindTheFile();

//string filename = porerrorClass.BAXdirectoryStr +
this.BAX861DfilenameStr;
//string filename = porerrorClass.BAXdirectoryStr +
"861D_NI_03082005001.txt";
string filename = porerrorClass.BAXdirectoryStr + "861D_test2.txt";

ArrayList poreceiptAL = Load861Data(filename);

ProcessThePOReceipt(poreceiptAL);

//CloseAnyOpenPOs();

porerrorClass.WriteLine(2,CurrentFunction,"Exiting.");
return;
}
#endregion


#region Class to read in the 861 Data.
//---------------------------------------------
private class DataClass861
{
public string ponumStr;
public string subinvStr;
public string partnumStr;
public string receivedqtyStr;
public string linenumStr;
public string receiveddateStr;
public string poitemidStr;
public string waybillnumStr;

public int ReadIn(StreamReader sr)
{

char [] delim = "\t".ToCharArray();
string currentlineStr = sr.ReadLine();
string[] dataArray = currentlineStr.Split(delim);
int rowcnt = dataArray.Length;

#region
if (rowcnt == 33)
{
char[] spacedelim = " ".ToCharArray();
string tempvar;
bool numbercheck;
numbercheck = false;

ponumStr = dataArray[0].Trim(spacedelim);
subinvStr = dataArray[1].Trim(spacedelim);
partnumStr = dataArray[4].Trim(spacedelim);
receivedqtyStr = dataArray[8].Trim(spacedelim);
linenumStr = dataArray[9].Trim(spacedelim);
receiveddateStr = dataArray[10].Trim(spacedelim);
poitemidStr = dataArray[11].Trim(spacedelim);
waybillnumStr = dataArray[32].Trim(spacedelim);

}
else
{
readinEC.RecordError(CurrentFunction,CurrentProcess,"Data",
"Incorrect number of rows in 861 Data.","",
"Rowcount should be 33, but it was " + rowcnt + ".");
readinEC.WriteLine(2,CurrentFunction,"Exiting");
return -1;
}
#endregion


readinEC.WriteLine(2,CurrentFunction,"Exiting");
return 0;
}

};
#endregion


#region Actually load the data into the dataset.
//--------------------------------------------------------------------
//public ArrayList Load861Data(string filenameStr)
public void Load861Data(string filenameStr)
{

string CurrentFunction = "Load861Data";
porerrorClass.WriteLine(2,CurrentFunction,"Entering.");
ArrayList load861dataAL = new ArrayList();

using(StreamReader sr = new StreamReader(filenameStr) )
{
while(sr.Peek() >=0)
{
DataClass861 line = new DataClass861();
if (line.ReadIn(sr) == 0)
load861dataAL.Add(line);
}


int rowcnt = load861dataAL.Count;
porerrorClass.WriteLine(1,CurrentFunction,"Number of rows: " +
rowcnt);

porerrorClass.WriteLine(2,CurrentFunction,"Exiting.");
return load861dataAL;
}


}
#endregion
 
G

gloria

Paul,

I really appreciate all your help on this. Very kind to follow through
like this.

--gloria
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top