Array values split



I have an array created from an undelimited text file I receive with a
format like:


I need to split these values into array items or dataset columns in a
format like:

6019001 (code) 2003 (year) 01 02 03 04 (month #)

The end goal is to produce an XML file from the dataset:

<code id="6019001">
<year yrid="2002">
<m id="1" link="6019001_012002.xml">Jan</m>
<m id="12" link="6019001_122002.xml">Dec</m>
<year yrid="2003">...

I am at a loss as to how to proceed without any delimiters, and I have
no control over the format of the data I receive.

Any help gratefully accepted.




You can use something like:
StreamReader reader = new StreamReader(path);  // path is a placeholder for your input file
string codeid, year;
string line = reader.ReadLine();
codeid = line.Substring(0, 7);
year = line.Substring(7, 4);
or:
FileStream fs = File.OpenRead(path);
byte[] codeid = new byte[7];
int i = fs.Read(codeid, 0, 7);  // offset 0 into the buffer; i = bytes actually read

Ignacio Machin ( .NET/ C# MVP )


Are there always going to be 4 months at the end?

If so, it's a piece of cake:
line.Substring( line.Length - 8 ) // the last 8 digits (four 2-digit months)
line.Substring( line.Length - 12, 4 ) // year
line.Substring( 0, line.Length - 12 ) // code

Of course, you should first check that line.Length is larger than the offsets used.

There may be a cleaner solution, but in any case the idea is to start from
the end. All of this assumes you know FOR SURE that there are always 4
months; otherwise it becomes more difficult.
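
A minimal sketch of the from-the-end idea with the length check folded in, assuming every record really does end in exactly four 2-digit months. TailParser and SplitFromEnd are names made up for this example, and the sample record is just the values from the question run together.

```csharp
using System;

public static class TailParser
{
    // Assumption: the record always ends with four 2-digit months (8 chars),
    // preceded by a 4-digit year; whatever is left in front is the code.
    public static bool SplitFromEnd(string line, out string code,
                                    out string year, out string[] months)
    {
        code = null; year = null; months = null;
        const int MonthChars = 8, YearChars = 4;

        // Need at least one code character in front of year + months.
        if (line == null || line.Length <= MonthChars + YearChars)
            return false;

        string tail = line.Substring(line.Length - MonthChars);
        year = line.Substring(line.Length - MonthChars - YearChars, YearChars);
        code = line.Substring(0, line.Length - MonthChars - YearChars);

        // Cut the 8-character tail into four 2-character months.
        months = new string[MonthChars / 2];
        for (int i = 0; i < months.Length; i++)
            months[i] = tail.Substring(i * 2, 2);
        return true;
    }

    public static void Main()
    {
        string code, year;
        string[] months;
        if (SplitFromEnd("6019001200301020304", out code, out year, out months))
            Console.WriteLine(code + " " + year + " " + string.Join(" ", months));
        // prints: 6019001 2003 01 02 03 04
    }
}
```

Returning false instead of throwing keeps the caller in charge of what to do with a short line; that is a design choice of this sketch, not something the approach requires.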



Hi John,

First you need to get the lines from the text file (I assume there will be
no issues in doing this).

After getting the line there are two approaches you can try:
1. Using Regex
2. Using Basic String Operations (Not sure but i guess this will be more

In the first approach, you first extract the code and year and remove them
from the string. The string then contains only months, so you can use
a regular expression to get them.

string str = "60190012003010203040506070809101112";
string code = str.Substring(0, 7);
string year = str.Substring(7, 4);
str = str.Substring(11, str.Length - 11);
string pattern = @"\d{2}";
Regex reg = new Regex(pattern);
MatchCollection mc = reg.Matches(str);
if (mc.Count > 0)
{
    foreach (Match m in mc)
    {
        Console.WriteLine("Match Found: " + m.Value);
        // Write your logic to consume month
    }
}

Using string operations, this can be coded as follows:

string str = "60190012003010203040506070809101112";
if (str.Length >= 11)
{
    string code = str.Substring(0, 7);
    string year = str.Substring(7, 4);
    // step through the remaining digits two at a time
    for (int j = 11; j + 1 < str.Length; j += 2)
        Console.WriteLine("Match Found: " + str.Substring(j, 2));
}
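
The string-operation approach above could also be wrapped in a small helper that rejects malformed records. This is only a sketch: the 7-digit code and 4-digit year widths are taken from the sample string, FixedWidth and Months are made-up names, and List<int> assumes .NET 2.0 or later (an ArrayList would work the same way on older versions).

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidth
{
    // Assumed layout: 7-char code, 4-char year, then any number of 2-digit months.
    public static List<int> Months(string record, out string code, out string year)
    {
        const int CodeLen = 7, YearLen = 4, Header = CodeLen + YearLen;
        if (record == null || record.Length < Header)
            throw new FormatException("record shorter than code + year");
        if ((record.Length - Header) % 2 != 0)
            throw new FormatException("month section has an odd number of digits");

        code = record.Substring(0, CodeLen);
        year = record.Substring(CodeLen, YearLen);

        // Walk the rest of the record two characters at a time.
        List<int> months = new List<int>();
        for (int j = Header; j < record.Length; j += 2)
        {
            int month = int.Parse(record.Substring(j, 2));
            if (month < 1 || month > 12)
                throw new FormatException("month out of range: " + month);
            months.Add(month);
        }
        return months;
    }

    public static void Main()
    {
        string code, year;
        List<int> months = Months("60190012003010203040506070809101112", out code, out year);
        Console.WriteLine(code + " " + year + " " + months.Count + " months");
        // prints: 6019001 2003 12 months
    }
}
```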

Hope it will help.

Describe your input with a regular expression:

static void Main(string[] args)
{
    string[] lines = new string[]
    {
        // one undelimited input record per element
    };

    Regex record = new Regex(
        @"^" +
        @"(?<code>\d{7})" +
        @"(?<year>\d{4})" +
        @"(?<months>\d\d)+" +
        @"$");

    XmlDocument doc = new XmlDocument();
    XmlElement root = doc.CreateElement("codes");
    doc.AppendChild(root);

    foreach (string line in lines)
    {
        Match m = record.Match(line);

        if (!m.Success)
        {
            Console.Error.WriteLine("no match (" + line + ")");
            continue;
        }

        // reuse the <code> element if this code was seen on an earlier line
        string code = m.Groups["code"].ToString();
        string codexpath = "/codes/code[@id = '" + code + "']";
        XmlNode codeelt = doc.SelectSingleNode(codexpath);
        if (codeelt == null)
        {
            XmlElement elt = doc.CreateElement("code");
            elt.SetAttribute("id", code);
            root.AppendChild(elt);
            codeelt = elt;
        }

        string year = m.Groups["year"].ToString();
        XmlElement yearelt = doc.CreateElement("year");
        yearelt.SetAttribute("yrid", year);
        codeelt.AppendChild(yearelt);

        // one capture per repetition of the months group
        foreach (Capture mm in m.Groups["months"].Captures)
        {
            string mmm;
            int month = int.Parse(mm.ToString());

            if (month >= 1 && month <= 12)
                mmm = DateTimeFormatInfo.InvariantInfo.MonthNames[month - 1].Substring(0, 3);
            else
                mmm = "???";

            XmlElement melt = doc.CreateElement("m");
            melt.SetAttribute("id", month.ToString());
            melt.SetAttribute("link", code + "_" + mm + year + ".xml");
            melt.InnerText = mmm;
            yearelt.AppendChild(melt);
        }
    }

    XmlTextWriter w = new XmlTextWriter(Console.Out);
    w.Formatting = Formatting.Indented;
    w.Indentation = 2;
    doc.WriteTo(w);
    w.Flush();
}
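
In case the repeated (?<months>\d\d)+ group looks odd: a quantified group only reports its last repetition through Value, but every repetition is kept in Captures. A small sketch of just that part, with a sample line built from the values in the original question; CaptureDemo and MonthCaptures are made-up names, and the closing $ anchor is an assumption of this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    public static string[] MonthCaptures(string line)
    {
        Regex record = new Regex(@"^(?<code>\d{7})(?<year>\d{4})(?<months>\d\d)+$");
        Match m = record.Match(line);
        if (!m.Success)
            return new string[0];

        // Group.Value would only give the last repetition;
        // Group.Captures keeps every repetition, in order.
        CaptureCollection caps = m.Groups["months"].Captures;
        string[] result = new string[caps.Count];
        for (int i = 0; i < caps.Count; i++)
            result[i] = caps[i].Value;
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", MonthCaptures("6019001200301020304")));
        // prints: 01 02 03 04
    }
}
```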

Hope this helps,

I might have use for this technique for $work, so as an exercise, I
wrote an XmlReader for the input format the OP described. It's
certainly not a complete (or even halfway polished) implementation, but
I hope someone will find value in it.

I welcome any comments.

using System;
using System.Collections;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml;

namespace FunkyReader
public class FunkyReader : XmlReader
private NameTable nametable = new NameTable();
private ReadState state;
private Codes codes;
private ArrayList dfs;
private int node;

public FunkyReader(string[] lines)
codes = new Codes();
state = ReadState.Initial;
ParseLines(lines);


public ArrayList Linearization
get { return dfs; }

#region XmlReader methods

public override int AttributeCount
get { return -1; }

public override string BaseURI
get { return ""; }

public override void Close() {}

public override int Depth
get { return -1; }

public override bool EOF
get { return false; }

public override string GetAttribute(int i) { return null; }
public override string GetAttribute(string name) { return null; }
public override string GetAttribute(string name, string namespaceURI)
return null;

public override bool HasValue
get { return false; }

public override bool IsDefault
get { return false; }

public override bool IsEmptyElement
Node n = (Node) dfs[node];
Node next = (Node) dfs[node+1];

return n.NodeType == Node.Type.Start && next.NodeType == Node.Type.End;

public override string LocalName
Node n = (Node) dfs[node];

switch (n.NodeType)
case Node.Type.Start:
case Node.Type.End:
case Node.Type.Attribute:
return n.Name;
return "";

public override string LookupNamespace(string prefix) { return null; }
public override void MoveToAttribute(int i) {}
public override bool MoveToAttribute(string name) { return false; }
public override bool MoveToAttribute(string name, string ns) { return false; }
public override bool MoveToElement() { return false; }
public override bool MoveToFirstAttribute() { return false; }

public override bool MoveToNextAttribute()
Node next = (Node) dfs[node+1];

if (next.NodeType == Node.Type.Attribute)
return true;
return false;

public override string Name
get { return LocalName; }

public override string NamespaceURI
get { return ""; }

public override XmlNameTable NameTable
get { return nametable; }

public override XmlNodeType NodeType
if (node >= dfs.Count)
return XmlNodeType.None;

Node n = (Node) dfs[node];
switch (n.NodeType)
case Node.Type.Attribute:
return XmlNodeType.Attribute;
case Node.Type.Start:
return XmlNodeType.Element;
case Node.Type.End:
return XmlNodeType.EndElement;
case Node.Type.Text:
return XmlNodeType.Text;
return XmlNodeType.None;

public override string Prefix
get { return null; }

public override char QuoteChar
get { return '"'; }

public override bool Read()
if (state == ReadState.Initial)
state = ReadState.Interactive;
node = 0;

return node < dfs.Count;

public override bool ReadAttributeValue()
Node n = (Node) dfs[node];

if (n.NodeType == Node.Type.Attribute)
return true;
return false;

public override ReadState ReadState
get { return ReadState.EndOfFile; }

public override void ResolveEntity() {}

public override string this[int i]
get { return null; }

public override string this[string name, string namespaceURI]
get { return null; }

public override string this[string name]
get { return null; }

public override string Value
return ((Node) dfs[node]).Value;

public override string XmlLang
get { return null; }

public override XmlSpace XmlSpace
get { return XmlSpace.None; }


#region parse input

private void ParseLines(string[] lines)
Regex record = new Regex(
@"^" +
@"(?<code>\d{7})" +
@"(?<year>\d{4})" +
@"(?<months>\d\d)+" +
@"$");

foreach (string line in lines)
Match m = record.Match(line);

string code = m.Groups["code"].ToString();
string year = m.Groups["year"].ToString();
foreach (Capture mm in m.Groups["months"].Captures)
AddMonth(code, year, mm.ToString());

dfs = new ArrayList();
dfs.Add(new Node("codes", Node.Type.Start));
foreach (Code c in codes)
c.Linearize(dfs);
dfs.Add(new Node("codes", Node.Type.End));

private void AddMonth(string code, string year, string mm)
codes[code][year].Add(new Month(code, year, mm));


#region Node class

public class Node
public enum Type { Start, Attribute, Text, End };

private string name;
private Type type;

public Node(string name, Type type)
{
this.name = name;
this.type = type;
}

public string Name
get { return name; }

public string Value
get { return name; }

public Type NodeType
get { return type; }


#region various element representations

class Codes
private ArrayList codes = new ArrayList();

public Code this[string code]
Code c = null;
for (int i = 0; i < codes.Count; i++)
if (((Code) codes[i]).ID == code)
c = (Code) codes[i];

if (c != null)
return c;
codes.Add(c = new Code(code));
return c;

public IEnumerator GetEnumerator()
return codes.GetEnumerator();

class Code
string id;
private ArrayList years = new ArrayList();

public Code(string name)
id = name;

public Year this[string year]
Year y = null;
for (int i = 0; i < years.Count; i++)
if (((Year) years[i]).ID == year)
y = (Year) years[i];

if (y != null)
return y;
years.Add(y = new Year(year));
return y;

public string ID
get { return id; }

public void Linearize(ArrayList record)
record.Add(new Node("code", Node.Type.Start));
record.Add(new Node("id", Node.Type.Attribute));
record.Add(new Node(ID, Node.Type.Text));
foreach (Year y in years)
y.Linearize(record);
record.Add(new Node("code", Node.Type.End));

class Year
string yrid;
ArrayList months = new ArrayList();

public Year(string year)
yrid = year;

public string ID
get { return yrid; }

public void Add(Month m)
months.Add(m);

public void Linearize(ArrayList record)
record.Add(new Node("year", Node.Type.Start));
record.Add(new Node("yrid", Node.Type.Attribute));
record.Add(new Node(yrid, Node.Type.Text));
foreach (Month m in months)
m.Linearize(record);
record.Add(new Node("year", Node.Type.End));

class Month
private int month;  // i.e., 1-12
private string link;

public Month(string code, string year, string mm)
this.month = int.Parse(mm);
this.link = code + "_" + mm + year + ".xml";

public int ID
get { return month; }

public string Link
get { return link; }

public string MonthShortName
return DateTimeFormatInfo.InvariantInfo.MonthNames[month-1].Substring(0,3);

public void Linearize(ArrayList record)
record.Add(new Node("m", Node.Type.Start));
record.Add(new Node("id", Node.Type.Attribute));
record.Add(new Node(month.ToString(), Node.Type.Text));
record.Add(new Node("link", Node.Type.Attribute));
record.Add(new Node(link, Node.Type.Text));
record.Add(new Node(MonthShortName, Node.Type.Text));
record.Add(new Node("m", Node.Type.End));



Well, not complete, but I *can* load the OP's input lines into an
XML document and get the expected output:

string[] lines =

FunkyReader r = new FunkyReader(lines);

XmlDocument xml = new XmlDocument();
xml.Load(r);

XmlTextWriter w = new XmlTextWriter(Console.Out);
w.Formatting = Formatting.Indented;
w.Indentation = 2;
xml.WriteTo(w);



