ArrayList help needed


Bernie Walker

I am retrieving a list of headers from a news server. For each header I create
an array that contains its elements and add the array to an ArrayList.
After I have collected all the headers and added them to the ArrayList, I
sort the ArrayList on the 'article subject' element of the arrays. This all
works fine so far.
Now I want to take this sorted ArrayList and combine all the headers from
multipart posts into a single entry in a new ArrayList.

SA represents the sorted arrays; SA[0] is 'article subject', SA[1] is
'posted by', SA[2] is 'date posted', SA[3] is 'bytes posted', SA[4] is
'lines posted', SA[5] is 'article id'

In theory, all the 'article subjects' in a multipart post will be identical
except for the (x/y) appended to the end of the subject, which indicates
which segment it is out of the total number of segments. I have split the
'article subject' using Regex(@"(\(\d+/\d+\)$)"). This gives me a single
element if there is no (x/y) at the end of the 'article subject' and three
elements if there is (the last being "" for some odd reason).
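
For what it's worth, the trailing "" is expected behaviour rather than a bug: Regex.Split includes the text captured by a capturing group in its result, and because the pattern is anchored at the end of the subject, the piece that follows the match is an empty string. A minimal sketch, where the class name SplitDemo and the sample subject are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Same pattern as in the post: a capturing group around "(x/y)"
    // anchored at the end of the subject.
    private static readonly Regex PartSuffix = new Regex(@"(\(\d+/\d+\)$)");

    // Returns the pieces produced by splitting the subject on the suffix.
    public static string[] SplitSubject(string subject)
    {
        return PartSuffix.Split(subject);
    }
}
```

Calling SplitDemo.SplitSubject("holiday.jpg (2/5)") yields three elements: "holiday.jpg ", "(2/5)" and "". A subject with no suffix comes back as a single element.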

Is it better to use another Regex to get the numbers contained in the
"(x/y)" or String.Split?

I am guessing I will have to do something like for (int irow = 0;
irow < arraylist.Count; irow++) to walk through the sorted ArrayList.

Here is where I am stuck: how would it be best to compare the split
article subject to the previous (or next?) article subject and, when they are
the same, total the 'article bytes', total the 'article lines', and add the
'article id' to an array as element 'x' out of 'y' elements (from the
"(x/y)")? When the next split article subject does not match, add the
collected info to a new ArrayList. If there is no multipart info, just add
the existing info to the new ArrayList.

I know this should be pretty straightforward but I am just drawing a blank...

Bernie.
 

Uri Dor

Bernie said:

Is it better to use another Regex to get the numbers contained in the
"(x/y)" or String.Split?

I think one Regex and the Match method (use parentheses for grouping) is
more suitable for this than two Splits.
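
A rough sketch of that single-Regex approach, using named groups to pull out the plain subject and both numbers in one Match call. The group name plain_subject is the one used further down in this reply; the names part and total, the class SubjectParser and the method TryParse are made up for this example, and the pattern assumes the "(x/y)" suffix sits at the very end of the subject:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubjectParser
{
    // [0-9] rather than \d so that int.TryParse only ever sees ASCII digits.
    private static readonly Regex Multipart = new Regex(
        @"^(?<plain_subject>.*?)\s*\((?<part>[0-9]+)/(?<total>[0-9]+)\)$");

    // Returns true and fills the out parameters when the subject ends in "(x/y)".
    // Otherwise returns false, with plainSubject set to the unchanged subject.
    public static bool TryParse(string subject, out string plainSubject,
                                out int part, out int total)
    {
        Match m = Multipart.Match(subject);
        if (!m.Success
            || !int.TryParse(m.Groups["part"].Value, out part)
            || !int.TryParse(m.Groups["total"].Value, out total))
        {
            plainSubject = subject;
            part = 0;
            total = 0;
            return false;
        }

        plainSubject = m.Groups["plain_subject"].Value;
        return true;
    }
}
```

With this, SubjectParser.TryParse("holiday.jpg (2/5)", out s, out x, out y) returns true with s set to "holiday.jpg", x to 2 and y to 5, so no second Split is needed to get at the numbers.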

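It'll probably be something like:
// Assumes sa is the sorted ArrayList and each element is a string[]
// whose element 0 is the article subject.
int current_multipart = 0;
int current_post = 0;
Regex r = ...; // must define a named group "plain_subject"

while (current_post < sa.Count)
{
    // Plain subject of the first post in the current group.
    string group_subject =
        r.Match(((string[])sa[current_multipart])[0]).Groups["plain_subject"].Value;

    // Advance while the posts still belong to the same group.
    while (current_post < sa.Count &&
           r.Match(((string[])sa[current_post])[0]).Groups["plain_subject"].Value == group_subject)
    {
        // accumulate whatever
        current_post++;
    }
    current_multipart = current_post;
}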
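
To make the "accumulate whatever" step concrete, here is one possible sketch that sums the bytes and lines and files each article id under its part number. It is only an illustration under several assumptions: the headers are string[] laid out as SA[0] to SA[5] as described in the question, the bytes and lines fields are numeric strings, and the parts of one post end up next to each other after sorting. The names CombinedPost and HeaderCombiner are invented for the example.

```csharp
using System;
using System.Collections;
using System.Text.RegularExpressions;

// Hypothetical container for one combined entry; not from the original posts.
public class CombinedPost
{
    public string Subject;
    public string PostedBy;
    public string DatePosted;
    public long Bytes;
    public long Lines;
    public string[] ArticleIds;   // index x-1 holds the id of part x
}

public static class HeaderCombiner
{
    private static readonly Regex Multipart = new Regex(
        @"^(?<plain_subject>.*?)\s*\((?<part>[0-9]+)/(?<total>[0-9]+)\)$");

    // sorted: ArrayList of string[] (assumed layout SA[0]..SA[5]), sorted by subject.
    public static ArrayList Combine(ArrayList sorted)
    {
        ArrayList result = new ArrayList();
        CombinedPost current = null;
        string currentKey = null;   // plain subject of the open group, null if none

        foreach (string[] sa in sorted)
        {
            Match m = Multipart.Match(sa[0]);
            int part = 0, total = 0;
            // Parts numbered 0 or above the total are treated as single posts.
            bool multipart = m.Success
                && int.TryParse(m.Groups["part"].Value, out part)
                && int.TryParse(m.Groups["total"].Value, out total)
                && part >= 1 && part <= total;
            string key = multipart ? m.Groups["plain_subject"].Value : null;

            bool sameGroup = multipart && current != null && currentKey != null
                && key == currentKey && current.ArticleIds.Length == total;

            if (!sameGroup)
            {
                // Subject changed (or no multipart info): start a new entry.
                current = new CombinedPost();
                current.Subject = multipart ? key : sa[0];
                current.PostedBy = sa[1];
                current.DatePosted = sa[2];
                current.ArticleIds = new string[multipart ? total : 1];
                currentKey = key;
                result.Add(current);
            }

            current.Bytes += long.Parse(sa[3]);
            current.Lines += long.Parse(sa[4]);
            current.ArticleIds[multipart ? part - 1 : 0] = sa[5];
        }
        return result;
    }
}
```

Missing parts simply leave a null slot in ArticleIds, which makes incomplete posts easy to spot. A reposted part would overwrite the earlier id and be counted twice in the totals, so real code may want to check for that.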

HTH
 

Bernie Walker

I am using your Regex match suggestion.
I ended up using a couple of StringBuilders to accumulate the info and then
splitting them into new arrays.
Sorting the original ArrayList, breaking it apart, combining, and displaying
the results for a couple hundred thousand items happens surprisingly
quickly.
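
The post does not show the actual code, so the following is only a guess at what accumulating with a StringBuilder and splitting afterwards might look like. The class name IdAccumulator and the tab separator are assumptions for the sketch; the separator must be a character that never occurs in an article id.

```csharp
using System;
using System.Text;

public class IdAccumulator
{
    // Assumed never to appear inside an article id.
    private const char Separator = '\t';

    private readonly StringBuilder buffer = new StringBuilder();

    // Appends one article id, separated from the previous one.
    public void Add(string articleId)
    {
        if (buffer.Length > 0)
        {
            buffer.Append(Separator);
        }
        buffer.Append(articleId);
    }

    // Splits the accumulated text back into one array element per id.
    public string[] ToArray()
    {
        if (buffer.Length == 0)
        {
            return new string[0];
        }
        return buffer.ToString().Split(Separator);
    }
}
```

One accumulator per multipart post is enough: call Add for each part while the plain subject stays the same, then ToArray when the subject changes.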

Thanks for the help.

Bernie.

