VB.NET - Parsing URL for Variable

  • Thread starter Jozef Jarosciak

Jozef Jarosciak

Hi everyone,

I am building a web crawler, and one of the features I need to
include is removing a specified 'variable + value' pair from the URL.

For example, say the user wants to remove the variable "s".

Take this URL:
"http://www.goldenretrieverforum.com/search.php?s=5817617a59fb630a7f40846e4a29efc1&do=getdaily"

It has a variable 's' with its value, plus some other variables.

I need code that would shorten that URL to this:
"http://www.goldenretrieverforum.com/search.php?do=getdaily"
, removing variable 's' completely.



But it needs to be smart enough that if variable 's' is the
last variable in the link, like this:

"http://www.goldenretrieverforum.com/search.php?s=5817617a59fb630a7f40846e4a29efc1"

, it would correctly shorten it to:
"http://www.goldenretrieverforum.com/search.php"


Can someone help me write a regex, or point me to a site which has such
a regex written already?

Or is there any other way to do this?

Thanks a lot for your time and help.

Joe
 

Guest

Jozef Jarosciak said:
Hi everyone,

 

Jozef Jarosciak

I am sorry, Sahuagin, but it looks like you just quoted my text and there
was no reply.
Joe
 

Guest

Hmmm, apologies. I had written a reply; not sure what happened to it...

If I were you, I would probably write a simple class to handle URLs, if
there isn't one like this in .NET already.

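Imports System.Collections.Generic

Class CURLParameter
    ' Public so that CURL can fill them in.
    Public mName As String
    Public mValue As String
End Class

Class CURL
    Private mURL As String ' everything before the '?'
    ' A generic List stands in for CURLParameterCollection, which is not defined here.
    Private mParameters As New List(Of CURLParameter)()

    Public Sub New(ByVal pURL As String)
        ' Search the string for a '?'; everything before it goes into mURL.
        Dim q As Integer = pURL.IndexOf("?"c)
        If q < 0 Then
            mURL = pURL ' no parameters at all
            Return
        End If
        mURL = pURL.Substring(0, q)

        ' Each piece between '&' separators (or up to the end) is one parameter.
        For Each pair As String In pURL.Substring(q + 1).Split("&"c)
            If pair.Length = 0 Then Continue For
            Dim p As New CURLParameter()
            ' Text before the '=' is the name; text after it is the value.
            Dim eq As Integer = pair.IndexOf("="c)
            If eq < 0 Then
                p.mName = pair
                p.mValue = ""
            Else
                p.mName = pair.Substring(0, eq)
                p.mValue = pair.Substring(eq + 1)
            End If
            mParameters.Add(p)
        Next
    End Sub
End Class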
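
For the specific case in the original question, the same idea (split at the '?', then at each '&') can be sketched as a single function without a regex. This is only a rough sketch: the names UrlTools and RemoveQueryParameter are made up for illustration, the name match is case-sensitive, and it assumes the URL has no '#fragment' part after the query string.

```vbnet
Imports System
Imports System.Collections.Generic

Module UrlTools
    ' Hypothetical helper (not from the thread): returns url with every
    ' query parameter named paramName removed.
    ' Assumes there is no '#fragment' after the query string.
    Public Function RemoveQueryParameter(ByVal url As String, ByVal paramName As String) As String
        Dim q As Integer = url.IndexOf("?"c)
        If q < 0 Then Return url ' no query string, nothing to remove

        Dim basePart As String = url.Substring(0, q)
        Dim kept As New List(Of String)()

        For Each pair As String In url.Substring(q + 1).Split("&"c)
            If pair.Length = 0 Then Continue For

            ' The name is the text before '=' (or the whole piece if there is no '=').
            Dim name As String = pair
            Dim eq As Integer = pair.IndexOf("="c)
            If eq >= 0 Then name = pair.Substring(0, eq)

            ' Keep every pair whose name differs from the one being removed.
            If Not String.Equals(name, paramName, StringComparison.Ordinal) Then kept.Add(pair)
        Next

        ' If nothing is left, drop the '?' as well.
        If kept.Count = 0 Then Return basePart
        Return basePart & "?" & String.Join("&", kept.ToArray())
    End Function
End Module
```

Called with the two URLs from the question and "s" as the name, this should give "http://www.goldenretrieverforum.com/search.php?do=getdaily" and "http://www.goldenretrieverforum.com/search.php" respectively. .NET also ships System.Uri and System.UriBuilder for taking a URL apart, which may be worth a look before writing a full class by hand.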
 
