dynamically find keywords.

kakriarohit

Hi,

Is there any way to find the keywords of an article? I need to find related articles in an article directory.

E.g. I have xpode.com and I need to get the related articles for an article. Is there any algorithm or method that can find related articles? I searched Google, but it did not show the results I expected.

I think a bit of artificial intelligence is needed for this.

Thanks,
Rohit
 
jack

Thanks Pete,

I agree with this. As I thought, it needs some logic that can pick out
the keywords of an article. I am working on it. If I find a solution, I
will share it in the group.

Regards,
Rohi
 
Brian Cryer

It's actually very easy to extract a list of keywords (and key-phrases) from
an article. The gist of it is that you parse the article stripping out a
word at a time.
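
As a minimal sketch of that word-at-a-time parse (in Python rather than the VB.NET mentioned further down; the function name `extract_words` and the rule that a word is a run of letters with optional internal apostrophes are assumptions made for this example):

```python
import re

def extract_words(text):
    """Return the words of `text` in document order, lowercased."""
    # Assumption: a word is a run of letters, optionally joined by
    # internal apostrophes (so "it's" stays one word).
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

words = extract_words("It's easy: parse the article, one word at a time.")
```

Everything that is not a letter (punctuation, digits, whitespace) simply acts as a separator here; a real parser may want to treat numbers and URLs differently.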

If you are looking for key-phrases then it becomes a little more complex, but
not much. The main things to consider are how to determine when you've
reached the end of a phrase (a comma or full stop are examples) and what
maximum length of key-phrase you want to allow.
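
One way the key-phrase idea could look in Python, as a sketch only: punctuation ends a phrase, and every run of 2 to `max_words` consecutive words inside a segment becomes a candidate. The name `extract_phrases`, the exact punctuation set and the default of 3 words are assumptions for illustration.

```python
import re

def extract_phrases(text, max_words=3):
    """Return candidate key-phrases of 2..max_words consecutive words."""
    phrases = []
    # Assumption: any of these punctuation marks (or a newline) ends a phrase.
    for segment in re.split(r"[.,;:!?()\n]+", text.lower()):
        words = re.findall(r"[a-z0-9']+", segment)
        for size in range(2, max_words + 1):
            for start in range(len(words) - size + 1):
                phrases.append(" ".join(words[start:start + size]))
    return phrases

candidates = extract_phrases("Keyword extraction is easy, honestly.", max_words=2)
```

Phrases never span a comma or full stop, so "easy honestly" is not produced. Candidates still contain filler such as "extraction is"; filtering those out is a separate step.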

Other things to throw into the mix:

* Dead words. These are words which you know are never going to be
significant. This list might depend on your purpose for wanting keywords,
but the following are words in my dead word list:

"after all also am an and any are aren as at be" +
" because been before between big blah both but by can come did" +
" do does each etc far few fix for from get go got had has" +
" have he her hers hid him his how however ie if in into is" +
" it its just led less let many may me more most much must" +
" my near new next no none nor not now of off oh old on only" +
" onto or other our out over per put same say she since so some" +
" such than that the their them then there these they this" +
" those til to too try under unto up upon us very vs was" +
" way we were what when where whether which who why will with" +
" within without yes yet you your"

* Keyword frequency. This gives you an idea of which keywords/keyphrases are
significant in an article.

* Keyword location. Keywords in headings or perhaps nearer the top might be
more significant.

* Synonyms.

* Numbers.

* URLs.
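
A rough Python sketch of how the dead word, frequency and location ideas above might fit together. It is not the VB.NET code referred to in this post: the function name `keyword_scores`, the `title_weight` of 3, and the choice to simply drop URLs and numbers are all assumptions made for the example, and `DEAD_WORDS` is only a shortened subset of the list above.

```python
import re
from collections import Counter

# Shortened subset of the dead word list, for illustration only.
DEAD_WORDS = set(
    "an and are as at be but by for from has have in is it its "
    "of on or that the this to was with".split()
)

def keyword_scores(title, body, title_weight=3):
    """Score words by frequency, counting title words more heavily."""
    scores = Counter()
    for text, weight in ((title, title_weight), (body, 1)):
        # Assumption: URLs are noise, so remove them before splitting.
        text = re.sub(r"https?://\S+", " ", text.lower())
        # Letters only, so numbers are skipped as well.
        for word in re.findall(r"[a-z]+", text):
            if len(word) > 1 and word not in DEAD_WORDS:
                scores[word] += weight
    return scores

scores = keyword_scores(
    "Finding keywords",
    "Keywords are found by counting words. "
    "See http://example.com for 3 examples of keywords.",
)
top = scores.most_common(1)  # [("keywords", 5)]: 3 from the title, 2 from the body
```

Synonyms are not handled here; that would need some kind of word mapping on top of this.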

Of course once you've got your list of keywords - and it shouldn't take you
many hours to knock up a working parser - you then have the issue of how you
want to use and store that information, but that's going beyond your
original question.
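
For the related-articles part of the original question, one simple option is to compare the keyword sets of two articles. The sketch below uses Jaccard similarity (shared keywords divided by total distinct keywords), which is a swapped-in technique and not something described in this thread; the names `jaccard` and `related_articles` and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Share of keywords two articles have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def related_articles(target, keywords_by_article, limit=5):
    """Return up to `limit` article names, most similar to `target` first."""
    target_keywords = keywords_by_article[target]
    scored = []
    for name, keywords in keywords_by_article.items():
        if name == target:
            continue
        score = jaccard(target_keywords, keywords)
        if score > 0:  # ignore articles with no keyword in common
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]

# Hypothetical keyword sets, e.g. the top keywords found for each article.
articles = {
    "a": {"python", "parser", "keywords"},
    "b": {"keywords", "parser", "vb"},
    "c": {"cooking", "pasta"},
    "d": {"keywords", "seo"},
}
related = related_articles("a", articles)  # ["b", "d"]
```

Comparing every pair of articles this way gets slow for a large directory, so storing the keywords in an index keyed by keyword would be the next thing to look at.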

Sorry I can't post my code - it's in VB.NET (I've not yet got round to porting
it to C#) and it's built upon some other libraries, which means I can't easily
extract just the necessary bits. But I hope the above sets you on the right
track.
 
