Pattern Recognition

  • Thread starter: Aaron

Aaron

Is there a public program/algorithm that can tell me the key points of a
text?

For example I entered the following text:

Web logs, or blogs, the online personal diaries where big names and no names
expound on everything from pets to presidents, are going mainstream. While
still a relatively small piece of total online activity, blogging has caught
on with affluent young adults. As Forrester Research analysts recently
noted, blogging will become increasingly common as these consumers age.

The program should give me the main keywords, such as: blog, online, people...

I know some spam filters do this, and maybe Google does too?
I don't need it to be super accurate, just enough to demonstrate that this is
possible.

Any help is greatly appreciated,
Aaron
 
key points

If the program were to process this post, it would output keywords like:
programming, internet...
 
Hello Aaron,

Fascinating. Simple keyword counting produces almost nothing. The only
content words that occur more than once are "blogging," "names," and
"online." The word "people" that you include in your list of keywords doesn't
occur in the paragraph at all.
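
The count is easy to check with a short script. This is only a minimal sketch:
the stop-word list is a small hand-picked assumption, and the function name is
made up for illustration. Run on Aaron's sample paragraph, it finds just three
repeated words ("blogging", "names", and "online"), each appearing twice, and
"people" never shows up.

```python
import re
from collections import Counter

# Aaron's sample paragraph, copied from the original post.
TEXT = (
    "Web logs, or blogs, the online personal diaries where big names and no "
    "names expound on everything from pets to presidents, are going "
    "mainstream. While still a relatively small piece of total online "
    "activity, blogging has caught on with affluent young adults. As "
    "Forrester Research analysts recently noted, blogging will become "
    "increasingly common as these consumers age."
)

# Assumption: a tiny hand-picked stop-word list; a real one would be longer.
STOP_WORDS = {
    "a", "and", "are", "as", "from", "has", "no", "of", "on", "or",
    "the", "these", "to", "where", "while", "will", "with",
}

def repeated_words(text, min_count=2):
    """Return the non-stop words that occur at least min_count times."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return {w: n for w, n in counts.items() if n >= min_count}

print(repeated_words(TEXT))
# "people" is absent because it never occurs in the paragraph.
print("people" in TEXT.lower())  # False
```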

You would need an algorithm that creates a contextual map through a lexical
tree and produces, effectively, an "understanding" of the key concept of the
paragraph. In other words, you are entering the field of Computational
Linguistics.
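
A real lexical analysis is far beyond a short snippet, but the very first step
in that direction, grouping inflected forms such as "blogs" and "blogging"
under one stem, can be sketched. Everything here is an assumption made for
illustration: the two suffix rules are hand-written and naive (this is not a
real stemmer such as Porter's), and the stop-word list is hand-picked.

```python
import re
from collections import Counter

# Aaron's sample paragraph, copied from the original post.
TEXT = (
    "Web logs, or blogs, the online personal diaries where big names and no "
    "names expound on everything from pets to presidents, are going "
    "mainstream. While still a relatively small piece of total online "
    "activity, blogging has caught on with affluent young adults. As "
    "Forrester Research analysts recently noted, blogging will become "
    "increasingly common as these consumers age."
)

# Assumption: a tiny hand-picked stop-word list.
STOP_WORDS = {
    "a", "and", "are", "as", "from", "has", "no", "of", "on", "or",
    "the", "these", "to", "where", "while", "will", "with",
}

def naive_stem(word):
    """Strip "-ing" or a plural "-s"; deliberately crude and English-only."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        # Undo consonant doubling left by "-ing": "blogg" -> "blog".
        if len(word) > 2 and word[-1] == word[-2]:
            word = word[:-1]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word

def top_stems(text, n=3):
    """Count stems of the non-stop words and return the n most frequent."""
    words = re.findall(r"[a-z]+", text.lower())
    stems = (naive_stem(w) for w in words if w not in STOP_WORDS)
    return Counter(stems).most_common(n)

# "blog" is counted 3 times; "online" and "name" 2 times each.
print(top_stems(TEXT))
```

Even this crude grouping surfaces "blog" and "online" from Aaron's wish list,
but it still cannot produce "people", which takes the kind of contextual
understanding described above.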

There is some fascinating research on Natural Language Processing that began
in the late 80s (and continues today) that addresses many of these ideas.
I'm sure that some of the current "search" research has raised interest
further. Microsoft Research, IBM Research, and others are very much
interested in these areas.

One example would be the Text Mining project at IBM:
http://www.trl.ibm.com/projects/textmining/index_e.htm

A good link for coding systems that follow some of these practices is here:
http://www.cl.cam.ac.uk/Research/NL/anlt.html

There is WAY too much involved, morphologically, lexically, and
linguistically, to demonstrate even the simplest of these algorithms in a
newsgroup message. Start at your local college library and/or search Google
for "Natural Language Processing." Go from there.

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
 
Nick, thanks for the info. I'll keep looking and let you guys know.

The best example I can think of right now is Google's AdSense. It does a
very good job of analyzing a webpage and pinpointing its central meaning.
Any idea what method they use?
 
It sounds like you're interested in text mining. Try KDnuggets
(http://www.kdnuggets.com) in the "Software" section.

-Will Dwinnell
http://will.dwinnell.com
 