14 million lexicons

ryu

Hi,

I am just curious. In the paper by the Google founders, they said they were
able to load a lexicon of 14 million words into 256MB of memory. How did they
do that? Is there any way it can be done using .NET or C++?

Regards
Ryu
 
Hi

People used to be able to fit a whole game, a word processor and a
spreadsheet on one floppy...

Do the math and you will see you still have more than 19 bytes available for
each word: 256MB = 268,435,456 bytes, and 268,435,456 / 14,000,000 is about
19.2 bytes per word.
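
For illustration, a rough sketch of why the naive approach eats that budget
before storing a single character (exact sizes vary by compiler and standard
library, so treat the numbers as typical, not definitive):

#include <iostream>
#include <string>

int main() {
    // The bookkeeping of one std::string object alone (pointer, size,
    // capacity / small-string buffer) is typically 24-32 bytes on a
    // 64-bit platform - more than the whole ~19-byte-per-word budget,
    // before counting any heap block or container node.
    std::cout << "sizeof(std::string) = " << sizeof(std::string) << '\n';
}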

But, indeed, I would also like to know how memory alignment is done in .NET
with classes. Does anybody know of a (correct) in-depth article about the
internal memory layout?
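
I don't have a definitive .NET reference either, but the padding effect
itself is easy to demonstrate in C++ (the sizes below are what a typical
64-bit compiler produces; note the CLR, unlike C++, may also reorder class
fields):

#include <iostream>

struct Padded { char a; int b; char c; };  // holes after 'a' and 'c'
struct Packed { int b; char a; char c; };  // same fields, reordered

int main() {
    std::cout << sizeof(Padded) << '\n';  // typically 12
    std::cout << sizeof(Packed) << '\n';  // typically 8
}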

kind regards

Alexander
 
256MB = 256,000,000 bits (apprx)
14 million = 14,000,000.

Divide the two - apprx 18 bits per entry.

If the information for an individual entry fits in less than 18 bits, then
it's possible; otherwise it's not. (Were they simply storing pointers to the
information?)

Yes, it's possible to do the above in C++. The real question is: why would
you want to? Memory is cheap!

- Sahil Malik
http://dotnetjunkies.com/weblog/sahilmalik
 
Sahil Malik said:
256MB = 256,000,000 bits (apprx)

Nope - 256MB is 256,000,000 *bytes* - so you get about 18 *bytes* per
entry.

18 *bits* per entry would have been far harder to do.
 
How am I able to put 14 million terms into 256MB? I am only able to put 2
million into approximately 120MB, and I am using a hashtable. What can I use
besides a hash table?
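
The usual culprit is per-string overhead: every entry in a hashtable of
string objects pays for an object header, a separate heap block and a bucket
or node, which easily triples the cost of the characters themselves. If I
remember the paper right, they stored the words concatenated and
null-separated, with a table of pointers into that list. Below is a sketch
of mine in that spirit (not the paper's exact structure): one contiguous
buffer plus a sorted offset array, costing word length + 1 null byte +
4 offset bytes per word, well inside the ~19-byte budget.

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>
#include <string_view>
#include <vector>

// Compact lexicon: every word lives in one contiguous buffer, and a
// sorted array of 32-bit offsets replaces per-string objects.
class Lexicon {
    std::string pool_;                    // words, each ending in '\0'
    std::vector<std::uint32_t> offsets_;  // start of each word
public:
    // Words must be added in sorted order for find() to work.
    void add(std::string_view w) {
        offsets_.push_back(static_cast<std::uint32_t>(pool_.size()));
        pool_.append(w);
        pool_.push_back('\0');
    }
    std::string_view word(std::size_t i) const {
        return std::string_view(pool_.data() + offsets_[i]);  // up to '\0'
    }
    // Binary search; returns the word's index, or -1 if absent.
    long find(std::string_view w) const {
        long lo = 0, hi = static_cast<long>(offsets_.size()) - 1;
        while (lo <= hi) {
            long mid = lo + (hi - lo) / 2;
            int c = word(mid).compare(w);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }
};

int main() {
    Lexicon lex;
    for (std::string_view w : {"apple", "banana", "cherry"}) lex.add(w);
    std::cout << lex.find("banana") << '\n';  // 1
    std::cout << lex.find("durian") << '\n';  // -1
}

A binary search is O(log n) rather than a hashtable's O(1), but for a
dictionary that is built once and then only read, the memory saved usually
matters more.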
 
Jon said:
Nope - 256MB is 256,000,000 *bytes* - so you get about 18 *bytes* per
entry.

Correct me if I'm wrong, but 256MB is 256*1024*1024 bytes ;)
That is 268,435,456 (a few more words fit that way) :)
 