hbase-user mailing list archives

From 罗磊 <luole...@gmail.com>
Subject idea about web page database
Date Sat, 24 Jul 2010 03:27:11 GMT

I'm trying to design a database to store web pages for a search engine.
Can you give me some advice on this?

I read the Bigtable paper. Google gives Webtable as an example, but it
confuses me a little. Google shows how www.cnn.com is stored, but if I have
2 pages named www.cnn.com/a.html and www.cnn.com/b.html, I don't know
whether or not to store the 2 pages in one row.

Google's paper says "In Webtable, we would use URLs as row keys, various
aspects of web pages as column names, and store the contents of the web
pages in the contents: column". It seems Google would use the domain name as
the row key and store a.html and b.html as column names. But with that
layout the anchor design seems impossible: how can users tell which page,
a.html or b.html, an anchor text refers to?
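
To make the question concrete, here is a minimal Python sketch of the other
reading, where each full URL gets its own row. It assumes the row key is the
URL with the hostname components reversed (the Bigtable paper describes this
for keeping pages of one domain adjacent), and that an anchor is stored in
the row of the page it points to, with the referring page as the column
qualifier. The `table` dict, the `row_key` and `put` helpers, and the
referrer `example.org/links.html` are made up for illustration; this is not
HBase API code.

```python
from urllib.parse import urlsplit

def row_key(url):
    """Build a row key from a URL with the hostname components reversed."""
    # Prefix scheme-less URLs with "//" so urlsplit sees a hostname.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = ".".join(reversed(parts.hostname.split(".")))
    return host + parts.path

# In-memory stand-in for a table: row key -> {column name -> value}.
table = {}

def put(url, column, value):
    """Store one cell in the row that belongs to the given URL."""
    table.setdefault(row_key(url), {})[column] = value

# One row per page, so a.html and b.html never share a row.
put("www.cnn.com/a.html", "contents:", "<html>page a</html>")
put("www.cnn.com/b.html", "contents:", "<html>page b</html>")

# The anchor lives in the row of the page it points to (a.html here);
# the column qualifier names the hypothetical referring page.
put("www.cnn.com/a.html", "anchor:example.org/links.html", "Story A")

# Sorted row keys keep pages of the same domain next to each other.
for key in sorted(table):
    print(key, sorted(table[key]))
```

With this layout the target of an anchor is never ambiguous, because the
anchor cell sits in the target page's own row.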

Luo Lei
