k^infinity to http://kpowerinfinity.spaces.live.com/ & http://kpowerinfinity.wordpress.com

Pushing the limits ... to infinity! This blog has now been split into two. My personal blog is now located at Live Spaces and my more technical blog is located at WordPress.

Thursday, January 06, 2005

A Scholar

You must be wondering why I am writing about Google Scholar now, when everyone who could possibly care already knows about the service. The reason is that today I met the architect of Google Scholar, Anurag Acharya, who is an IIT Kharagpur alumnus. He was in the institute and agreed to deliver a talk to the students.

He first introduced us to Google search and the technology behind it: the challenges at various stages of crawling, such as URL redirection, dynamic pages, data replication and load balancing; the subsequent generation of indices; and the serving of each query by thousands of computers connected in clusters in strategically located data centres all over the world. He rightly claimed that it was perhaps the largest distributed data system in operation.

He went on to demonstrate the usefulness of Google Scholar, which is mainly aimed at researchers. Acharya describes it as payback to the universities from which Google initially emerged. He also anticipates another class of users who might be seeking expert knowledge in fields such as medicine.

He also elaborated on the difficulties faced during the creation of Google Scholar: convincing publishers to part with data about research papers, crawling the papers and mining useful information out of them, indexing them on the basis of relevance and the number of citations of each paper, and then using Google's search system to display results. There were several challenges, since Scholar has to throw up very specific results and must not show the same paper twice on the results page. This meant identifying the authors, keywords, citations etc. on each page, which is no mean task since scientists tend to be sloppy at times and the same paper may be published in many places.
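To get a feel for the duplication problem, here is a toy sketch in Python. It is only my own illustration and not how Google Scholar actually works: the record fields (title, citations, source), the function names, and the rule of matching copies by a normalized title are all assumptions I made up for the example.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase the title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return " ".join(cleaned.split())


def merge_and_rank(records):
    """Group copies of the same paper, then sort papers by citation count.

    Assumption: two records are the same paper if their normalized
    titles match. A real system would need far more evidence than this.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_title(rec["title"])].append(rec)

    merged = []
    for copies in groups.values():
        # Keep the copy with the highest citation count as the representative.
        best = max(copies, key=lambda r: r["citations"])
        merged.append({
            "title": best["title"],
            "citations": best["citations"],
            "sources": sorted(r["source"] for r in copies),
        })

    # Most cited first; ties are broken alphabetically by title.
    merged.sort(key=lambda r: (-r["citations"], r["title"]))
    return merged


if __name__ == "__main__":
    # Hypothetical records: the first two are the same paper from two places.
    sample = [
        {"title": "A Study of Crawling.", "citations": 10, "source": "publisher"},
        {"title": "a study of  crawling", "citations": 12, "source": "preprint"},
        {"title": "Index Serving", "citations": 30, "source": "publisher"},
    ]
    for paper in merge_and_rank(sample):
        print(paper["citations"], paper["title"], paper["sources"])
```

Even this toy version shows why sloppy metadata hurts: a single typo in a title would split one paper into two results.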

What I really admired, though, was the magnificent setup Google has created. It is built from ordinary personal computers running Linux, and is extensible, fault-tolerant, and really, really fast. It is amazing just to learn how many processes run in parallel behind the scenes. A very inspiring lecture.

---

I broke my resolution today because of the lecture.

Current Mood: Inspired
Current Music: Hum hain iss pal yahan, from Kisna

Visit Google Scholar at http://scholar.google.com

7 Comments:

  • At 9:55 pm, Anonymous Anonymous said…

    *lol*
    Nice resolution Kpower. I might make something like that next year. 

    Posted by suneet

     
  • At 10:39 am, Anonymous Anonymous said…

    Hi KK-

    A nice post.

    Thanks
    Chirayu
     

    Posted by Chirayu

     
  • At 10:44 am, Blogger Chirayu said…

    Hi KK-

    An informative post.

    Did you have an interactive Q&A session after the lecture?

    Thanks,
    Chirayu

     
  • At 11:37 am, Blogger Pallavi said…

    I think it will be very helpful.. and the whole idea is very interesting.. a search engine is very useful if it is well organized...

     
  • At 1:16 pm, Anonymous Anonymous said…

    2Suneet: lolz !

    2chirayu: yes we did get a chance to ask questions. Most were about how google maintains its data. sadly he refused to answer any questions related to SEO.

    2Pallavi: yes ! google has changed our world :) 

    Posted by kpowerinfinity

     
  • At 8:34 pm, Anonymous Anonymous said…

    Broke it already??? Yaar, it's only been 4 or 5 days since New Year and you already broke the resolution :)

    Posted by neelima

     
  • At 1:00 am, Anonymous Anonymous said…

    Google scholar is a God-send!! If you meet Mr Acharya again, please convey a huge thank you to him... from the entire engg student body!! :D 

    Posted by Diamonds

     
