Google And Its Caching Service


6 replies to this topic

#1 nirmaldaniel

    Privileged Member

  • Kontributors
  • 519 posts
  • Gender:Male
  • Location:India
  • Interests:Surfing the Internet !
  • myCENT:53.01

Posted 04 April 2009 - 10:11 AM

I am sure everyone who uses Google is aware of its caching system, and anyone who hosts a website has probably looked into it in detail. The caching service works like this: Google's crawler, the so-called "Google spider", crawls the web, takes a snapshot of each page (a kind of snapshot, I would call it), and stores it in its cache. It does not just store the snapshot; it also makes it available to the public. Everyone must have noticed that when you search for something on Google, the results show the links, and next to each link there is a small "Cached" link. Clicking it shows the cached copy of that page. Google also refreshes these cached pages at regular intervals.

Now my question is: how can this be legal? The spider also takes snapshots of copyrighted material, doesn't it? And if someone has sensitive data on a page, Google caches that too and shows it to the public. So how can caching copyrighted data be legal? Also, is there any way to stop the Google spider from entering a website so that its contents won't be cached? What I mean is: if someone hosts sensitive data, such as personal information, and doesn't want Google to cache that particular page, what should that person do?

#2 miladinoski

    Privileged Member

  • Kontributors
  • 528 posts
  • Gender:Not Telling
  • myCENT:83.32

Posted 04 April 2009 - 10:32 AM

Well, you said it: Google is a great resource for finding cached copies of content that has since been removed, whether because of copyright infringement or whatever else made the webmaster remove it.

But a webmaster who is smart enough to think of this can disallow robots from caching the page, which is a very quick procedure.

You just need to add a meta tag in the <head> section of each web page you do not want cached:

<meta name="robots" content="noarchive">

That would be just about it. :)
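To check that the tag actually made it into a page, something like the following could be used. It is only a minimal sketch built on Python's standard `html.parser`; the class name `RobotsMetaParser`, the function `has_noarchive`, and its `bot` parameter are made up for illustration, and the sketch only looks at meta tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="..." content="..."> tag."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # lowercased meta name -> set of lowercased directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {key: (value or "") for key, value in attrs}
        name = attr_map.get("name", "").strip().lower()
        if not name:
            return
        # content is a comma-separated list such as "noindex, noarchive"
        tokens = {t.strip().lower() for t in attr_map.get("content", "").split(",")}
        tokens.discard("")
        self.directives.setdefault(name, set()).update(tokens)


def has_noarchive(html, bot="robots"):
    """Return True if the page has a <meta name=bot> tag containing noarchive.

    `bot` is a hypothetical convenience parameter: "robots" checks the
    generic tag, any other value checks a tag with that name instead.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives.get(bot.lower(), set())


page = '<html><head><meta name="robots" content="noarchive"></head><body>Hi</body></html>'
print(has_noarchive(page))
```

The check is case-insensitive and tolerates other directives in the same tag, so `content="noindex, NOARCHIVE"` would count as well.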

#3 Phoenix.Illusion

    Premium Member

  • Kontributors
  • 153 posts

Posted 04 April 2009 - 10:37 AM

@miladinoski, thanks.
Are you sure that's the one?
I will try it.

- Dark

#4 nirmaldaniel

Posted 04 April 2009 - 12:03 PM

So, miladinoski, that's it? Is it as simple as that? If so, and if everyone does that, I guess the Google spider will have no place left to crawl, right?

#5 miladinoski

Posted 04 April 2009 - 12:20 PM

nirmaldaniel, on Apr 4 2009, 01:03 PM, said:

"So, miladinoski, that's it? Is it as simple as that? If so, and if everyone does that, I guess the Google spider will have no place left to crawl, right?"

No, the Google spider and others like Yahoo! Slurp or MSN's crawler will still crawl your web page, but they won't cache its content. Your page will still show up in the search results, but the "Cached" link won't.
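To make the difference between crawling and caching concrete: `noarchive` only suppresses the cached copy. Keeping a crawler out of part of a site altogether is normally done with a `robots.txt` file, which is a separate mechanism not covered in this thread. The sketch below uses Python's standard `urllib.robotparser` to show how such rules are read; the rules, the `/private/` path, and the example.com URLs are all made up for illustration, and the helper name `crawl_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /private/, allow everything else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/
"""


def crawl_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent, url in [
    ("Googlebot", "http://example.com/private/data.html"),
    ("Googlebot", "http://example.com/index.html"),
    ("Slurp", "http://example.com/private/data.html"),
]:
    print(agent, url, crawl_allowed(ROBOTS_TXT, agent, url))
```

Note that `robots.txt` is a request that well-behaved crawlers honor, not an access control, so truly sensitive data should not rely on it alone.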

#6 zakaluka

    Advanced Member

  • Kontributors
  • 129 posts

Posted 10 April 2009 - 07:56 AM

If you only want to stop Google's bots from caching your site, you can replace the above line with:

<meta name="googlebot" content="noarchive">

Regards,

z.

#7 nirmaldaniel

Posted 02 May 2009 - 10:54 AM

Does anyone have any idea when Googlebot crawls a website? To be clear, I want to know whether it crawls at random, at regular intervals, or when traffic to the site is low.

I just want to know when the bot crawls and how much bandwidth these bots use when they crawl a site. In particular, how much bandwidth does Googlebot consume when it crawls a page and takes that snapshot?
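Nobody in this thread gives figures for this, but it can be measured on your own site from the server's access log. The following is a rough sketch, assuming the log is in the common "combined" format used by Apache and Nginx and that the bot identifies itself with "Googlebot" in its user-agent string. The function name `summarize_bot_traffic` and the sample log lines are made up for illustration; the byte counts are response bodies only, as logged by the server.

```python
import re
from collections import Counter

# Combined log format: ... [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) '
    r'(?P<size>\d+|-) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_bot_traffic(lines, bot_token="Googlebot"):
    """Return (hits, total_bytes, hits_per_hour) for requests whose
    user-agent contains bot_token. Lines that do not parse are skipped."""
    hits = 0
    total_bytes = 0
    hits_per_hour = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None or bot_token.lower() not in match["agent"].lower():
            continue
        hits += 1
        if match["size"] != "-":  # "-" means no response body was sent
            total_bytes += int(match["size"])
        # time looks like 02/May/2009:03:15:42 +0000; the hour follows the first colon
        hits_per_hour[match["time"].split(":")[1]] += 1
    return hits, total_bytes, hits_per_hour


# Made-up sample lines; a real run would read them from the access log file.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'192.0.2.10 - - [02/May/2009:03:15:42 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.10 - - [02/May/2009:03:15:44 +0000] "GET /about.html HTTP/1.1" 200 2048 "-" "{GOOGLEBOT_UA}"',
    '192.0.2.77 - - [02/May/2009:10:54:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"',
    f'192.0.2.10 - - [02/May/2009:14:01:09 +0000] "GET /old.html HTTP/1.1" 304 - "-" "{GOOGLEBOT_UA}"',
]

hits, total_bytes, per_hour = summarize_bot_traffic(SAMPLE_LOG)
print(hits, total_bytes, dict(per_hour))
```

Running this over a few weeks of real logs would show at which hours the bot tends to visit and roughly how many bytes it downloads. The user-agent string can be faked by other clients, so the numbers are an estimate.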



