Yahoo?

Please post any questions, comments, concerns, issues, bugs, or feature requests about the site/forums/server here.
Chris
Posts: 13515
Joined: Sat Jan 06, 2001 12:00 am
Location: Northern AB, CA, turn left Alaska, Turn right, Yukon Territories

Post by Chris »

Hi Philip
I was curious: how much of an impact does the Yahoo! Slurp spider have on SG's bandwidth and on the server, given all the database hits?
There are around 350 Yahoo! Slurp spider "guests" on the site at most times.
Does this increased spider traffic affect your bandwidth costs, or mean the server needs an upgrade to deal with it?
SpareX
Advanced Member
Posts: 642
Joined: Sun May 27, 2001 1:19 pm
Location: Its chilly here...

Post by SpareX »

Holy moses.... Yahoo is just flooding SG....

Other spiders found:
1 MSNbot
4 Google Spiders
1 Google AdSense Spider

So out of the current 400ish guests, about 385 are Yahoo right now (give or take 10 or so, depending on how many normal guests are on).
Beer, Pretzels, and a Monkey with a shotgun. Dare I ask for more?

SYSTEM A :thumb: / system b :sleep:
INTEL P4 3.06Ghz / intel p4 2.53Ghz
INTEL D865PERL / intel d845pebt2
1GB DDR3200 OCZ RAM / 512MB 2700 ram
ATI RADEON 9800 PRO 128/ pny ti4400
SEAGATE SATA 7200 80Gb / maxtor 7200 100Gb
SB AUDIGY GAMER/ onboard sound
XP Pro on both :thumb:
Philip
SG VIP
Posts: 11761
Joined: Sat May 08, 1999 5:00 am
Location: Jacksonville, Florida

Post by Philip »

Since the beginning of the year, hits by spiders have accounted for 25-30% of all page requests throughout the site. At the same time, they only account for ~5-6% of the bandwidth, since they are text-only and do not load any images. There is some impact on the server for sure. It's just a guesstimate, but I'd say in the neighborhood of 25% additional load to query the DB, process pages, gzip, etc.
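
As a rough sanity check on those figures, here is a small Python sketch of what they imply about the bandwidth cost of a spider page request compared with a regular visitor's page view. The function name and the per-request comparison are illustrative assumptions, not anything measured on the server. The inputs are simply the low and high ends of the percentages quoted above.

```python
def spider_to_visitor_weight(request_share: float, bandwidth_share: float) -> float:
    """Average bandwidth per spider page request, relative to a non-spider one.

    Both arguments are fractions of the site total (0.25 means 25%).
    """
    # Bandwidth share divided by request share gives relative bytes per request.
    spider_per_request = bandwidth_share / request_share
    other_per_request = (1 - bandwidth_share) / (1 - request_share)
    return spider_per_request / other_per_request


# Low and high ends of the quoted ranges (25-30% of requests, ~5-6% of bandwidth).
low = spider_to_visitor_weight(0.25, 0.05)
high = spider_to_visitor_weight(0.30, 0.06)
print(f"{low:.2f} {high:.2f}")
```

Under those assumptions, a spider page request costs on the order of 15% of the bandwidth of a regular page view, which fits with spiders fetching text only and skipping images.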

There are ways to limit the speed at which they crawl the site: robots.txt, for example (you can set a maximum number of hits per time period, for example 1 hit per 3 seconds), mod_rewrite, etc. However, limiting them simply slows their indexing of the site, which in turn can hurt search engine rankings.
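
To make the robots.txt option concrete, here is a minimal sketch. It assumes the limit is expressed with the non-standard `Crawl-delay` directive, which Yahoo's Slurp honors but not every crawler does. The robots.txt contents and the helper names are hypothetical and are not SG's actual configuration. Python's standard `urllib.robotparser` is used only to read the value back and confirm that the file says what is intended.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask Slurp to wait 3 seconds between requests,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 3

User-agent: *
Disallow:
"""


def crawl_delay_for(user_agent: str, robots_txt: str = ROBOTS_TXT):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def max_requests_per_day(delay_seconds: int) -> int:
    """Upper bound on daily requests from one crawler that respects the delay."""
    return 86_400 // delay_seconds


print(crawl_delay_for("Slurp"))       # delay requested for Yahoo's crawler
print(crawl_delay_for("Googlebot"))   # no delay set for other crawlers
print(max_requests_per_day(3))
```

With a 3 second delay, a single well-behaved crawler tops out at 28,800 requests per day. The directive is only a request, so a crawler that ignores it would have to be throttled on the server side instead.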

It is a necessary evil I suppose :)
Disclaimer: Please use caution when opening messages, my grasp on reality may have shaken loose during transmission (going on rusty memory circuits), even though my tin foil hat is regularly audited for potential supply chain tampering. I also eat whatever crayons are put in front of me.
๑۩۞۩๑
Chris
Posts: 13515
Joined: Sat Jan 06, 2001 12:00 am
Location: Northern AB, CA, turn left Alaska, Turn right, Yukon Territories

Post by Chris »

Philip wrote:
It is a necessary evil I suppose :)
But from what I've read, the only ones who will benefit from this are companies that are willing to pay Yahoo for preferred status.
I don't know how true that is, but it makes you wonder whether you will even get listed if you don't pay.
Right now the service is free, but I read that in a couple of months they will move to paid inclusion.
Seems like an awful lot of additional server load and bandwidth for little to no payback, unless you pay Yahoo.
YARDofSTUF
Posts: 70006
Joined: Sat Nov 11, 2000 12:00 am
Location: USA

Post by YARDofSTUF »

I think paid sites just get listed first.
Philip
SG VIP
Posts: 11761
Joined: Sat May 08, 1999 5:00 am
Location: Jacksonville, Florida

Post by Philip »

A Yahoo search for "speedguide" shows about 780,000 results (note that not all results are from our domain):
http://search.yahoo.com/search?p=speedguide&fr=FP-tab-web-t-334&toggle=1&cop=&ei=UTF-8

Yahoo is actually the second highest referring site after Google.