GDDiagnosisPage

    Keywords: Online Go

GODISCUSSIONS IS BACK UP AS OF FEB 25TH

GoDiscussions is currently crippled with database or server problems. If you have any idea why, please post here. (To discuss rebuilding or relocating it, see the GDRecoveryPage.)


  • 11-5-09 Terr first mentions problems.
  • On 11-5-09 Ross wrote:
    • There were a couple of hosed database tables that I was able to ssh in and repair (that seemed to be the reason for this morning's outage). Even now, though, like you I'm seeing weird 503 outages but they come and go (which makes me think it's not a database problem). Not really sure what's up ...
  • On 11-24-09 Robert Jasiek contributed the following data:
    • There is a strong empirical relation between type of user action and frequency of errors. Therefore it must be mainly a software problem.
    • Error probabilities:
    • >99.9%: In an Invalid Link error message, clicking on Click Here (or whatever the exact words are).
    • >90%: Having edited a reply for more than 20min and clicking Send.
    • >50%: During the last few days, clicking anything and getting Service Temporarily Unavailable.
    • >50%: Having edited a reply for more than 5min and clicking Send.
    • >20%: Having edited a reply for just a few seconds and clicking Send.
  • On 11-24-09, Kex wrote:
    • Web pages are typically downloaded a few (or more) files at a time. E.g., Apache has the parameter MaxKeepAliveRequests, which says how many files can be downloaded before the server shuts down the connection. If this limit is exceeded, the browser sets up a new connection to download more of the elements on the page.
    • Now, my guess is that:
      • The maximum number of requests per connection for this server is pretty low
      • There is some kind of a rate limiting function on the SP's server that prevents too many downloads for a given service level bought by the site owner. This rate limiting function then refuses to serve many of the requests, leading to this behaviour.
  • Maenashi then questioned:
    • If it was rate limiting, why would it let you see the page after hitting it with a few requests? The net result is expending more bandwidth for the same data to be transferred.
    • It might still be a resources problem, but it seems rather strange to have it behave exactly this way as a way to limit bandwidth use.
  • To which Sumiyaka responded:
    • Because it's rate limiting for a given period of time. I can't remember exactly, but I think it's files per second in the Apache config. This could be part of the 503s?
    • In kex's guess it's rate limiting for a given request. The reason it would pop up after reloads is because stuff will get cached between your reloads... and therefore not requested on subsequent visits.
    • My guesses:
      • My first thought of a dying disk probably isn't right or we would have a dead server by now.
      • My second thought... we have been rooted, probably isn't right or google might have us flagged (again) by now (still not ruling this completely out)
    • kex's guess sounds very good. There's a good chance it is a resource cap.
  • Still on 11-24-09, Fwiffo wrote:
    • I don't think it's the forum software (because 503 errors are occurring on static files like images too), and it's not likely (but not impossible) to be a resource cap. But from this end it's all speculation. A quick walk around /var/log would answer the question with actual facts.
  • Koch noted:
    • GD is hosted on one of GoDaddy's budget deals, which means it's very likely a hosting issue. Shared servers on these low-end accounts are notorious for these kinds of issues.
  • On 11-25-09, Ferl suggested:
    • ...maybe you should play the Malkovich games on OGS or something...
  • To which JoazBanbeck replied:
    • It is not the Malkovich games that are causing this. Consider the following timeline:
      • Vap and I played 73 moves from May 3rd to June 4th. That is an average of 2+ moves per day. We had so many viewers that we quickly became the most popular thread in the beginner's forum. There were no 503 errors.
      • Sol and I started on Sept 8th, 2009. Initially, when Sol was not going to school, sometimes we played 3 or 4 moves per day. We had so many viewers that our game surpassed the first one and became the most popular thread in the beginner's forum. During that time, there were no errors.
      • Will and Aphelion started the third game on Sept 25th. Sol and I continued to play. This went on for more than a month. Even with 2 games going, there were no errors.
      • But the errors began on November 5th.
      • The fourth game started on November 6th, AFTER the errors began.
      • The fifth and sixth games obviously started afterwards also.
  • On 11-28-09, Koch continued from his previous post:
    • Hosting companies can, and often do, choose to throttle back the bandwidth usage of a domain that is using more than its "fair share". This is very common on low-end/budget plans that utilize shared servers. Timing is not necessarily a good indicator of this. Hosting companies normally will not throttle a domain's bandwidth immediately but wait for a certain period of time before throttling, i.e., after it becomes clear to them that a site's high bandwidth usage is not just a short-term spike.
  • On 11-29-09, Kirkmc countered:
    • I highly doubt that it's a bandwidth issue. I have four domains hosted with a cheap hosting service, and I have unlimited bandwidth. If Don does indeed have bandwidth limits, he should change servers, but I doubt that's the problem. When I've seen servers that do have such limits, I see errors saying something to the effect that the site has gone over quota.
  • Wildclaw then responded:
    • It could be some kind of concurrent connection limit to the server or database. However, the thing that speaks against that is that the error appears even when there aren't that many people reading the forum. Also, it is important to remember that all of this didn't start randomly. On the fifth of November, GoDiscussions had some real problems.
    • And recalling Ross' earlier comments:
      • There were a couple of hosed database tables that I was able to ssh in and repair (that seemed to be the reason for this morning's outage)...
    • He concluded that...this...highly reduces the probability that all of this is some kind of ordinary "hosting cap" problem.
  • Koch continued on the "unlimited bandwidth" issue:
    • He suggested that we look at http://www.thinkhost.com/services/nt...ng-traps.shtml
    • and quoted some of it for us:
      • Bandwidth "throttling" and load balancing:
      • It is important to ascertain whether your host will decrease the availability of your bandwidth based on server activity. This isn't a decrease in your quota itself, but a "slowing" down of how fast this bandwidth can be used. Bandwidth throttling can result in your entire web site being slow to load. Excess throttling can mean that some of your visitors may not be able to access the site.
  • On 12-03-09, Ethanb replied:
    • It cannot possibly be a bandwidth issue unless the database and webserver are on different boxes and the hosting company charges for bandwidth of data transferred between the two. Which is quite unlikely.
    • 503 is an error in the code on the site (as with all of the 500-series errors); most likely database access issues.
    • Hosting providers don't throttle bandwidth - you're thinking of Comcast. If the bandwidth quota were reached we would be getting server timeout issues, because there wouldn't be a route to the host.
  • On 12-5-09, Kirkmc added:
    • Of course it's not a bandwidth issue; this is a forum, and we're in 2009. Most hosts don't limit bandwidth or throttle bandwidth, as some posters said, unless you really get up into high numbers. Forums, even with the graphics this one has, are very low on the bandwidth scale. The problems here are deeper, most likely something to do with access to the database. It doesn't seem like it's bad code, because if it were the problems would be reproducible. When I get a page that doesn't load, it often works when I reload. However, some of the graphics are missing often, which could suggest issues other than just with the database.
  • Your comment here???
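
Fwiffo's suggestion of looking in /var/log can be made concrete with a short script. The following is only a minimal sketch, assuming the server writes an Apache-style access log in Common or Combined Log Format. The sample lines, addresses, paths, and the list of static file suffixes are made up for illustration and do not come from the GoDiscussions server. It counts 503 responses per hour and splits them into static files (images, CSS) and dynamic pages, which is the distinction Fwiffo drew when arguing that the forum software is not to blame.

```python
import re
from collections import Counter

# Assumed format: Apache Common/Combined Log Format, for example
# 192.0.2.1 - - [24/Nov/2009:10:15:32 -0500] "GET /index.php HTTP/1.1" 503 323
LOG_LINE = re.compile(
    r'\[(?P<day>[^:]+):(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"(?:\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)

# Hypothetical list of suffixes treated as static content.
STATIC_SUFFIXES = (".gif", ".png", ".jpg", ".jpeg", ".css", ".js", ".ico")


def summarize_503s(lines):
    """Count 503 responses per hour and per resource kind (static/dynamic)."""
    per_hour = Counter()
    per_kind = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None or match.group("status") != "503":
            continue  # skip unparseable lines and non-503 responses
        per_hour[f'{match.group("day")} {match.group("hour")}:00'] += 1
        # Drop any query string before checking the file suffix.
        path = match.group("path").split("?", 1)[0].lower()
        kind = "static" if path.endswith(STATIC_SUFFIXES) else "dynamic"
        per_kind[kind] += 1
    return per_hour, per_kind


# Made-up sample lines, not real GoDiscussions log data.
SAMPLE = [
    '192.0.2.1 - - [24/Nov/2009:10:15:32 -0500] "GET /forums/index.php HTTP/1.1" 503 323',
    '192.0.2.1 - - [24/Nov/2009:10:15:33 -0500] "GET /forums/images/logo.gif HTTP/1.1" 503 323',
    '192.0.2.1 - - [24/Nov/2009:11:02:10 -0500] "GET /forums/showthread.php?t=1 HTTP/1.1" 200 18231',
]

if __name__ == "__main__":
    hours, kinds = summarize_503s(SAMPLE)
    print(dict(hours))
    print(dict(kinds))
```

On a real server, the same function could be fed an open log file instead of SAMPLE. A large share of static 503s would support the view that the web server or host, rather than the forum code or database, is refusing requests. A spike that begins on 11-5-09 would tie the errors to the original outage.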

This is a copy of the living page "GDDiagnosisPage" at Sensei's Library.
(OC) 2011 the Authors, published under the OpenContent License V1.0.