
Steam sale and downloading issues prompt CDN expansion

Over the past 10 days we've had widespread reports of downloading issues on the sites, which have gone hand-in-hand with the annual Steam summer sale promotion that sees games massively discounted on Valve's gaming platform. These downloading issues were caused, simply put, by the fact that every single one of our 20 download servers was filled to capacity with people trying to download.

If our registration statistics are anything to go by, this year's summer sale was the most successful one yet for Steam. Over the past ten days we've averaged 8,200 new registrations a day, including a new Nexus record of 14,505 new members in a single day, beating the previous record of 13,570 new members set on November 26th 2011, just a couple of weeks after Skyrim's launch. Typically the Nexus averages 3,500 - 4,500 new registrations a day when something special isn't going on.

When you have a huge influx of new members in a short space of time, it has quite a detrimental effect on the file servers. While you can typically only browse the site one page/tab at a time, which helps us conserve resources on the web servers, you can have many downloads running at any one time. The inherent problem with a huge influx of new people is that their downloading habits are different to those of "regular" users. As a new user you want to download a lot of mods all at once. You'll go through the top 100, look up "best mod" lists on the internet and try to download as many as possible. As a "regular" user you've already done this, your mod list is pretty set, and you're now browsing the Nexus to see what's new, perhaps only downloading one or two new files a day to augment your current mod list. So having a huge influx of 14,000 new users in a day is like adding an extra million regular users to the site overnight for a short period.

The result was 20 file servers each serving 400 concurrent downloads, which meant that during the Steam sale we were serving 8,000 concurrent file downloads at any given second and maxing out a 10Gbit line. That number would likely have been much higher if it weren't for the hard connection limits we've set on the servers. Hopefully you can appreciate that's a lot, and the infrastructure you need to handle it has to be extremely powerful and resilient. While our file server infrastructure is powerful, it's designed to handle around 6,000 concurrent downloads, and we average around 4,000-5,000 on a normal, non-Steam sale day.
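
For anyone who wants to check the numbers, here is a minimal sketch of the arithmetic above. The server count, per-server limit and design capacity are the figures quoted in this post; the function and variable names are made up for illustration, and the per-download rate assumes the 10Gbit line is the total shared by all downloads, which is a guess rather than a stated fact.

```python
# Figures quoted in the post; names are illustrative only.
FILE_SERVERS = 20
DOWNLOADS_PER_SERVER = 400         # concurrent downloads each server was serving
DESIGN_CAPACITY = 6_000            # concurrent downloads the setup is designed for
LINE_CAPACITY_BITS = 10 * 10**9    # ASSUMPTION: the 10Gbit line is the aggregate link


def peak_concurrent_downloads(servers: int, per_server: int) -> int:
    """Total concurrent downloads when every server is at its limit."""
    return servers * per_server


def overload_ratio(peak: int, design: int) -> float:
    """How far the peak exceeds the designed capacity (1.0 means exactly at capacity)."""
    return peak / design


def average_rate_kb_per_s(line_bits: int, downloads: int) -> float:
    """Average kilobytes per second per download if the line is shared evenly."""
    return line_bits / downloads / 8 / 1000


peak = peak_concurrent_downloads(FILE_SERVERS, DOWNLOADS_PER_SERVER)
print(peak)                                               # 8000
print(round(overload_ratio(peak, DESIGN_CAPACITY), 2))    # 1.33
print(average_rate_kb_per_s(LINE_CAPACITY_BITS, peak))    # 156.25
```

Under that assumption, a saturated line split evenly works out to roughly 156 kB/s per download, about a third over what the setup was designed for.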


Question: Why has this only become an issue now?

Aha, here's a silver lining (ahem). The reason this is the first time we've maxed out our file servers is that this is the first time our web servers (the servers we use just to display the sites) have held up under all this traffic. Secretly (ahem), we're patting ourselves on the back that the sites themselves were accessible for practically the entire Steam sale week, which means our new Cloud setup and centralised database cluster are finally working. We're obviously not happy about the file server setup, so we're working to sort it out.


Question: Why weren't you more prepared?

I thought we were :)

Back in January I posted that we had completely decommissioned our file server setup and we were moving from a 15 standard download server setup to a 20 standard download server setup, an increase in capacity of 33%. The inherent problem was that, because our web servers always used to fail before the file servers did, we'd never thoroughly tested our file setup under extreme load conditions. Now that the web servers are up to scratch and holding up under these conditions, the file servers are taking on a lot more load. And so now we can react.


Question: Why didn't you just buy more servers when the Steam sale started and it became apparent the load was too much?

The file servers we need can't just be requisitioned overnight. They need to be ordered, delivered, plugged in and have all the firmware and updates applied before we can even get the entire file database copied on to the drives. That takes time, more time than the Steam sale was going to last.

Picture the situation like a huge rock festival (let's take Glastonbury, as it's only just finished) that comes to a very small town (population just under 9,000) in England once a year. For 361 days of the year the local road infrastructure is completely fine, but for 4 days a year, when the Glastonbury festival sets up in nearby fields, the roads are completely choked with cars and the local residents can barely get out of their own town. Is it prudent for the local council to build an 8 lane highway to support a 3rd party event that may or may not happen from year to year and that will only be used for 4 days of the year? I think not. In a similar vein, we'd be talking an extra $5,000 expense each month, minimum, to accommodate an event that happens once or twice a year. We can't just say to our server provider "we want these servers during November/December and June/July but for the rest of the year we don't want them". Contracts have to be signed and so on and so forth.
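
To put rough numbers on that trade-off, here is a small sketch. Only the $5,000 monthly figure comes from this post. The two events per year (the upper end of "once or twice") and the ten peak days per event (loosely based on this sale lasting about ten days) are assumptions for illustration, as are all the names.

```python
EXTRA_MONTHLY_COST_USD = 5_000   # minimum extra expense quoted in the post
PEAK_EVENTS_PER_YEAR = 2         # ASSUMPTION: upper end of "once or twice a year"
PEAK_DAYS_PER_EVENT = 10         # ASSUMPTION: roughly the length of this sale


def annual_cost(monthly_cost: int) -> int:
    """Yearly cost of keeping the extra servers under contract all year."""
    return monthly_cost * 12


def cost_per_peak_day(monthly_cost: int, events: int, days_per_event: int) -> float:
    """Yearly cost spread over only the days the extra capacity is needed."""
    return annual_cost(monthly_cost) / (events * days_per_event)


def share_of_year_needed(events: int, days_per_event: int) -> float:
    """Fraction of the year during which the extra servers would be used."""
    return events * days_per_event / 365


print(annual_cost(EXTRA_MONTHLY_COST_USD))                                   # 60000
print(cost_per_peak_day(EXTRA_MONTHLY_COST_USD,
                        PEAK_EVENTS_PER_YEAR, PEAK_DAYS_PER_EVENT))          # 3000.0
print(round(share_of_year_needed(PEAK_EVENTS_PER_YEAR,
                                 PEAK_DAYS_PER_EVENT) * 100, 1))             # 5.5
```

With those assumptions, the extra servers would cost $60,000 a year and sit idle for roughly 95% of it.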


Question: So what are you going to do about it?

Last year we spent considerable time, effort and money to sort out our web server situation and we moved to a much more flexible cloud and cluster setup. This has worked. It now makes sense that we continue those efforts and bring our file servers in line with the cloud ethos.

We're currently in talks with a big CDN service, who already partner with big video game players like Steam, CCP and Wargaming, to get rid of our current dedicated file server setup and move our entire file serving efforts on to a CDN.

If you don't know what a CDN is I won't bore you by going into detail about what it is (a simple Google search will surely enlighten you!), but I will bullet some key advantages it will have over our current setup:

  • Flexibility and scalability. There's practically no limit to the resources we can use and there's no time delay in making use of them, which means no bottlenecks. We contract for a set amount of usage and any overage due to one-off events, like a Steam sale, is charged at a standard and competitive rate.
  • Less administration and more secure. Maintaining 27 file servers (20 normal, 3 Premium, 4 static content) is a huge undertaking that requires a lot of server administration to keep up-to-date and secure. Moving to a CDN places this responsibility in the hands of a team of qualified individuals who are much better suited for the job, freeing us up to worry less and spend less time on this issue.
  • Increased performance and localisation. We currently have 14 download servers in the US and 6 download servers in the UK, but the Nexus has a global reach with many users from South America, Asia and Oceania. CDNs have data centres distributed across the globe, which should ensure you really will max out your connection when downloading from our servers, hopefully irrespective of where you are in the world.
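
The "set amount of usage plus overage" model in the first bullet can be sketched roughly as follows. Every number and name here is hypothetical and chosen only to show the shape of the calculation. None of them are real quotes, and an actual CDN contract may well be structured differently.

```python
def monthly_cdn_bill(committed_fee: float, committed_tb: float,
                     used_tb: float, overage_rate_per_tb: float) -> float:
    """Flat fee for the committed usage, plus a per-TB charge for anything above it."""
    overage_tb = max(0.0, used_tb - committed_tb)
    return committed_fee + overage_tb * overage_rate_per_tb


# HYPOTHETICAL figures: $2,000 for a 100 TB commitment, $15 per extra TB.
normal_month = monthly_cdn_bill(2000.0, 100.0, 90.0, 15.0)
sale_month = monthly_cdn_bill(2000.0, 100.0, 130.0, 15.0)

print(normal_month)  # 2000.0
print(sale_month)    # 2450.0
```

The point of the model is that a one-off spike only costs extra in the month it happens, rather than requiring extra servers to be contracted all year.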



Question: It sounds good, so why haven't you done this in the past?

Partly because it wasn't necessary and partly because it costs more: between 30% and 70% more than our current dedicated file server setup, depending on how much bandwidth we use. We've come to the realisation from our work on the cloud and cluster setup that this really has to be the future for us, and the added cost, although tough, is necessary to secure the future of the sites. We need to be able to move fast during these sorts of situations, which is something we cannot do with a dedicated server setup.


Question: When?

As soon as possible. We're testing out the feasibility of the CDN for our setup as I'm typing this.

159 comments

Comments locked

A moderator has closed this comment topic for the time being
  1. SeRidicaLup
    • premium
    • 24 kudos
    I'm getting the same problem as SkyrimJunkie5577. My files download fine (either using NMM or Manual) until the last 1%. Then it seems to lock up and can't finish. If I try to resume, the file is corrupted. This is true for any file on Nexus Skyrim. Sometimes if I keep doing it over and over again, it will work. But it's like trying to do something if you tilt your head just right and hold your tongue just so LOL.

    I am running Win 8.1, no Google Chrome, using IE. All updates are current. Popup blockers turned off for this site.

    Not sure what the problem is. Been having these issues since the summer. Can't remember if it started at a certain NMM update, or since I got win 8.1
  2. SkyrimJunkie5577
    • member
    • 0 kudos
    I've downloaded many mods, over 150, with no problem. The last few days I keep getting 
    Error trying to get the file: http://filedelivery.nexusmods.com/110/1000096161/Harvest Overhaul 2_8_2_0-16553-2-8-2-0.7z?ttl=1417401229&ri=8192&rs=8192&setec=9487e67c89ee2ba80081786999e436cb
     
    for every mod I try downloading. Even manually downloading just takes me to a blank page. I tried logging out and back in and refreshing the page. Not sure what else to try or if this is even the right place to ask about it.
     
    http://filedelivery.nexusmods.com/110/1000096161/Harvest%20Overhaul%202_8_2_0-16553-2-8-2-0.7z?ttl=1417401846&ri=8192&rs=8192&setec=6b2bea89711b500ee574e8f0d6487d98
    Is what I get after I click download manually.
  3. wretchedfetcher
    • member
    • 1 kudos
    Thank you very much for all your hard work.
  4. dalec
    • premium
    • 8 kudos
    Thanks for the update, and glad to see it's all sorted. And great to see the Nexus has grown hugely.
    More people now understand that PC games ARE hugely supported by the PC community, who also give the games a much longer and more fun life.
  5. yangjiehui
    • member
    • 0 kudos
    Thanks for the information.
  6. bben46
    • premium
    • 781 kudos
    I have seen one post claiming that changing their DNS server made a huge difference. Not sure I understand why, but it might be worth testing.
    Instructions on how to do that here: https://developers.google.com/speed/public-dns/docs/using#setup
     
    EDIT: This has since been debunked. I got back about 20 replies - it may have worked for 2 of those - and not worked for the other 18 or so.
  7. GreenCultist
    • BANNED
    • 0 kudos
    By far this is the best update ever. Monster Mod and Monster Wars for Skyrim downloaded in 10 minutes, unlike before, when it either stopped downloading or took 1 to 2 hours...
  8. Ohpus
    • supporter
    • 3 kudos


     
    In response to post #16822349. #16823579, #16851379, #17267334 are all replies on the same post.

    tidus - there are lots of us with decent broadband connections who are having download issues, from interruptions through to "server unreachable" messages. I am so pleased to hear you are perhaps the only one that does not suffer it.

    I just started re-modding Skyrim and am going from 100 to 250 mods. The unreachable servers definitely prompted me to contribute. For how long depends on a number of factors, but primarily on whether this actually fixes the problem.
     
    BlueGunk, I found that manual installation is the way to go anyways with some mods being taken down or files becoming unavailable due to modders being banned. But that is a subject for another thread.
     
    In addition large-scale modding of Skyrim rarely works smoothly the first time. I am on my second Skyrim reinstall and my second mod "clean wipe" reinstall.
  9. Slavemaster4u
    • premium
    • 16 kudos
    On July 13 people were complaining about download speeds. Now it is July 23. Still can't reliably download anything. As for now being able to choose a server: 1) Before the server changeover I was downloading from a server in San Jose, which is a 1 hour drive from my home. That server isn't even in the list of servers. Busy or available. 2) The only choices I'm given are Salt Lake or England. And of those, the only time I'm able to complete a download is before work, around 4am PST. Even then a 127mb file takes about 1 hour to download. Which is OK because it gives me time to sh*t, shower, shave, and eat breakfast. I have 11 mods needing update. Should be able to get them all by Aug. if all goes as it is.
    1. tidus87lion
      • supporter
      • 0 kudos
      I have perfectly fine speed so I don't know what you're complaining about. Perhaps instead of blaming Nexus you should get a stronger internet connection and not download everything at once.
    2. Slavemaster4u
      • premium
      • 16 kudos
      #1) I'm hardwired to a high speed 5mb/s Comcast connection. #2) I'm downloading a single mod at a time. Trying to get a total of 11 mods. And most of these mods are updates to mods I already have. #3) The total size of all 11 mods is 1.4gb. the largest is 127mb.
      But to update my above statement. As of this morning I downloaded the 127mb mod and the speed was considerably faster. Was getting average speed of 210kb/s. Not as good as before, but much better than 33-60kb/s before this morning. And the finished file was not corrupted.
      In closing, to tidus87lion. A wise man remains silent, keeping everyone guessing if he's ignorant, while a fool speaks loudly, leaving no doubt to anyone, of his ignorance.
    3. Fallerent
      • premium
      • 1 kudos
      I don't assume you have poor internet connection or are using servers irresponsibly, but god d*mn, give the Nexus a break. They don't exist to meet your demands, I assure you that if they could keep everybody happy all the time they would. The capabilities of a team's hardware are limited and they aren't going to spend increased amounts of money because you said so. This would result in the Nexus becoming financially unprofitable and being closed or sold. In closing, to Slavemaster4u, don't bite the hand that feeds you.
    4. BlueGunk
      • premium
      • 104 kudos
      tidus - there are lots of us with decent broadband connections who are having download issues, from interruptions through to "server unreachable" messages. I am so pleased to hear you are perhaps the only one that does not suffer it.
  10. thetorturedbodysoul
    • premium
    • 3 kudos
    Does anyone else seem to be having issues with even the NMM interrupting downloads frequently, causing a constant need to resume them? I've been trying on and off to download some large files, named NMCs large texture packs 1-3. I only try downloading one at a time, but where normally I would download at 100-200kbs, I can only get 75kbs tops now, and normally only about 45-70kbs. Add this to the frequent interrupts, even through the NMM, which occasionally cause the download to corrupt and stop permanently, and trying to download anything that is 1gb in size, such as the texture packs, is just not feasible. I also notice that when trying to resume a download after it has been interrupted with an error, I quite often get messages saying that 'server is busy/unavailable...retrying'. Just wondered if this was my crappy DSL internet, or if others were experiencing this and whether it is related to the issue with the Nexus servers?
    1. rimere
      • member
      • 1 kudos
      I'm getting the same kind of problem. Although with Skyrim mods it isn't so bad (I get pretty stable downloads, although with larger files there are a lot of times the download stops and I have to manually resume it,) with Fallout NV mods I get, at tops, <15kb/s of download speed with larger files. Right now I'm trying to download FCO and I'm getting 9kb/s. Nine. And it keeps stopping.

      I'm really relieved to see that the issue is (hopefully) going to be eliminated soon with the move to CDN servers.
    2. SgtJenz
      • premium
      • 81 kudos
      Same problem. Resorted to "Download Manually" which seems to work ok.
    3. BlueGunk
      • premium
      • 104 kudos
      Yep - I often have to click to re-jig the downloads (Skyrim). More frustratingly I am now frequently receiving "server unreachable" returns. So much so I am downloading manually and installing from my own folders.