Archive for January, 2011
Summary of how the crash happened.
For those who are interested, I believe it is important that you understand fully what happened and how things will improve going forward.
- The software system and business infrastructure were purchased from the previous owners in late October 2010. The software system they built SPP on was a closed, proprietary content management system (CMS) built 5 years earlier. Because of its proprietary nature (i.e. not open source), no improvements were made to the software. During those 5 years, newer, more powerful and more reliable systems became popular that were also open source (meaning that they are continually updated).
- When we bought the system we knew that the software needed to be replaced, and that the best course of action was to migrate to a new software platform once we had figured out the clients’ requirements.
- On January 20th around 1 am EST, we were notified by the data center, DataPipe, that their software indicated that our hard drive would need replacing in the future. They strongly advised replacing it and reassured us that they could do that seamlessly in the middle of the night, as the drive was hot swappable. Hot swappable means that they could replace it while the computer was still running, because there are multiple drives set up (in a configuration called RAID) that allows one of the drives to fail while the system stays operable. Having plenty of experience in datacenters, I understood this to be a riskless procedure: when they take the one drive out, the other drives take over, and when the new drive is inserted it rebuilds on its own in a few hours. This is the redundancy particular to RAID hard drive arrays. Later, to my chagrin, I found out that the redundancy totally failed, as they were using RAID 1 (2 drives) when they had told me it was RAID 5 (3 drives).
- They phoned me later, at 5 am, and told me that the new drive they had put into the RAID array was not reading and that the drive controller was not recognizing it, so it could not rebuild. They also said that the corruption was on the remaining drive as well, that they were trying to recover it, and that they needed to shut the server down.
- They further informed me within the hour that they weren’t having any success. At that time I made the professional decision to move all the websites to a new platform on the same day, which is what caused the outage. I had originally planned on moving one site at a time, but this failure required us to move hundreds of sites at the same time, causing massive confusion.
- On January 21st the data center led me to believe that the data could be recovered and the old system put back online. Later in the day they informed us that they couldn’t recover the drives themselves and that the drives would be sent to a specialist data recovery firm. They apologized profusely for this. I then asked them for the backup files that I had asked them to make daily, and they informed me that there were no backups. Despite my request in October to set up a backup system, they had failed to do so.
- We have now restored all the websites based on the backup we performed manually in October. At this time we have website data only.
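For the technically curious, here is a minimal sketch of the redundancy idea described above, written in Python. It is only an illustration and not the data center's actual setup: the function names and the sample data are made up for this example. RAID 1 keeps a full copy of the data on each drive, while RAID 5 stores a parity block (the XOR of the data blocks) so that any one missing block can be recomputed from the others.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid1_read(mirrors):
    """RAID 1 (mirroring): return the first surviving full copy.

    A failed drive is represented by None (an assumption of this sketch).
    """
    for copy in mirrors:
        if copy is not None:
            return copy
    raise IOError("all mirrors lost")


def raid5_rebuild(stripe):
    """RAID 5 (parity): recompute the single missing block in a stripe.

    The stripe holds the data blocks plus one parity block; the missing
    block is None. XOR of all surviving blocks equals the missing one.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise IOError("RAID 5 can rebuild exactly one missing block per stripe")
    survivors = [block for block in stripe if block is not None]
    return xor_blocks(survivors)


if __name__ == "__main__":
    d0, d1 = b"site", b"mail"          # hypothetical data blocks
    parity = xor_blocks([d0, d1])      # parity block written to a third drive
    # Pretend the drive holding d1 was pulled; rebuild it from d0 and parity.
    print(raid5_rebuild([d0, None, parity]))
```

Note that either scheme only covers the loss of a whole drive. If the surviving drive already holds corrupted data, as happened here, the array has nothing good to rebuild from, which is why separate backups matter.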
What about Data – Email and Web?
- We have restored most of the websites and are fixing the last parts such as picture galleries, forms and some coding issues. This will take a week to 10 days to complete though most customers are already completely back up.
- The hard drive has been sent out to a data recovery firm that will do their best to recover the data up to the time it came down on January 20th – 2 am. We will do our best to recover your emails and files. We will inform you once this is done and what data can be recovered including emails and files.
What about the future of SiteProPlus?
Although there was business interruption which we wish never happened, we can assure you with 100% certainty that you are moving ahead with a far better setup than in the past. Here are the reasons:
- We are able to dedicate manpower to providing the best and most reliable technology solutions around. As we are focused on our content management solutions business, we stay away from proprietary software for general purpose websites, as it is not necessary. This will be beneficial and a huge improvement over what you had previously experienced.
- The system has now been migrated to a new datacenter with multiple forms of redundancy including near real-time backup solutions. I’ve been using their technology for 10 years without substantial downtime (greater than 1 day). They have offsite backups in case of fire and multiple forms of redundancy.
- We are moving websites over to the most reliable system around that will allow customers to do the following (unlike the previous system):
- Perform their own backups anytime they wish through their cPanel.
- Configure and have their choice of 3 different webmail options.
- Use a MUCH more reliable email system that never goes down, unlike the Qmail used before.
- Choose among multiple options for website editing, including direct FTP access, WYSIWYG editing and HTML editing.
- Configure everything themselves using cPanel technology.
Again we apologize for the business interruption and we will be in contact to make sure everything is back and improved.
Dear Valued Customer,
Thank you very much for your patience. All websites are back online.
Please note the following:
- If you are having issues with picture galleries or funny text please send us an email with the exact URL (page name) so we can fix it for you.
- If you require a new email to be set up, send us the email alias you want and the password. We’ll configure it and send you instructions on how to set it up in Outlook or use webmail.
- If you want to set up your own email, contact us for your cPanel username and password and we’ll send you simple instructions. You may also download it here.
If you are still having problems with email, note the following from our tech team:
Please note that your computer may still have old data stored in its memory. This data typically clears after a few days. Please DELETE any bookmarks/favorites you stored in your browser for access to the old system. An alternate way to log in is to type yourdomain.com/webmail in your address bar. You should get a pop-up window asking for your username and password. Be sure to enter your entire email address as the username, and the password provided in the password field.
Setting up and Editing your own Website.
To edit your own website you will need your username and password for cPanel. You can then download simple instructions here.
If you still have any problems, please contact us by email at email@example.com
The new email system is rock solid, unlike the one on the old server. If you want to access and control your email, download the attachment here. If you don’t have your cPanel access, send us an email at firstname.lastname@example.org and we will send you the information.
Thank you very much for your patience. We are doing our best as we do have limited hands for this huge undertaking.
- If you use the email with the domain, please make sure you have notified us of the email you wish to set up. We’ll send you further instructions on how to change passwords, access webmail, etc.
- We are restoring each site manually from backup. This process takes time, as it is not automated. If there is anything wrong with your website after it’s restored, please send us an email, as it will be much faster than phoning us. If you call me (1 888 256 3096) I will do my best to answer.
- Once we get the sites back up we’ll get in contact with you regarding instructions on accessing and editing the websites.
We will do our best to handle your inquiries in a timely manner. If possible, please email us first at email@example.com, and I will respond to your inquiries as soon as possible. Please don’t hesitate to keep emailing me, as we are a bit disorganized with so much information to restore.
Dear Valued Customer,
We have had an emergency issue with our server. The data center that we use, DataPipe, had some hard drive issues which they tried to rectify on the morning of January 19th. Unfortunately they unintentionally caused a permanent corruption of the hard drive. As a result we are moving our business away from them, for obvious reasons. They also corrupted the software system that SiteProPlus was operating under. Although we have backups of all your files, the old SiteProPlus software system is not worth salvaging, as it was dated and needed to be replaced. If you already have a custom site with us you are not affected, as you are already on a new system. If you are curious about the technical issues causing this, please refer to http://blog.siteproplus.com/?p=71
As a result our priority is:
- Get your email up and running.
- Rebuild your site from our backups on a new content management system.
New WordPress Content Management System (CMS)
As the old SiteProplus.com software system was built 5 years ago, we have decided to upgrade to new software, WordPress (http://en.wikipedia.org/wiki/Wordpress), that is superior in every way, particularly in terms of usability and search optimization. This platform is always updated by its developers, as it is one of the most popular software systems in the world for managing websites.
To transition the process, here is what we’ll do:
- 1. First issue is restoring your email:
- If you are using our email please send me a reply to Wilson@mindbay.com to let us know the following:
- Your email address aliases and passwords. If you don’t know your passwords that’s fine, we’ll set a default for you. For example, firstname.lastname@example.org, email@example.com
- Once we receive this we will reconfigure your email and send you instructions on how to access the webmail and set up your POP email account.
- 2. New Website rebuild
- We will be recreating your website from scratch based on our backups on WordPress.
- This will start immediately but will take some time as we work our way through the queue.
- We will build it up to the same or nearly the same as it was previously based on the files we have. This situation is similar to transcribing a written document into a digital document by hand.
- Unfortunately at this time we cannot process custom design requests or modifications due to the volume. We can get to this later.
- While your site is being built we will put together a landing page with your email and phone number on it, saying that the site is being rebuilt and will be brought back up.
- Our designer may contact you by email or phone to fill in the missing details on your website if any during the recreation process. Her email is firstname.lastname@example.org
- Your new website will be better and easier to manage than ever before. We will be training you on how to use this system, as it’s far easier to use.
- 3. How long will this take?
- Email – The first step is to repoint your domain name to our new servers. This will take at most 1 day as the changes will need to be propagated.
- Website - This will be queued up and we will get your site up as soon as possible. You will be compensated for the downtime.
- 4. Compensation
We sincerely apologize for this situation. We don’t want to lay blame here, but this is a situation that happens once in a blue moon and was totally unexpected. Due to the inconvenience we have caused, we will be doing one of the following:
- Adding an extension to your account of at least 1 month or refunding you at minimum 1 month’s fees.
- Providing you with a $200 credit on any of our internet marketing services.
You will see that the new system is much improved. We apologize that this had to happen, but this was simply beyond our control. We accept full responsibility and ask that you please be patient with us as we rebuild your website.
As you know we bought the business 3 months ago and have been trying to update the old software system. Thank you for your patience and understanding as we work through this.
- 5. Contact
As we have limited manpower please be patient as we can handle only a certain number of inquiries at a time. My interim contact is:
- Email: Wilson@mindbay.com
- Design: email@example.com
- Phone: 1-866-256-3096
- Blog: http://blog.siteproplus.com/?p=67 // please feel free to comment on this.
For those who are curious and are not knowledgeable about how the website/server/datacenter relationship works, here’s an overview.
- Servers (high end computers where your website is hosted) are hosted at datacenters where technical experts manage the hardware and maintain connectivity.
- Our work is domain name management, software upgrades, support, design and services.
- The data center we use is one of the biggest in the world, and they try to preemptively address failures such as hard drive failure. Because the server they used was dated, they attempted to update the hard drive, which caused a conflict with the hardware chassis. The hard drive couldn’t be recognized with the modern chassis. Although they tried to restore from their backups, the software system that SiteProPlus operated under was dated and they didn’t have new hardware that worked with the software.
- We had to make a professional decision whether or not to continue working with the current system which would cause continual downtime or to upgrade all at once. We chose the latter and the cost is extremely steep for us since we are absorbing the costs of the site rebuild which is very substantial.