Jun 28
Ben King | Networks, Systems
The scenario…
Dell 2800, 2x36Gig (RAID 1 – Systems Boot), 3x146Gig (RAID 5 – Data), primary file server, primary AD controller.
At 12PM yesterday Dell Server Manager began reporting a fault on disc 1 (36Gig). The first step in this scenario is always to reseat the disc to rule out any possible connection issues.
This indeed brought up both drives again, only to be followed 5 minutes later by a failure of both drives in the set.
Rebooting the server into the RAID BIOS revealed that the faults were on disk 0, not disk 1 (as reported by the Dell Server Manager).
Lesson 1 – never underestimate the power of the hot spare.
At this point the company was without file, print and primary AD.
We decided to remove disc 0 (which the BIOS said was faulty) and bring disc 1 (which the BIOS said had no faults) back online.
Unfortunately a reboot in this scenario forced a Windows chkdsk, which found faults on the drive. Although the system booted, AD did not come up and the system was effectively broken.
We had no spare chassis available to drop the data disks into, so we were in a fast reinstall situation.
The first step, however, was to seize the FSMO roles onto one of our backup AD controllers using NTDSUtil, and to clean up all references to the old primary AD.
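For anyone who has not done a role seizure before, here is a rough sketch of the sequence of inputs an ntdsutil session takes for this. It only builds and prints the text; it does not run ntdsutil. The server name `backup-dc01` is made up for illustration, and the role names are as I recall them from Windows Server 2003, so treat the exact wording as an assumption and check the help (`?`) at the fsmo maintenance prompt on your own version.

```python
# Sketch only: builds the inputs for an ntdsutil role-seizure session.
# Nothing here executes ntdsutil or touches Active Directory.

# Role names as used by ntdsutil on Windows Server 2003 (assumption:
# later versions may word the domain naming master command differently).
FSMO_ROLES = (
    "schema master",
    "domain naming master",
    "RID master",
    "PDC",
    "infrastructure master",
)


def seize_commands(surviving_dc, roles=FSMO_ROLES):
    """Return the ntdsutil inputs, in order, to seize roles onto surviving_dc."""
    # Enter the roles menu and bind to the controller that will take over.
    commands = ["roles", "connections", f"connect to server {surviving_dc}", "quit"]
    # One seize command per role.
    commands += [f"seize {role}" for role in roles]
    # Leave the roles menu, then leave ntdsutil.
    commands += ["quit", "quit"]
    return commands


if __name__ == "__main__":
    # "backup-dc01" is a hypothetical server name.
    for line in seize_commands("backup-dc01"):
        print(line)
```

Seizing is a last resort: the old role holder should never be brought back onto the network afterwards, which is why we cleaned up its references rather than trying to revive it.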
While these changes were propagating through AD we began the reinstall process. As we had no spare disks we were forced to reinstall to the one disk we thought was good.
All went well, and we were smart and didn’t repromote the server to AD yet.
Lesson 2 – only bring a server up as an AD controller when it's known to be good.
Unfortunately later that day the server failed again. Fortunately we were able to bring the disk online again without a problem.
The next day the first new disk arrived. We took the decision not to install it yet, as we felt mirroring might force the remaining dodgy disk to fail…
… which it dutifully did.
We then took the decision to install the spare, allowing the RAID BIOS to resync the drives without booting the OS. We felt that we could have done it on the fly with the OS running, but that would have given us a greater chance of the mirroring failing.
It took about 40 minutes for the mirror to rebuild.
We then removed the dodgy mirror, leaving us running on 1 disk until a further warranty replacement arrived.
Lesson 3: Always get disks of different ages, batches and brands if possible. Two disks from the same batch can fail at the same time.
Sigh…
Jun 28
Ben King | Business, bit10
Ever wondered what it looks like inside bit10…

Steve (one of our designers) knocked up this rather excellent isometric design of our office.
Jun 23
Ben King | Business
As part of bit10's evolution to the ‘next level’, we have been putting a lot of focus on the profitability of the business: where we make profit and what we make the most profit on (after all, we are not doing this entirely for love!).
After 9 years of business, bit10 prices are largely based on a combination of:
- Competitor analysis
- ‘Fag packet’ guesstimates of cost
- Gut feeling
Never have we done any detailed analysis of actual cost to the company.
We are making profit, so why is an understanding of cost so important?
At a macro level the answer to this is pretty obvious: if you have no understanding of where the costs are on a particular service or project, how can you know what the profit is?
However the benefits on a micro level are more subtle.
It is really a question of empowering our people to do their jobs better, with less interaction with the people above them (and the subsequent delays). Some examples:
1) Sales – I would regularly get calls/emails from sales with questions like ‘Can I discount this 10K project by 10% to get the sale?’, or even worse ‘How much can I discount this 10K project by to get the sale?’
2) Production – I would regularly get calls/emails from members of the production team with questions like ‘Can we buy software X to make project Y easier/faster/better?’, or ‘We need to do X days of work on project Y due to a problem, is that okay?’ Even worse than the latter question is hearing in hindsight that a project has overrun.
In all the cases above I would be in the position of using my gut instinct, based on years in the business, to make the decision on the question.
This is clearly no way to run a business, especially when in order to grow there is a need to empower your people and not unnecessarily burden yourself with questions you cannot actually answer.
Therefore it becomes obvious that having a method to accurately apply the real cost to projects is essential, and that the method must be accessible to and understandable by everyone involved, including sales and production.
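To make the sales example concrete, here is a minimal sketch of the sum a salesperson could run for themselves once the real cost is known. The cost figure (7,000) and the minimum acceptable margin (20%) are made-up numbers for illustration only, not bit10's actual figures, and the function names are my own.

```python
def margin_after_discount(price, cost, discount_pct):
    """Return (discounted price, profit, margin %) for a given discount."""
    discounted = price * (1 - discount_pct / 100)
    profit = discounted - cost
    # Margin is profit as a share of what the customer actually pays.
    margin_pct = profit / discounted * 100 if discounted else 0.0
    return discounted, profit, margin_pct


def max_discount_pct(price, cost, min_margin_pct):
    """Return the largest discount (%) that still keeps the minimum margin."""
    # Lowest selling price that still leaves min_margin_pct of it as profit.
    floor_price = cost / (1 - min_margin_pct / 100)
    return max(0.0, (1 - floor_price / price) * 100)


if __name__ == "__main__":
    # Hypothetical figures: a 10K project with an assumed real cost of 7K.
    price, cost = 10_000, 7_000
    discounted, profit, margin = margin_after_discount(price, cost, 10)
    print(f"10% off: sell at {discounted:.0f}, profit {profit:.0f}, margin {margin:.1f}%")
    print(f"Largest discount keeping a 20% margin: {max_discount_pct(price, cost, 20):.1f}%")
```

With something like this in hand, both ‘Can I discount by 10%?’ and ‘How much can I discount by?’ become questions sales can answer without a phone call, provided the cost input is trustworthy.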
What next?
The next step was about working out the real costs. I will be posting some followups to this blog with details of how we went about it.
Jun 14
Ben King | Life
Well, I am finally entering the brave world of blogging, and frankly it's with quite a feeling of concern and trepidation. At this stage I am not entirely sure what I am going to end up blogging, however I have some ideas…
I want to use it both to record the more exciting activities in my life and as a way to share what I learn through my work.
I work as a senior director at bit10 ltd. (www.bit10.net), where my role involves everything from day-to-day decisions about the company to being out on the road selling to systems administration.
So with that briefest of introductions… let's see what happens…