Topic: Recommended (hardware) configuration for a basic CCM™ installation


Posted by: eco (Global Moderator)

A CCM-Server™ will run on basically anything.
Most kinds of common servers used for mission-critical server hosting will work perfectly out of the box. That is the way we designed CCM™ to work, and it makes life easier for us, our customers and the users.

However, when we are asked "what do I need?", the question should often really be: "what knowledge do I need in order to make this work in the long run, and should I maybe consult an expert to get us up and running?".

No insult intended, but running servers and being a systems administrator IS a difficult, complex and knowledge-intensive business. Running data centers is not for everyone. If your client does not have the time, knowledge and experience needed to set up and manage a data center (server room or equivalent) and to keep it safe and up and running, it is probably better to let us set up the machines in one of our data centers and manage them for the client. It will definitely be easier and more cost-effective than trying and failing to run it themselves.

With that said here is what you need:

Hardware:
...depends on the planned load. More RAM is better. If you have to prioritize between memory and CPU, memory is probably the better choice in the long run, preferably as fast RAM as possible, since the CPU is "only" used when deploying a new host or changes.
For a single system that is going to serve one or more websites with little to almost no traffic at all we would recommend no less than 16 GB of DDR3 RAM. 1600 MHz will work fine if you can choose between different speeds.

If you have a system that can do CPU throttling, i.e. changing the CPU speed in real time to meet demand, you will lower your power consumption a lot. Normally a system running CCM™ does not need eight cores at lots of GHz, and if your system can scale down to, let's say, 4x1.6 GHz, this will work fine for the 12 to 23 hours a day when you don't use all your CPU power. Also, always leave at least 30% spare capacity. Never use all the "juice" in your systems all the time. You need some margin for heavy load.

For a shared server (serving several different customers, websites, domains etc.) you will need no less than 30% spare capacity. What would happen if two of your clients get very high peaks simultaneously? Maybe they both have marketing departments that push hard for a seasonal campaign or similar. You need to have extra power available instantly, at any time.
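
To make the 30% rule concrete, here is a minimal Python sketch. The function names and the load figures are invented for this example, and it assumes the rule means "capacity should be at least the peak load plus 30%"; only the 30% figure comes from the text above.

```python
def required_capacity(peak_load: float, margin: float = 0.30) -> float:
    """Capacity needed to carry `peak_load` with `margin` on top (assumed reading of the 30% rule)."""
    if peak_load < 0:
        raise ValueError("peak_load must not be negative")
    return peak_load * (1.0 + margin)


def has_headroom(peak_loads: list[float], capacity: float, margin: float = 0.30) -> bool:
    """True if the server can carry all customers peaking at once and still keep the margin."""
    return required_capacity(sum(peak_loads), margin) <= capacity


if __name__ == "__main__":
    # Hypothetical numbers: two customers peaking at the same time on a
    # server that can handle 1000 requests per second.
    print(has_headroom([300.0, 400.0], capacity=1000.0))
    print(has_headroom([400.0, 400.0], capacity=1000.0))
```

The point of summing the peaks is the scenario described above: size for the case where several customers peak together, not for the average day.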

Therefore our recommendation is to always run CCM™ in "pools" where two or more servers do exactly the same job. With this configuration you can use a load balancer such as the Nidavellir Firewall™ to let the systems share the load under normal conditions, but you can also take down one system for service without the end users noticing anything at all. A failure, such as a hardware-related problem, would also cause the affected member of the pool to be kicked out of the pool immediately and automatically. This will greatly decrease the workload for your system administrators and NOC/NCC.
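
The pool idea can be illustrated with a small toy model in Python. This is not how the Nidavellir Firewall™ is implemented or configured; the class, its method names and the host names are all made up, and it only shows the principle of round-robin sharing plus kicking out a failed member.

```python
class Pool:
    """Toy model of a load-balancing pool (hypothetical, for illustration only)."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)
        self._next = 0  # position of the next member to try

    def mark_down(self, member):
        """Kick a member out of rotation, e.g. after a failed health check."""
        self.healthy.discard(member)

    def mark_up(self, member):
        """Put a known member back into rotation after service."""
        if member in self.members:
            self.healthy.add(member)

    def pick(self):
        """Return the next healthy member in round-robin order."""
        for _ in range(len(self.members)):
            member = self.members[self._next % len(self.members)]
            self._next += 1
            if member in self.healthy:
                return member
        raise RuntimeError("no healthy members left in the pool")


if __name__ == "__main__":
    pool = Pool(["frontend1", "frontend2", "frontend3"])
    pool.mark_down("frontend2")  # taken down for service
    print([pool.pick() for _ in range(4)])
```

With three members and one marked down, requests keep flowing to the other two, which is why end users do not notice the maintenance.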

A typical load balancing pool for CCM™ could look like this:

-> HA Nidavellir Firewall cluster (two or more firewalls)
 -> Frontend 1
 -> Frontend 2
 -> Frontend 3
  -> Backend

 -> Logserver
 -> Backup-server
 -> Storage-server

The log server, backup server and storage server could each be a single machine or a load-balanced pool of two or more systems, or you could have one server running both the storage and the backups and a separate log server.

The backups need to be handled in a secure way, and you should make separate backups of the backup server whichever solution you choose. RAID 5 with at least 4 drives is recommended for the log server, backup server and storage server. If you only use RAID 1 on the backend you could still make very good use of RAID 5 in one or all of the frontend servers.

You might want to add a separate database cluster (recommended) as well, unless you want to run the databases on the backends and/or frontends (less secure).

The frontends could run just one (software) server or could do more work for you. For example, running two or more functions on all the frontends would not decrease performance noticeably but would make administration much easier for your certified CCM™ developers.
However, many technicians prefer keeping all the data on the backend(s).

RAID 5 will also greatly increase speed and data security for the other servers in the load balancer pool(s).
Of course, using neither RAID 1 nor RAID 5 means that the data on a drive is lost when that drive fails. And it will fail. Therefore at least RAID 1 should be used on all servers. Even the Nidavellir Firewalls could benefit from this.
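
When budgeting drives it helps to know how much space is left after redundancy. The following Python sketch uses the standard textbook figures (a RAID 1 mirror gives the capacity of one drive, RAID 5 gives the capacity of all drives minus one); the function name and the drive sizes are only examples.

```python
def usable_capacity_tb(level: int, drives: int, drive_size_tb: float) -> float:
    """Usable capacity of an array of identical drives, for RAID 1 and RAID 5 only."""
    if level == 1:
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_size_tb  # every drive holds the same copy
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_size_tb  # one drive's worth of space goes to parity
    raise ValueError("only RAID 1 and RAID 5 are covered by this sketch")


if __name__ == "__main__":
    # Example sizes only: the recommended 4-drive RAID 5 versus a 2-drive mirror.
    print(usable_capacity_tb(5, 4, 2.0))  # 6.0
    print(usable_capacity_tb(1, 2, 2.0))  # 2.0
```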

In total you would need at least:

2 or more Nidavellir Firewalls (loadbalancers)
2 or more Frontend servers
1 or more Backend servers.

Either:
1 Log server
1 Backup-server
1 Storage server

OR:
1 Combined Log-, Backup- and Storage-server.

Also you probably would want to have:
1 or more database servers (minimum 2 for a cluster)


Optional:
1 or more CNS Resolver servers (this could also be a service on your firewalls).
1 or more bastion servers used for monitoring the systems and making administration easier.
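
As a quick sanity check of the shopping list above, the minimum machine count can be added up like this. It is a small sketch; the dictionary and function names are invented, and the optional resolver and bastion servers are left out.

```python
# Minimums taken from the list above.
CORE_MINIMUM = {"firewall": 2, "frontend": 2, "backend": 1}


def minimum_machines(combined_support_server: bool, database_cluster: bool) -> int:
    """Smallest number of machines for the layout described in the post."""
    total = sum(CORE_MINIMUM.values())
    # Either one combined log/backup/storage server or three separate ones.
    total += 1 if combined_support_server else 3
    if database_cluster:
        total += 2  # a cluster needs at least two database servers
    return total


if __name__ == "__main__":
    print(minimum_machines(combined_support_server=True, database_cluster=False))  # 6
    print(minimum_machines(combined_support_server=False, database_cluster=True))  # 10
```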


It is always difficult to recommend a specific type of hardware since the system resources needed depend on what you are going to use your servers for, how many users you are going to have using CCM™ etc. Maybe you need to cluster other services as well etc.
If you are not buying the hardware directly from us it is always a good idea to use only quality brands and to get more powerful systems than you need today since upgrading will take time in the future and might cause downtime.

Xeon processors are always a safe choice. Power hungry: yes; expensive: yes. But they will perform well in all kinds of situations.

One or several CPUs? This depends on a few things. For example, you might want to use only one processor per server to keep costs low while distributing the load over several (load-balanced) servers in a cluster. More than one CPU in a single-unit (1U) server might make your system more likely to overheat. Usually one CPU is good enough for a normal production environment, and if you need more, a maximum of two CPUs in a 2U, 3U, 4U or 5U server is better (because of the airflow). Some systems with 4 or 8 CPUs demand special cooling.

If you are going to put your server in a basic rack with passive cooling (no cooling in the rack or cool air from the floor), or use a blade server, you might have to use fewer systems or increase the cooling capacity. At normal room temperature (12-27 degrees Celsius) you should be able to stack 1U systems with one CPU each in a 42-unit rack and still not trigger any alarms in the servers. If you do trigger alarms, you have bad cooling that needs to be fixed, or you need to move some of the systems to a data center compartment. If you have 2 or more CPUs per server you might not be able to fill a whole rack with your servers; you may need to add some tower servers or just leave some space in the rack.

Using only or mostly tower servers is, heat-wise, not as big a problem, but of course it is a huge problem when it comes to wasting space in your server rooms. Tower servers (all kinds of servers that are not 19", i.e. mini tower, midi tower etc.) are generally easier to keep cool, but considering that you can use the same space for at least 4 rack servers, they are not cost-effective in any way.


You want your hardware to be as fast as possible; any bottleneck will cause problems that might stack up and cause more and more problems. Lots of fast RAM is good. Maybe you only need 16 GB right now, but make sure your system has free memory slots so you can add more memory in the future. 4x4 GB will be cheaper than 1x16 GB, but the single module leaves more slots free, so it might be worth the extra investment to be able to perform faster upgrades in the future. RAM is almost always the single cheapest upgrade for a system, so the general recommendation would rather be to max out your system RAM right away.
ECC or non-ECC is up to you. Sometimes ECC will be a bit slower. Most modern servers can handle a bit of "bad" memory here and there every now and then, so it might not be a problem for you to use standard value RAM. Kingston is a good brand, but most brands are good. Typically this will depend on what hardware you are using. Some servers ONLY work with obscure, uncommon and expensive RAM that might go out of stock in the future. Check your documentation and, if in doubt, max out your RAM right away so you don't have to worry about not being able to order memory upgrades in the future.

Keeping spare parts in stock? This is up to you. Some components are cheap to keep in stock whereas some components might be really expensive and just be unnecessary costs if those spare parts are never used. On the other hand some parts will be really hard to even be able to order after a couple of years. Therefore using standard components and stocking up on those things that are most likely to break (hard drives, CPUs, RAM, power supplies) after a couple of years is a good idea. Avoid using cheap components (this includes equipment such as switches) - it will only cause you problems, problems that might cause even higher costs than getting proper stuff in the first place.

A switch that costs five to ten times more than another switch will almost always be worth ten to twenty times more in real life. You get no weird "breakdowns" that need days of troubleshooting before you can isolate the origin of the problem, and a long lifetime gives you both a better return on investment AND a higher MTBF (mean time between failures). It also means less need for technicians to interact with services and systems, and such interaction could very well mean more downtime for end users. Buying cheap will cost you.

With that said, we would like to recommend a cheap (or rather cost-effective) solution for saving money and decreasing the risk of problems occurring. The really expensive problems are often related to data. Data security is one of the most important things to get right when we are talking about hardware, but the software and the routines handled by humans are at least as important. With poor know-how about, for example, how to handle the backups, it does not matter whether your backup server is using expensive RAID 5 controller cards or not. Without knowledge about how to take backups and how to restore from a backup copy there is no use spending time and money on managing backups.

This is what you will need (hardware):

A real RAID controller (note that most low-budget controllers are actually not real RAID disk controllers; they are so-called "fake RAID" controllers. More info is available on serverbutiken.se).

Why? A good RAID controller will save you money. A cheap RAID controller will cause you more problems than having no RAID at all. We know this. We have seen this many times. RAID controllers integrated on a motherboard are usable in at best one case in a hundred. Some of our servers come with (really expensive) motherboards that actually have good (real) disk controllers.
The best thing is almost always to use a good separate disk controller. You want it to be able to run RAID 5 with at least 4 disks, and you want it to have its own battery backup (small lithium batteries that need to be checked and changed every now and then).

To the RAID controller you of course connect your (at least 4) disks. These could be cheap, if you want to change them more often and possibly save a few bucks, or you could get quality disks that are faster, have a longer life expectancy and typically will cause you fewer problems.

Modern SATA2 drives are good. Fibre Channel, SAS or SSD drives could give you better performance. On some systems you need insane speed and must be able to read and write lots of data really, really fast at the same time. If your systems have A LOT of traffic or if you have lots of database queries this might be the case for you. Normally a backup server does not need really fast write speed, since the backup can be performed at times when the load on your systems is lower and thus causes very little or no disturbance at all.

To increase speed AND security you could use RAID 1 and RAID 5 on top of each other, or two RAID 5 sets striped with RAID 0, etc. There are many tricks that can be used to tweak things here, and the best thing is always to first look at your production situation: what needs to be "fixed"? What demands do you have for the future? How do you work today? How do you want to work in the future?
Mirrored RAID 5 will be more expensive and needs really expensive drives to give your systems the extra performance you need.
Maybe you could consider using load balancing instead of, or in combination with, other measures to increase both security and performance?

Hot swap or not?
Hot swap means that you can change a drive without having to shut down the server, and a hot spare is a standby drive that the array can start using on its own. This will greatly improve your uptime, since your server can continue to work as usual and you will switch the drive in seconds. Data will "replicate" (rebuild) automatically. This is true if you have a hot-spare RAID 5 system with 4 drives or more. If you have RAID 1 or a RAID 5 system with only 3 drives it is likely you will have to take the system down. This is why you want to use SirV™ to monitor your system and e-mail you immediately when signs of wear start showing on a disk. Changing a drive before it fails is good practice in a RAID 1 system. On a RAID 5 system, if only one drive fails and you have at least 3 working drives, it is not the end of the world if one drive actually breaks down completely. However, you don't want to run a system on only three drives for too long: there is always a risk that more drives fail, and then you risk losing data.
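
The reasoning about failed drives can be written down as a tiny state check. This Python sketch is independent of SirV™ (whose interface is not shown here) and the function name is made up; it relies only on the general facts that RAID 5 survives exactly one failed drive and that a RAID 1 mirror survives as long as one copy remains.

```python
def array_state(level: int, drives: int, failed: int) -> str:
    """Classify an array as 'healthy', 'degraded' or 'failed' (RAID 1 and RAID 5 only)."""
    if level == 1 and drives >= 2:
        tolerated = drives - 1  # a mirror survives while one copy is left
    elif level == 5 and drives >= 3:
        tolerated = 1  # single parity: exactly one drive may fail
    else:
        raise ValueError("unsupported RAID level or too few drives")
    if not 0 <= failed <= drives:
        raise ValueError("failed must be between 0 and the number of drives")
    if failed == 0:
        return "healthy"
    if failed <= tolerated:
        return "degraded"  # still running, but replace the drive soon
    return "failed"  # data is lost, restore from backup


if __name__ == "__main__":
    print(array_state(5, 4, 1))  # degraded
    print(array_state(5, 4, 2))  # failed
```

A "degraded" result is the situation described above: the system keeps running on three drives, but a second failure before the rebuild finishes means data loss.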

As far as filesystems go, most work great with CCM™. ZFS might cause some problems under certain circumstances, so you might want to avoid it unless you really know what you are doing and why you need ZFS. Ext4 is a good choice. It does journaling (like ReiserFS et al., but most users agree that Ext4 is faster). Ext3 and Ext2 of course work as well, but Ext2 is a less good choice because it has no journal, so your data is not as safe if a problem occurs in your system. Ext2 should never be used without working batteries in the RAID controllers and a UPS configured to take the server down in a controlled shutdown before power fails.
If you are not using Linux, all the standard Unix filesystems will of course work great for your UNIX server, UFS for example. Even NTFS (only recommended on Windows boxes though, and never on Linux!) does work if you have to run an OS from MS for some reason. That is what is so great about CCM™: you can run it on basically any platform and it will still perform really well.

Remember that certain types of OS demand more RAM. If you run an MS or Apple operating system you should add much more RAM than you would on a Unix or Linux box. Also, on a Windows box it might be a good idea to use a separate disk for the swap space (sometimes referred to as the swap file, or simply just "swap").

Using swap a lot will wear out your drives faster. If you have a system with LVM partitions spread across drives, it is good practice never to use the same drives for data and swap. Keep swap to itself, or disable it if you think you will never run out of RAM completely. A vanilla Linux system uses otherwise unused memory as a file cache and gives it back when another process needs to allocate memory. This means it might look as if the used memory is growing and growing and growing. This is not a bad thing; this is Linux's way of being prepared for super fast access to a previously used file. The file can reside in RAM as long as nothing else needs that piece of RAM more (if so, the old data will be thrown out of RAM). It doesn't slow down your system, rather the contrary. Very often you need to access the same data, and for web servers almost all data is requested several times, so keeping all your web data in RAM is the best thing your CCM server™ can do.
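
If you want to see the caching effect on a Linux box, compare the MemFree and MemAvailable fields in /proc/meminfo. The Python sketch below parses text in that format; the sample values are invented, and the helper names are just for this example.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(info: dict) -> dict:
    """Percentage of RAM that is truly unused versus available to new processes."""
    total = info["MemTotal"]
    return {
        "free_pct": 100.0 * info["MemFree"] / total,
        "available_pct": 100.0 * info["MemAvailable"] / total,
    }


# Invented sample: little memory is "free", but most of it is only cache.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:         1000000 kB
MemAvailable:   12000000 kB
Cached:         10000000 kB
"""

if __name__ == "__main__":
    print(memory_summary(parse_meminfo(SAMPLE)))
```

On a real server you would pass in the contents of /proc/meminfo instead of SAMPLE. A low free percentage together with a high available percentage is the healthy "memory keeps growing" picture described above.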

How much RAM?
As stated above: a lot, as much as you can fit in there and afford. But at least twice the amount used when everything is loaded and in use (i.e. after you have surfed through every web page at least once), plus at least 30%.

RAM Summary: If you booted up your system, loaded every page once and your system uses 8GB you need 16GB + 30%. Round that up and put in at least 32GB (4x8GB to leave some memory slots free for upgrades in the future!) and you should be safe.
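
The rule of thumb in the summary is easy to turn into a small calculation. In this Python sketch the function name is made up, and rounding up to the next power of two is an assumption about what "round that up" means; it does reproduce the 32 GB from the example.

```python
def recommended_ram_gb(measured_gb: float, margin: float = 0.30) -> tuple:
    """Return (calculated need, rounded-up size) in GB for a measured working set."""
    if measured_gb <= 0:
        raise ValueError("measured_gb must be positive")
    needed = measured_gb * 2 * (1.0 + margin)  # twice the measured use, plus 30%
    size = 1
    while size < needed:  # assumption: round up to the next power of two
        size *= 2
    return needed, size


if __name__ == "__main__":
    needed, size = recommended_ram_gb(8)
    print(f"need about {needed:.1f} GB, install {size} GB")
```

For the 8 GB example this gives a calculated need of about 20.8 GB and an installed size of 32 GB.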

If you are running a diskless system or not using swap, everything needs to fit into RAM. Even if you are not planning on using more than, let's say, 32 GB, but your system supports up to 96 GB, why not put 96 GB in there right away? You don't want to run out of memory, and many processes in a modern server can actually behave rather badly if there is absolutely no memory left.

Logging to a separate log server is a really good idea from a security point of view, and diskless systems or servers with no swap can also benefit from using a separate log server. If something goes completely wrong and your system hangs or crashes (worst-case scenario), you can at least still access your logs on the separate log server and, even better, no logs will be lost if you reset or reboot the server.
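
For your own scripts and services, sending log messages to a separate log server can be as simple as the following Python sketch. It assumes the log server accepts standard syslog messages over UDP; the function name, logger name, address and port are placeholders, and this says nothing about how CCM™ itself logs.

```python
import logging
from logging.handlers import SysLogHandler


def build_remote_logger(host: str, port: int = 514, name: str = "ccm-frontend") -> logging.Logger:
    """Return a logger that forwards messages to a syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = SysLogHandler(address=(host, port))  # UDP is the default transport
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Typical use would be something like `log = build_remote_logger("192.0.2.10")` followed by `log.warning("disk sda shows signs of wear")`, where 192.0.2.10 stands in for the address of your own log server. Because the message leaves the machine right away, it survives a later hang or reset of the sending server.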

Nidavellir Mk II owners might have heard of the advanced remote management functions that are included and accessible in almost all Nidavellir Mk II firewalls and load balancers. The same kind of features can be added to your dedicated CCM™ servers and allow you to manage the server from a remote location. Take, for example, the unlikely but very irritating scenario where your server just "hangs" (i.e. a crash or a kernel panic where the server neither shut down nor stopped, but you cannot log in and control it). These states can be caused by worn-out hardware and are not as uncommon as people think; usually people simply didn't realize that the server had crashed and just reset or rebooted it without investigating enough. With remote management you can still reboot the server, and even some BIOS functions might be accessible.

A crash should always be investigated. You have to find the cause. It could be as simple as a broken memory DIMM that has to be replaced. If you do not do your job as a system administrator well enough to find the cause of a crash, it will happen again and you will only a) give yourself more work and, most importantly, b) cause a lot of problems for the end users. And for no reason at all, actually. If the problem is resolved before the machine is put back in production you can make sure that the risk of trouble is kept to a minimum.


Summary
As described above, you install the CCM servers, starting with the latest version, and then go back and check for errors. Fix one error at a time and always try to simulate problems, for example by rebooting servers, to make sure that everything starts up as it should.

The network itself (i.e. switches, firewalls etc.) is nothing we are going to cover in depth here, but basically it is a good idea to set up everything according to the netplan first and then do a test install on one of the backends and a few frontends. Make sure you can send traffic in and out through the firewalls and keep unwanted traffic out before you start all the installations.
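
A quick way to verify that traffic gets through before you start installing is to test the TCP ports you expect to be open (and the ones you expect to be closed). The Python sketch below does only that; the function names are made up, the hosts and ports you pass in are your own, and a successful connection says nothing about whether the service behind the port works correctly.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port can be established within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_targets(targets) -> dict:
    """Check a list of (host, port) pairs and return {(host, port): reachable}."""
    return {(host, port): port_open(host, port) for host, port in targets}
```

Run it once from outside the firewalls and once from inside: ports that should be reachable must report True, and everything you intend to block should report False.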

...And of course, as always: contact our support if you run into any problems!