By now you have probably heard about the Meltdown and Spectre family of
vulnerabilities, and you are probably quite busy dealing with the aftermath
(evaluating patches, applying patches, reversing patches). You are also
probably wondering how on earth you will get back the performance that your
datacenters lost to the patches. Here we will explore a few ways to squeeze
more performance out of your existing infrastructure to regain the performance
you lost to Meltdown/Spectre, and even gain a little extra to reduce the need
to buy more servers. This advice applies to datacenters big or small. And by
the way, many of these tips will make you look great in front of your CEO, CIO,
and CFO!
Brief recap of Meltdown and Spectre.
In order to understand why Meltdown and Spectre are important, why the patches
reduce performance, and how to regain some of that performance, it is important
to do a brief, focused recap.
Meltdown (CVE-2017-5754) and Spectre (CVE-2017-5715, CVE-2017-5753) are an
industry-wide “family” of vulnerabilities that affects, to varying degrees, many
processor architectures from many manufacturers. As far back as 1995 the
possibility of exploits like these (on x86-32/64) was pointed out [1], but it
was not until 2016 that a practical way to carry them out began to emerge [2].
Meltdown affects all Intel processors with out-of-order execution (OOE) and,
more importantly, speculative execution, perhaps going back to the original
Pentium Pro, as well as all Atom processors made after 2013 (the original Atoms
were in-order). AMD processors are immune [3], and Via (remember Via?) has
remained silent. Meltdown also affects other microarchitectures, like several
ARM processors, including the up-and-coming Cortex-A75 (intended for datacenter
use), as well as many others used in cellphones and appliances [5]; IBM’s
POWER7+, POWER8, and POWER9 are affected too [4]. But this paper is not
concerned with other architectures.
Spectre is an industry-wide “family” of vulnerabilities with two variants (so
far) that affects, in one or both forms, pretty much every microprocessor with
OOE: x86-32/64 processors from Intel, AMD [3], and perhaps Via. Other
architectures are also affected, like ARM [5], IBM POWER7+, 8, and 9 [4], and
SPARCv9 [6]. But, again, this paper is not concerned with other architectures.
You need to patch. No ifs (the guys telling you to use firewalls and evaluate any loss of performance vis-à-vis security requirements are lawyers, not engineers).
Do not pay attention to people saying to evaluate risk versus performance in
order to decide whether to apply the patches, nor to appliance sellers saying
that no patches are coming because “we only run our own code”. Meltdown and
Spectre can be combined with other vulnerabilities to inject code, worm-like [7][8].
Also, there are known cases of tampering with the build process or update
channels of software makers to inject contaminated (and signed) code [9][10],
so even if you only use software provided by trusted entities, or made by the
maker of the appliance, you are still at risk. Patch, and pressure your
suppliers to provide patches! You do not want to become collateral damage in a
war between two nation-states just because you happen to use a specific
appliance. Let Stuxnet be a warning.
Of course, right now patches, especially microcode patches, seem to be unstable.
It is fair to evaluate the patches, or give them a little time to settle and
mature. But eventually you need to patch. Do not, under any circumstances,
declare that a machine which has patches available will not be patched.
All of these vulnerabilities exploit OOE, using varying mechanisms, to read or
guess kernel data that should not be readable. This data may include sensitive
passwords and cryptographic keys. Proof-of-concept code does exist, but it has
not been weaponized as of this writing. The mitigations involve measures that
significantly reduce performance: the more your workload calls the kernel, the
more your performance is affected. There are, however, some modern CPU features
and instructions that can be used to reduce this performance hit, so if your
processor is older than, say, Haswell, you lose even more performance [11].
In order to fix Meltdown, one has to resort to a technique called Kernel Page
Table Isolation (KPTI). The problem is that once you separate the page tables of
the kernel and the applications, every time you switch from one mode to the
other you flush a small (but critical) cache called the Translation Lookaside
Buffer (TLB). Intel CPUs have a feature and an instruction that reduce the need
to flush the whole TLB every time one switches from user mode to kernel mode
(and vice versa): the feature is called PCID (Process Context ID) and the
instruction is called INVPCID (invalidate PCID). The first processor generation
to have both was Haswell, and both are needed for the lower-overhead patches to
work; otherwise, you only get the patches that reduce your performance a lot.
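If you want to check whether a given Linux host exposes those two capabilities, they show up as flags in /proc/cpuinfo. Here is a minimal sketch in Python, purely illustrative and assuming a Linux host:

    #!/usr/bin/env python3
    """Rough check for the CPU features that make KPTI cheaper (PCID/INVPCID)."""

    def cpu_flags(path="/proc/cpuinfo"):
        # Grab the 'flags' line of the first processor entry.
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
        return set()

    flags = cpu_flags()
    for feature in ("pcid", "invpcid"):
        print(f"{feature:8s}: {'present' if feature in flags else 'MISSING'}")

    if {"pcid", "invpcid"} <= flags:
        print("KPTI can use PCID on this host, so the Meltdown patch overhead is smaller.")
    else:
        print("Expect a larger performance hit from KPTI on this host.")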
In order to fix Spectre variant two, one has to fiddle with branch prediction
and another small (but critical) cache called the Branch Target Buffer
(BTB) [11]. In order to minimize the impact on the BTB, one needs certain
capabilities called IBRS ("indirect branch restricted speculation"),
STIBP ("single thread indirect branch predictors") and IBPB
("indirect branch prediction barrier"), which are enabled by microcode. The
older the server/processor combo in question is, the less likely it is to
receive a microcode update via firmware or OS. On machines that do not get a
microcode update, other techniques, like Google’s “retpoline” [12], may be used.
NOTE: Spectre variant one is mitigated by reducing the resolution of certain OS
and application timers, and this is done on an app-by-app basis.
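Patched Linux kernels (4.15 and later, plus vendor backports) report which of these mitigations are actually active through sysfs. A quick way to see where a host stands is something like this sketch (it assumes a kernel new enough to expose that directory):

    #!/usr/bin/env python3
    """Print the Meltdown/Spectre mitigation status reported by a patched Linux kernel."""
    import glob
    import os

    SYSFS_DIR = "/sys/devices/system/cpu/vulnerabilities"

    if not os.path.isdir(SYSFS_DIR):
        print("This kernel does not report mitigation status; assume it is unpatched.")
    else:
        for entry in sorted(glob.glob(os.path.join(SYSFS_DIR, "*"))):
            with open(entry) as f:
                print(f"{os.path.basename(entry):12s}: {f.read().strip()}")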
Now you know roughly what Meltdown and Spectre are, why it is imperative to patch
EVERY SINGLE AFFECTED INSTANCE, and why you get slowdowns.
But you need to regain your lost performance now, and you need to
regain even more performance to minimize (but not eliminate) the purchase of
new servers until 2022. So, we move on to the many ways to squeeze more
performance out of your existing server fleet.
Beware. Hardware with these bugs really fixed will NOT arrive until 2022 (at least).
Granted, you may think you can get out of this by buying new hardware and
redeploying your VMs with less oversubscription. But the problem with that
approach is that you will end up with a lot of new hardware “patched at
the factory” instead of truly fixed. And you will have to live with that
hardware for five (or more) years.
Why do I say hardware with the bugs really fixed will not arrive until 2022?
Quite simply, designing a new microprocessor generation takes about four years,
plus a little more time to manufacture those processors in volume and put them
inside new server generations. All the processor makers were notified of
Meltdown and Spectre in mid-2017. So, one has to wait until about 2022 (being
optimistic) for any hope of buying a server with a processor designed from the
ground up without these vulnerabilities, as opposed to one “patched at the
factory”. And pretty much every processor inside every server you buy from now
until sometime in 2022 will be “patched at the factory”. And you, dear reader,
will have to live with those servers for five years (or more). Do not take my
word for it: Linus Torvalds (creator and benevolent dictator of Linux) has this
to say about the next crop of “patched at the factory” processors: «As it is,
the patches are COMPLETE AND UTTER GARBAGE.» [13]
Why is this important? Well, as we saw previously, Meltdown and Spectre
are just the tip of the iceberg of a class of vulnerabilities related to
“Speculative Execution”. As we speak, researchers (good, bad, white hats, black
hats, friendly nation-states and unfriendly nation-states [depending on your
perspective]) are exploring that rabbit hole for even more security
implications. It is much, much better to get a server with a processor
redesigned to not have the flaw in the first place, than to have one
that has the flaws patched at the factory, but with the lurking threat of
discovering yet another flaw with yet another round of patches.
Of course, it is a fiction to believe that one can completely eliminate all
server purchases from here until 2022. But servers you buy, say, two years from
now will not only be patched at the factory; they will also be benchmarked with
those patches already installed, making planning easier, and be priced according
to those benchmarks.
Next are a few tips to squeeze more performance out of your current
infrastructure, minimizing (but not eliminating) cash outlays and server
purchases as much as possible. Oh, and as we said before, these tips will make
you look great in front of your CEO, CIO, and CFO!
Recommendation #1: Move workloads from physical machines to virtual ones
I know what you are thinking. This first piece of advice seems completely
counterintuitive! Doesn’t putting a workload on a VM incur a performance
penalty? And won’t that performance penalty come in addition to the performance
penalty incurred from patching Meltdown and Spectre? The answer to both
questions is yes. And still, I stand by this advice.
Yes, there is a performance penalty for going from physical to virtual servers.
In the old days (the mid-2000s), that penalty was between 5 and 30% depending on
the workload [14], but with more than ten years of optimizations, hypervisors
have improved their performance a lot. Nowadays, the performance penalty is much
lower [15][16].
Now, there are many system administration advantages in moving a workload from
physical servers to virtual ones, and those have to be considered as a bonus.
But in our case, we are interested in two effects:
a.) Unlocking the idle capacity inside those servers. If a server is using only
50% of its CPU capacity and we virtualize it, then even after setting aside a
worst-case 30% of the host for virtualization overhead, we still have about 20%
of the CPU capacity available for other uses (see the sketch after this list).
b.) Increasing the size of the pool of machines available for virtualization.
This makes it more likely that we will find a machine tailored to a given
workload, say, a Haswell or newer machine for a kernel-intensive workload, and
a pre-Haswell machine for a workload that does not call the kernel much
(remember, the more your application calls the kernel, the bigger the
performance hit it receives).
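As a back-of-the-envelope illustration of point a.), the headroom freed by consolidating a physical server can be estimated as follows (the utilization and overhead figures are assumptions, not measurements):

    #!/usr/bin/env python3
    """Back-of-the-envelope headroom estimate when virtualizing a physical server."""

    def headroom_after_p2v(native_utilization: float, virt_overhead: float) -> float:
        """Fraction of the host CPU still free after virtualizing the workload.

        native_utilization: CPU fraction the workload uses on bare metal (0..1).
        virt_overhead: worst-case extra CPU the hypervisor adds, as a fraction of the host (0..1).
        """
        return max(0.0, 1.0 - native_utilization - virt_overhead)

    # Assumed example: a server at 50% CPU with a pessimistic 30% virtualization overhead.
    print(f"Free capacity: {headroom_after_p2v(0.50, 0.30):.0%}")  # -> 20%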
At this point, most likely, your organization is already using virtualization,
hopefully bare-metal virtualization, and if not, I urge you to start using it.
But I am quite certain that there are still some workloads on physical servers.
There are valid technical reasons to keep workloads on physical servers; in the
past there were many, but nowadays there are very few. The reason many workloads
that could be virtualized are still on physical servers is corporate inertia
(you will see this phrase a lot from here on), plain and simple. We all know
that stubborn sysadmin, or the person who does not keep current and is still
thinking about that 2007 paper [14], or the manager who is too conservative for
his (and the organization’s) own good. Well, now you have the impetus for
change.
Recommendation #2: Move some of your workloads from other Hypervisors to KVM (Kernel-based Virtual Machine)
There are differences in performance between hypervisors. And while it is true
that over the years those differences have shrunk as hypervisor code has
matured, and every hypervisor has adopted and adapted the best techniques of the
others, it is still true that KVM, due to the way it is implemented, has an edge
in terms of performance for most workloads.
So, move as many of your VMs and hosts as possible from VMware, Xen, and
Hyper-V to KVM. Do not get me wrong, all four of those hypervisors are great,
with many strong points, and VMware is the “Gold Standard”, but if your
main concern is squeezing as much performance as possible out of your current
infrastructure, you need to go to KVM, warts and all.
Chances are your organization already has KVM. For instance, most
implementations of OpenStack use KVM as their hypervisor, and Red Hat and many
other Linux distros have it integrated. And if you are not using KVM, I urge
you to integrate it into your environment (perhaps at the expense of some other,
more expensive hypervisor). It is not that expensive if you use Ubuntu, CentOS,
or Rogue Wave (for example).
If you have been a good sysadmin, you created all your VMs using OVF 2.0 (and
if not, you had better have a very good technical reason); therefore,
moving them to another hypervisor is relatively simple (unless you have been
using some proprietary management and instrumentation functions and APIs of
your hypervisor). If you have not been using OVF 2.0, you should start in
earnest. And if you have been using proprietary APIs and management functions,
you should really look at cross-platform solutions.
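Tools like virt-v2v can automate much of a VMware-to-KVM migration; at its simplest, the disk-image step boils down to a format conversion with qemu-img. The sketch below shows that step in Python (the file names are placeholders, and it assumes the OVF package ships a VMDK disk and that qemu-img is installed):

    #!/usr/bin/env python3
    """Convert an exported VMDK disk to qcow2 for use with KVM (illustrative sketch)."""
    import subprocess

    SRC = "exported-vm-disk1.vmdk"   # placeholder: disk pulled from the OVF package
    DST = "exported-vm-disk1.qcow2"  # placeholder: image to attach to the new KVM guest

    # qemu-img ships with QEMU/KVM; -f is the source format, -O the output format.
    subprocess.run(
        ["qemu-img", "convert", "-f", "vmdk", "-O", "qcow2", SRC, DST],
        check=True,
    )
    print(f"Converted {SRC} -> {DST}; now define the guest with virt-install or virsh.")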
Of course, do this ONLY if it makes financial sense. If you are a huge
Microsoft shop and you are getting Hyper-V for free due to your licensing
terms, it makes no sense to bring in a paid hypervisor, no matter how much
performance you regain. Also, if you have many Oracle databases, the money
Oracle charges for running their database on other hypervisors (as opposed to
their own Xen derivative) makes it untenable to move those workloads to KVM.
But, on the other hand, if you are paying high licensing costs for some other
hypervisor due to corporate inertia, perhaps this advice will not only get you
more performance, but also reduce your licensing costs to boot!
So, if you can, expand your KVM pool of VMs and hosts, to recoup as much
performance as you can.
Recommendation #3: Partition your server pools wisely
Most hypervisors allow you to move VMs from a physical host of one processor
generation to a host of a different processor generation as needed. And if you
have been a good sysadmin, you have enabled this feature. The way hypervisors
do this is to make all the processors in the host pool report themselves to
the VMs as belonging to the oldest generation present in the pool, hiding
any capabilities not supported by that generation.
If you recall our analysis of Meltdown and Spectre, you will realize that, in
order to get the hardware-assisted mitigations, one needs processors with the
PCID feature, the INVPCID instruction, and the IBRS, STIBP, and IBPB
capabilities. The first two are present in Haswell and newer processors, while
the other three come with a microcode update. This means you need to partition
your fleet into at least three groups: Haswell or newer with the microcode
update, Haswell or newer without the microcode update, and pre-Haswell.
If you do not do it like that, all the microcode updates will be for naught,
as the new capabilities will be hidden from the VMs, which will fall back to
the less efficient patches.
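A rough way to sort hosts into those three pools is to look at what each host's kernel exposes. Here is an illustrative Python sketch; note that the flag names for the microcode-provided controls vary between kernel versions and vendors, so treat it as a starting point rather than a definitive test:

    #!/usr/bin/env python3
    """Roughly classify a host into one of the three suggested pools."""

    def cpu_flags(path="/proc/cpuinfo"):
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
        return set()

    flags = cpu_flags()
    has_pcid = {"pcid", "invpcid"} <= flags
    # Flag names for the microcode-provided controls differ across kernels
    # (e.g. "ibpb", "ibrs", "stibp", or a combined "spec_ctrl"); adjust as needed.
    has_ucode = bool({"ibpb", "ibrs", "stibp", "spec_ctrl"} & flags)

    if has_pcid and has_ucode:
        print("Pool 1: Haswell or newer, microcode update applied")
    elif has_pcid:
        print("Pool 2: Haswell or newer, no microcode update yet")
    else:
        print("Pool 3: pre-Haswell")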
Note 1: Considering that Haswell was announced in 2013, it is highly unlikely
that anything older than Haswell will receive a microcode update adding IBRS,
STIBP, and IBPB.
Note 2: This discussion intentionally leaves out AMD processors, as those are
“somewhat less vulnerable” to Meltdown and Spectre and handle things in a
slightly different fashion, but the advice of a smart split of your AMD server
pool (between Zen and the various generations of Bulldozer) still stands.
And now you see why Recommendation #1 was not such a contradiction. By
expanding the pool of physical servers available for virtualization, you make
it easier on yourself to partition your pools along those lines.
Recommendation #4: Deploy your workloads in the proper pools
This one should be evident by now. Do you have a workload that calls the kernel
a lot? Deploy it on your pool of machines with Haswell (or newer) processors and
microcode updates. Have a workload that rarely invokes the kernel? Deploy it on
the pre-Haswell, no-microcode pool.
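If you are not sure how kernel-heavy a given workload is, one crude indicator is the ratio of kernel (system) time to user time it has accumulated. The sketch below reads that from /proc/<pid>/stat on Linux; the PID is an assumed command-line argument and the 0.5 threshold is an arbitrary illustration, not a tuned value:

    #!/usr/bin/env python3
    """Crude gauge of how kernel-heavy a process is: accumulated system time vs user time."""
    import sys

    def user_sys_ticks(pid: int):
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
        # The command name (field 2) may contain spaces, so split after the closing ')'.
        rest = data[data.rindex(")") + 2:].split()
        return int(rest[11]), int(rest[12])  # utime and stime, fields 14 and 15

    pid = int(sys.argv[1])  # assumed input: PID of the workload to inspect
    utime, stime = user_sys_ticks(pid)
    ratio = stime / max(utime, 1)
    print(f"user={utime} ticks, system={stime} ticks, sys/user ratio={ratio:.2f}")
    print("Kernel-heavy; favor the Haswell-or-newer, microcode-updated pool."
          if ratio > 0.5 else
          "Mostly user-space work; an older pool should be fine.")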
Recommendation #5: If you have workloads that can be moved from VMs to Containers, Just Do It!
Containers are the new kids in town. As such, many conservative Sysadmins and
Managers distrust them. But many applications are now ready to move to
containers, and are supported by their developers and commercial entities too.
As you may know, containers have even less overhead than bare-metal
hypervisors. So, in our context, containers allow us to recover even more
performance from our infrastructure.
If any of your workloads has a container-ready implementation, with adequate
support, move it now. If the only reason for not doing it was corporate
inertia, now you have the impetus needed to “make it so”.
Recommendation #6: Be on the lookout for inefficient workloads
Here is a personal anecdote: in one of my previous jobs as a sysadmin (more
like the senior manager of the sysadmins), a friend programmed a critical ETL
application in Java on Windows, to be deployed on Java on Linux. The
application worked on a Windows server, consuming 40% of the CPU. The guy went
on vacation before launch. As soon as it was moved to Linux, CPU usage jumped
to 100%. Since it had never been tested on Linux, and it was consuming 100% CPU,
I declared that a showstopper and refused to pass it into production. But the
CEO personally said that the application had to go online that day
(December 23, 2003). I questioned my friend’s teammates, and it turned out that
the application checked a directory for a file to process; if the file was not
there, it checked again immediately, and if the file was still not there, it
checked again, and if… you get the drift. I instructed his teammates to put a
timer between checks. Answer: “We do not know how to do that in Java.” So I
instructed them to break the loop and execute only once, and I put the
application in cron to execute every 5 seconds (I had to roll up my sleeves for
this, as well as develop a watchdog for the application, which that team
“conveniently” forgot). The result? Processor usage on Linux measured at 5%, and
the application ran stably for 18 months, until it was replaced.
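The anti-pattern, and the trivial fix, look roughly like this (sketched in Python rather than the original Java, purely to illustrate the pattern; the path is a placeholder):

    #!/usr/bin/env python3
    """Busy-wait polling versus polling with a pause between checks."""
    import os
    import time

    INCOMING = "/var/spool/etl/incoming.dat"  # placeholder path for the file to process

    def process(path):
        print(f"processing {path}")  # stand-in for the real ETL work
        os.remove(path)

    # Anti-pattern: spins a CPU core at 100% doing nothing useful.
    # while True:
    #     if os.path.exists(INCOMING):
    #         process(INCOMING)

    # Fix: the same loop, but with a pause between checks
    # (or run the check once and schedule it from cron or a systemd timer).
    while True:
        if os.path.exists(INCOMING):
            process(INCOMING)
        time.sleep(5)  # a five-second pause drops idle CPU usage to near zero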
There are many applications in this world programmed in an inefficient manner,
calling the kernel excessively (among other sins). Sometimes the sin is yours,
my dear admin colleague (a workload provisioned in a VM with less memory than it
needs, forcing it to hit the virtual swapfile constantly and thrashing the VM,
and the SAN to boot, for instance).
If you spot some of those among your workloads, have a word with the
developer (be it an individual or group in your organization, or a company, or
a customer) in order to reach a more efficient implementation.
Recommendation #7: If you are an ISP and are using a private cloud for your workloads, and a public one to sell, unify them
I know many ISPs in LatAm, and many of them have two (or more) clouds: one (or
more) to sell, and one (or more) for their internal workloads. And more often
than not, those clouds use the same server models and the same technology and
software. The reason for the complete separation is merely administrative and
cultural (corporate inertia).
If you unify those clouds, you achieve many benefits:
a.) You expand your pool of machines. And as we saw in Recommendation #3, having
a bigger pool means an easier time partitioning your servers in a smart way.
b.) You get access to idle capacity in one of your clouds that you may need in
the other (in one of the ISPs I know, the internal OpenStack cloud was at full
usage from day one, because it was designed that way, while the customer-facing
cloud was underutilized, because it was bought with its full capacity of blade
servers from day one, even though the ISP had to attract customers to it over
the course of a year).
c.) You send a powerful message to your customers: «the cloud I offer you is so
good that we trust it with our own workloads. We will support it as if it were
ours, because our workloads are running on it. If there is a glitch, we go down
with you. Your pain is our pain. In the words of Microsoft: “We eat our own dog
food”». That inspires more confidence in a prospective customer than: «Our cloud
is very secure, stable, and technologically advanced, but, just in case, we run
our workloads in a completely separate cloud from the one we sell to you».
Recommendation #8: If you are a normal company, use public clouds for some of your workloads
If your company is not selling a public cloud, it should probably use one. Take
a close look at your workloads, see which ones can be moved to public clouds
without major security implications or costly redesign, and move them. This
will free up resources in your datacenters for your more sensitive workloads.
Let the public cloud providers be the ones who buy more “tainted” servers for
the time being. They can get more favorable terms with server makers, terms we
can only dream about.
Recommendation #9: Upgrade the processors in your servers
If, after all this, you still need more performance, you could consider
upgrading the processors inside your existing servers. Yes, upgrading the
processor is possible on servers; it is not reserved only for desktops and
workstations.
It may seem ironic that I recommend buying new processors, knowing that those
processors still have the vulnerabilities and that many of them will not have
the features needed to reduce the performance hit. But hey! This will be MUCH
less expensive than buying new servers: new servers with the vulnerabilities
still in them that will stick around in your datacenter for 5 years (or more).
But, and this is very important, look at the TCO of doing this before you
decide. You have to take into account parts, labor, disruption, power and
cooling (newer servers tend to be more efficient), the remaining useful life of
the upgraded server versus a new one, and many other factors. Develop a solid
business case analysis before doing it.
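A very rough model of that comparison might look like the sketch below; every figure in it is a placeholder to be replaced with your own quotes and measurements:

    #!/usr/bin/env python3
    """Toy TCO comparison: upgrade the CPUs in an existing server vs. buy a new server."""

    def tco(capex, annual_power_cooling, annual_support, years):
        """Total cost over the period: one-time cost plus recurring yearly costs."""
        return capex + years * (annual_power_cooling + annual_support)

    YEARS = 3  # assumed remaining useful life of the upgraded server

    upgrade = tco(capex=2_500,               # placeholder: CPUs + labor + downtime cost
                  annual_power_cooling=900,  # placeholder: older chassis, less efficient
                  annual_support=400,
                  years=YEARS)

    new_server = tco(capex=9_000,            # placeholder: new server, patched at the factory
                     annual_power_cooling=650,
                     annual_support=300,
                     years=YEARS)

    print(f"Upgrade existing server: ${upgrade:,.0f} over {YEARS} years")
    print(f"Buy new server:          ${new_server:,.0f} over {YEARS} years")
    print("Cheaper option:", "upgrade" if upgrade < new_server else "new server")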
Many manufacturers allow you to upgrade your server processors in the field,
and they publish guidelines to do so. Check [17] for an example.
Some tips:
a.) If the equipment is still under warranty, do not do it. Wait until the
warranty is over. It is not worth the hassle.
b.) If the equipment is still supported, call the manufacturer and work with
them. They will be happy to help. And this way you ensure that you do not lose
support for the equipment due to unauthorized work.
c.) If the equipment is end-of-sale, end-of-life, or end-of-support, still
contact the manufacturer. If asked kindly, the support personnel can at least
orient you as to which parts work and which do not, and will point you to the
right documents and firmware updates.
d.) Update every single firmware in the machine before upgrading the processor.
e.) If the machine has a vacant socket, populate it, and redistribute/complete
the DIMMs accordingly.
f.) If you administer a small-scale datacenter, upgrade on a case-by-case basis.
g.) If you administer a mid-scale datacenter, upgrade in batches. Upgrade all
the spare parts and the development and testing servers first. Then move the
upgraded parts to production, upgrade the parts you just removed, and lather,
rinse, repeat.
h.) If you administer a large-scale datacenter, you know better than anyone how
to handle things like this, and do not need my advice. But I need yours. I would
use the method in point g.), but also upgrade whenever I need to repair damaged
boards, as part of the repair process. Let me know what you think about that.
i.) If at all possible, upgrade to a next-generation processor. Many
manufacturers have boards that support processors from two generations. Even if
you do not make the cut to a Haswell processor with firmware updates, newer
processors are at least faster per clock cycle and more energy efficient, which
reduces power and cooling requirements.
j.) Involve the manufacturer of the server. This repetition was on purpose.
Recommendation #10: Consider AMD servers
Let’s face it, it is impossible to go four or more years without buying
more servers. But when buying servers, many organizations tend to compare only
Intel-based servers from different manufacturers. And with the arrival of
Zen-based processors from AMD, that is a mistake, as AMD has become competitive
again.
The new processors based on AMD’s Zen architecture are immune to Meltdown and
somewhat less susceptible to Spectre [3]. AMD processors and servers also tend
to be less expensive, and they come with many goodies, like a higher maximum
number of cores per socket, more PCIe lanes, and more memory bandwidth. Granted,
they also have some drawbacks (mainly in the way the caches are shared among the
cores, and in raw IPC).
Do not get me wrong, Intel-based servers are great, and Meltdown/Spectre is an
industry-wide problem. But from a security standpoint, and in general, it may
behoove you to include AMD alongside Intel in any server purchase process
you initiate in the future. Simply stated, tell server vendors that if they
do not have both Intel and AMD in their lineup, they are not invited to your
future RFPs.
As you can see, recommendations 1 through 8 aim to recoup some of the
performance hidden in your current infrastructure and to give you a more
favorable ratio of VMs per physical host, as well as to deploy the workloads in
the most suitable server pools. The last two recommendations (9 and 10) aim at
minimizing the impact that additional horsepower will have on your budget and
security until 2022. I really hope this set of tips and tricks has been useful
to you; please let me know whether that is the case. Questions or comments are
welcome.
More Resources:
[1] “The Intel 80x86 Processor Architecture: Pitfalls for Secure Systems,” IEEE Symposium on Security and Privacy, May 1995.
[2] https://www.wired.com/story/meltdown-spectre-bug-collision-intel-chip-flaw-discovery/
[3] https://www.amd.com/en/corporate/speculative-execution
[4] https://www.ibm.com/blogs/psirt/potential-impact-processors-power-family/
[5] https://developer.arm.com/support/security-update
[6] https://www.theregister.co.uk/2018/01/16/oracle_quarterly_patches_jan_2018/
[7] https://www.microsoft.com/en-us/wdsi/threats/malware-encyclopedia-description?Name=Worm:Win32/Conficker.B
[8] https://www.symantec.com/connect/blogs/linux-worm-targeting-hidden-devices
[9] http://blog.talosintelligence.com/2017/09/avast-distributes-malware.html
[10] https://www.welivesecurity.com/2017/07/04/analysis-of-telebots-cunning-backdoor/
[11] https://arstechnica.com/gadgets/2018/01/heres-how-and-why-the-spectre-and-meltdown-patches-will-hurt-performance/
[12] https://support.google.com/faqs/answer/7625886
[13] https://lkml.org/lkml/2018/1/21/192
[14] https://www.vmware.com/pdf/hypervisor_performance.pdf
[15] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual machines and Linux containers,” in Proc. 2015 IEEE Int. Symp. on Performance Analysis of Systems and Software (ISPASS 2015), Philadelphia, PA, USA, 29-31 Mar. 2015, pp. 171-172.
[16] K.-T. Seo, H.-S. Hwang, I.-Y. Moon, O.-Y. Kwon, and B.-J. Kim, “Performance comparison analysis of Linux container and virtual machine for building cloud,” Adv. Sci. Technol. Lett., vol. 66, pp. 105-111, Dec. 2014.
[17] https://support.hpe.com/hpsc/doc/public/display?docId=c03911173