- The original patch was written by jeroen@linuxforge.net
- Benchmarks by graysky
- Three different machines were benchmarked with a make-based endpoint, each running a generic x86-64 kernel and an otherwise identical kernel built with the optimized gcc options.
- Conclusion:
- Kernels built with this patch show small but statistically significant speed increases on the make-based benchmark.
- Details:
- 1) Three test machines: Intel Xeon X3360, Intel i7-2620M, Intel Core i7-3660K.
- 2) All ran the make benchmark (linked below) 35 times while booted into a 'generic' kernel. Then all ran the same make benchmark 35 times after booting into an optimized kernel. Below are the optimizations chosen for each machine.
- 2a) X3360 = core2
- 2b) i7-2620M = corei7-avx
- 2c) i7-3660K = core-avx-i
- 3) Analyzed resulting distributions for statistical significance via ANOVA plots that clearly show statistically significant albeit small differences.
- Links to ANOVA plots:
- http://s19.postimage.org/68urcofzn/corei7_avx.png
- http://s19.postimage.org/ozwomuak3/core_avx_i.png
- http://s19.postimage.org/d0l6fj4z7/core2.png
- Discussion:
- 1) All the assumptions for ANOVA are met:
- *Data are normally distributed, as shown in the normal quantile plots.
- *The population variances are fairly equal (Levene and Bartlett tests).
- 2) The ANOVA plots clearly show significance.
- *Pair-wise analysis by Tukey-Kramer shows significance at the 0.05 level for all CPUs compared.
- Below are the differences in median values:
- core2 +87.5 ms
- corei7-avx +79.7 ms
- core-avx-i +257.2 ms
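The comparison described above can be sketched in a few lines of Python. This is only an illustration of a one-way ANOVA F statistic and a median difference, not the analysis actually used for the plots. The function name `one_way_anova_f` and the sample timings are hypothetical and are not taken from the benchmark log. The Levene, Bartlett and Tukey-Kramer steps are not reproduced here.

```python
from statistics import mean, median


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation explained by which kernel was booted.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Run-to-run variation inside each kernel's set of timings.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical make timings in milliseconds (assumption, NOT real results).
generic = [1000.0, 1002.0, 998.0, 1001.0, 999.0]
optimized = [990.0, 992.0, 988.0, 991.0, 989.0]

f_stat, df_b, df_w = one_way_anova_f([generic, optimized])
median_gain_ms = median(generic) - median(optimized)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}, median difference = {median_gain_ms:.1f} ms")
```

A large F relative to the critical value for the given degrees of freedom indicates that the two kernels differ by more than run-to-run noise. With only two groups the test is equivalent to a two-sample t-test.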
- References:
- Bash script that controls the benchmark: https://github.com/graysky2/bin/blob/master/bench
- Log file generated by script: http://repo-ck.com/bench/compile_time_optimization.txt.gz
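As a quick reference, the mapping from the new Kconfig symbols to GCC `-march` values can be written as a lookup table. The values are copied from the arch/x86/Makefile hunk of this patch. The names `MARCH_BY_CONFIG` and `cflags_for` are hypothetical, and the sketch ignores the `cc-option` probing that the real Makefile performs to check compiler support.

```python
# Kconfig symbol -> GCC -march value, as set in the arch/x86/Makefile hunk.
MARCH_BY_CONFIG = {
    "CONFIG_MK8": "k8",
    "CONFIG_MK10": "amdfam10",
    "CONFIG_MBARCELONA": "barcelona",
    "CONFIG_MBOBCAT": "btver1",
    "CONFIG_MBULLDOZER": "bdver1",
    "CONFIG_MPILEDRIVER": "bdver2",
    "CONFIG_MPSC": "nocona",
    "CONFIG_MCORE2": "core2",
    "CONFIG_MCOREI7": "corei7",
    "CONFIG_MCOREI7AVX": "corei7-avx",
    "CONFIG_MCOREAVXI": "core-avx-i",
    "CONFIG_MCOREAVX2": "core-avx2",
    "CONFIG_MATOM": "atom",
}


def cflags_for(config_symbol):
    """Return the -march flag for a symbol, else the generic tuning flag.

    Simplification (assumption): unknown symbols are treated like
    CONFIG_GENERIC_CPU, which the Makefile builds with -mtune=generic.
    """
    march = MARCH_BY_CONFIG.get(config_symbol)
    if march is None:
        return "-mtune=generic"
    return f"-march={march}"


# The three benchmarked machines and the options listed in 2a-2c.
for machine, symbol in [
    ("X3360", "CONFIG_MCORE2"),
    ("i7-2620M", "CONFIG_MCOREI7AVX"),
    ("i7-3660K", "CONFIG_MCOREAVXI"),
]:
    print(machine, cflags_for(symbol))
```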
- ---
- --- linux-3.8/arch/x86/include/asm/module.h 2012-12-10 22:30:57.000000000 -0500
- +++ linux-3.8.mod/arch/x86/include/asm/module.h 2013-01-02 08:24:39.225359956 -0500
- @@ -17,6 +17,14 @@
- #define MODULE_PROC_FAMILY "586MMX "
- #elif defined CONFIG_MCORE2
- #define MODULE_PROC_FAMILY "CORE2 "
- +#elif defined CONFIG_MCOREI7
- +#define MODULE_PROC_FAMILY "COREI7 "
- +#elif defined CONFIG_MCOREI7AVX
- +#define MODULE_PROC_FAMILY "COREI7AVX "
- +#elif defined CONFIG_MCOREAVXI
- +#define MODULE_PROC_FAMILY "COREAVXI "
- +#elif defined CONFIG_MCOREAVX2
- +#define MODULE_PROC_FAMILY "COREAVX2 "
- #elif defined CONFIG_MATOM
- #define MODULE_PROC_FAMILY "ATOM "
- #elif defined CONFIG_M686
- @@ -35,6 +43,16 @@
- #define MODULE_PROC_FAMILY "K7 "
- #elif defined CONFIG_MK8
- #define MODULE_PROC_FAMILY "K8 "
- +#elif defined CONFIG_MK10
- +#define MODULE_PROC_FAMILY "K10 "
- +#elif defined CONFIG_MBARCELONA
- +#define MODULE_PROC_FAMILY "BARCELONA "
- +#elif defined CONFIG_MBOBCAT
- +#define MODULE_PROC_FAMILY "BOBCAT "
- +#elif defined CONFIG_MBULLDOZER
- +#define MODULE_PROC_FAMILY "BULLDOZER "
- +#elif defined CONFIG_MPILEDRIVER
- +#define MODULE_PROC_FAMILY "PILEDRIVER "
- #elif defined CONFIG_MELAN
- #define MODULE_PROC_FAMILY "ELAN "
- #elif defined CONFIG_MCRUSOE
- --- linux-3.8/arch/x86/Kconfig.cpu 2013-02-20 11:29:58.264481742 -0500
- +++ linux-3.8.mod/arch/x86/Kconfig.cpu 2013-02-20 11:32:57.613339867 -0500
- @@ -139,7 +139,7 @@
- config MK6
- - bool "K6/K6-II/K6-III"
- + bool "AMD K6/K6-II/K6-III"
- depends on X86_32
- ---help---
- Select this for an AMD K6-family processor. Enables use of
- @@ -147,7 +147,7 @@
- flags to GCC.
- config MK7
- - bool "Athlon/Duron/K7"
- + bool "AMD Athlon/Duron/K7"
- depends on X86_32
- ---help---
- Select this for an AMD Athlon K7-family processor. Enables use of
- @@ -155,12 +155,48 @@
- flags to GCC.
- config MK8
- - bool "Opteron/Athlon64/Hammer/K8"
- + bool "AMD Opteron/Athlon64/Hammer/K8"
- ---help---
- Select this for an AMD Opteron or Athlon64 Hammer-family processor.
- Enables use of some extended instructions, and passes appropriate
- optimization flags to GCC.
- +config MK10
- + bool "AMD 61xx/7x50/PhenomX3/X4/II/K10"
- + ---help---
- + Select this for an AMD 61xx Eight-Core Magny-Cours, Athlon X2 7x50,
- + Phenom X3/X4/II, Athlon II X2/X3/X4, or Turion II-family processor.
- + Enables use of some extended instructions, and passes appropriate
- + optimization flags to GCC.
- +
- +config MBARCELONA
- + bool "AMD Barcelona"
- + ---help---
- + Select this for AMD Barcelona and newer processors.
- +
- + Enables -march=barcelona
- +
- +config MBOBCAT
- + bool "AMD Bobcat"
- + ---help---
- + Select this for AMD Bobcat processors.
- +
- + Enables -march=btver1
- +
- +config MBULLDOZER
- + bool "AMD Bulldozer"
- + ---help---
- + Select this for AMD Bulldozer processors.
- +
- + Enables -march=bdver1
- +
- +config MPILEDRIVER
- + bool "AMD Piledriver"
- + ---help---
- + Select this for AMD Piledriver processors.
- +
- + Enables -march=bdver2
- +
- config MCRUSOE
- bool "Crusoe"
- depends on X86_32
- @@ -252,7 +288,7 @@
- in /proc/cpuinfo. Family 15 is an older Xeon, Family 6 a newer one.
- config MCORE2
- - bool "Core 2/newer Xeon"
- + bool "Intel Core 2"
- ---help---
- Select this for Intel Core 2 and newer Core 2 Xeons (Xeon 51xx and
- @@ -260,6 +296,41 @@
- family in /proc/cpuinfo. Newer ones have 6 and older ones 15
- (not a typo)
- + Enables -march=core2
- +
- +config MCOREI7
- + bool "Intel Core i7"
- + ---help---
- +
- + Select this for the Intel Nehalem platform. Intel Nehalem processors
- + include Core i3, i5, i7, Xeon: 34xx, 35xx, 55xx, 56xx, 75xx processors.
- +
- + Enables -march=corei7
- +
- +config MCOREI7AVX
- + bool "Intel Core 2nd Gen AVX"
- + ---help---
- +
- + Select this for 2nd Gen Core processors including Sandy Bridge.
- +
- + Enables -march=corei7-avx
- +
- +config MCOREAVXI
- + bool "Intel Core 3rd Gen AVX"
- + ---help---
- +
- + Select this for 3rd Gen Core processors including Ivy Bridge.
- +
- + Enables -march=core-avx-i
- +
- +config MCOREAVX2
- + bool "Intel Core AVX-2"
- + ---help---
- +
- + Select this for AVX-2 enabled processors including Haswell.
- +
- + Enables -march=core-avx2
- +
- config MATOM
- bool "Intel Atom"
- ---help---
- @@ -300,7 +371,7 @@
- config X86_L1_CACHE_SHIFT
- int
- default "7" if MPENTIUM4 || MPSC
- - default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
- + default "6" if MK7 || MK8 || MK10 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MPENTIUMM || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
- default "4" if MELAN || M486 || MGEODEGX1
- default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
- @@ -331,11 +402,11 @@
- config X86_INTEL_USERCOPY
- def_bool y
- - depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2
- + depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MK10 || MBARCELONA || MEFFICEON || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2
- config X86_USE_PPRO_CHECKSUM
- def_bool y
- - depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
- + depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MK10 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MATOM
- config X86_USE_3DNOW
- def_bool y
- @@ -363,17 +434,17 @@
- config X86_TSC
- def_bool y
- - depends on ((MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MATOM) && !X86_NUMAQ) || X86_64
- + depends on ((MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MK10 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MATOM) && !X86_NUMAQ) || X86_64
- config X86_CMPXCHG64
- def_bool y
- - depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM
- + depends on X86_PAE || X86_64 || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM
- # this should be set for all -march=.. options where the compiler
- # generates cmov.
- config X86_CMOV
- def_bool y
- - depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
- + depends on (MK8 || MK10 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MK7 || MCORE2 || MCOREI7 || MCOREI7AVX || MCOREAVXI || MCOREAVX2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
- config X86_MINIMUM_CPU_FAMILY
- int
- --- linux-3.8/arch/x86/Makefile 2012-12-10 22:30:57.000000000 -0500
- +++ linux-3.8.mod/arch/x86/Makefile 2013-01-02 08:39:20.199840158 -0500
- @@ -58,10 +58,23 @@
- # FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
- cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
- + cflags-$(CONFIG_MK10) += $(call cc-option,-march=amdfam10)
- + cflags-$(CONFIG_MBARCELONA) += $(call cc-option,-march=barcelona)
- + cflags-$(CONFIG_MBOBCAT) += $(call cc-option,-march=btver1)
- + cflags-$(CONFIG_MBULLDOZER) += $(call cc-option,-march=bdver1)
- + cflags-$(CONFIG_MPILEDRIVER) += $(call cc-option,-march=bdver2)
- cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona)
- cflags-$(CONFIG_MCORE2) += \
- - $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
- + $(call cc-option,-march=core2,$(call cc-option,-mtune=core2))
- + cflags-$(CONFIG_MCOREI7) += \
- + $(call cc-option,-march=corei7,$(call cc-option,-mtune=corei7))
- + cflags-$(CONFIG_MCOREI7AVX) += \
- + $(call cc-option,-march=corei7-avx,$(call cc-option,-mtune=corei7-avx))
- + cflags-$(CONFIG_MCOREAVXI) += \
- + $(call cc-option,-march=core-avx-i,$(call cc-option,-mtune=core-avx-i))
- + cflags-$(CONFIG_MCOREAVX2) += \
- + $(call cc-option,-march=core-avx2,$(call cc-option,-mtune=core-avx2))
- cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom) \
- $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
- cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)
- --- linux-3.8/arch/x86/Makefile_32.cpu 2012-12-10 22:30:57.000000000 -0500
- +++ linux-3.8.mod/arch/x86/Makefile_32.cpu 2013-01-02 08:41:23.554252806 -0500
- @@ -25,6 +25,11 @@
- # They make zero difference whatsosever to performance at this time.
- cflags-$(CONFIG_MK7) += -march=athlon
- cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8,-march=athlon)
- +cflags-$(CONFIG_MK10) += $(call cc-option,-march=amdfam10,-march=athlon)
- +cflags-$(CONFIG_MBARCELONA) += $(call cc-option,-march=barcelona,-march=athlon)
- +cflags-$(CONFIG_MBOBCAT) += $(call cc-option,-march=btver1,-march=athlon)
- +cflags-$(CONFIG_MBULLDOZER) += $(call cc-option,-march=bdver1,-march=athlon)
- +cflags-$(CONFIG_MPILEDRIVER) += $(call cc-option,-march=bdver2,-march=athlon)
- cflags-$(CONFIG_MCRUSOE) += -march=i686 $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0
- cflags-$(CONFIG_MEFFICEON) += -march=i686 $(call tune,pentium3) $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0
- cflags-$(CONFIG_MWINCHIPC6) += $(call cc-option,-march=winchip-c6,-march=i586)
- @@ -33,6 +38,10 @@
- cflags-$(CONFIG_MVIAC3_2) += $(call cc-option,-march=c3-2,-march=i686)
- cflags-$(CONFIG_MVIAC7) += -march=i686
- cflags-$(CONFIG_MCORE2) += -march=i686 $(call tune,core2)
- +cflags-$(CONFIG_MCOREI7) += -march=i686 $(call tune,corei7)
- +cflags-$(CONFIG_MCOREI7AVX) += -march=i686 $(call tune,corei7-avx)
- +cflags-$(CONFIG_MCOREAVXI) += -march=i686 $(call tune,core-avx-i)
- +cflags-$(CONFIG_MCOREAVX2) += -march=i686 $(call tune,core-avx2)
- cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom,$(call cc-option,-march=core2,-march=i686)) \
- $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))