# Some Swift benchmarks

There has been some talk of Apple’s new Swift language being slow without the use of safety-compromising compiler optimizations (here and here, for example), so I decided to do some experiments myself. I’ll update this post as I perform more benchmarks.

I wrote an equivalent quicksort implementation in Swift and C. The implementation is nothing fancy, and exactly follows that on quicksort’s Wikipedia page. The test consists of sorting a reversed array of 10,000 consecutive integers.

#### Results

$xcrun swift -i -O3 -emit-executable quicksort.swift$ time ./quicksort

real	0m14.236s
user	0m14.218s
sys	0m0.012s
$xcrun swift -i -Ofast -emit-executable quicksort.swift$ time ./quicksort

real	0m0.076s
user	0m0.069s
sys	0m0.004s
$gcc -std=c99 -O3 -o quicksort quicksort.c$ time ./quicksort

real	0m0.042s
user	0m0.039s
sys	0m0.002s

Language Time (s) Ratio
Swift (-O3) 14.236 339.
Swift (-Ofast) 0.076 1.81
C (-O3) 0.042 1.00

So it looks like there is substance to these claims. Consistently using -Ofast doesn’t seem like a viable alternative either, since it removes many of the safety features and therefore undermines the motivation for using the language in the first place. Consider the following, for example:

Output with -O3:

fatal error: Array index out of range


Output with -Ofast:

0


The -Ofast-compiled program is completely silent about the fact that the array was indexed out of bounds.

We can see where the speed impediment comes from by looking at the generated assembly of a Swift program, and comparing it to that of an equivalent C program (using the -S switch). Consider the following simple program:

#### C

The generated assembly (using -O3):

#### Swift

	.section	__TEXT,__text,regular,pure_instructions
.globl	__TF6simple9incrementFSiSi
.align	4, 0x90
__TF6simple9incrementFSiSi:
pushq	%rbp
movq	%rsp, %rbp
incq	%rdi
jo	LBB0_2
movq	%rdi, %rax
popq	%rbp
retq
LBB0_2:
ud2

.globl	_main
.align	4, 0x90
_main:
.cfi_startproc
pushq	%rbp
Ltmp3:
.cfi_def_cfa_offset 16
Ltmp4:
.cfi_offset %rbp, -16
movq	%rsp, %rbp
Ltmp5:
.cfi_def_cfa_register %rbp
pushq	%r14
pushq	%rbx
Ltmp6:
.cfi_offset %rbx, -32
Ltmp7:
.cfi_offset %r14, -24
movq	%rsi, %r14
movl	%edi, %ebx
callq	__TFSsa6C_ARGCVSs5Int32
movl	%ebx, (%rax)
callq	__TFSsa6C_ARGVGVSs13UnsafePointerVSs7CString_
movq	%r14, (%rax)
xorl	%eax, %eax
popq	%rbx
popq	%r14
popq	%rbp
retq
.cfi_endproc

L_OBJC_IMAGE_INFO:
.long	0
.long	0

.subsections_via_symbols


#### C

	.section	__TEXT,__text,regular,pure_instructions
.globl	_main
.align	4, 0x90
_main:                                  ## @main
.cfi_startproc
## BB#0:
pushq	%rbp
Ltmp2:
.cfi_def_cfa_offset 16
Ltmp3:
.cfi_offset %rbp, -16
movq	%rsp, %rbp
Ltmp4:
.cfi_def_cfa_register %rbp
xorl	%eax, %eax
popq	%rbp
retq
.cfi_endproc

.subsections_via_symbols


No wonder the Swift variant is slower; that’s a lot of code to do something so simple. Personally, I have no problem with Swift being noticeably slower than C – in fact I would be surprised if it weren’t – I do, however, have a problem with it being orders of magnitude slower. In fairness, Swift is still in its early stages, and I’m confident that this issue will be resolved in (hopefully the near) future.