Accelerated receive flow steering, 523
Accelerators in USE method, 49
accept system calls, 95
Access timestamps, 371
ACK detection in TCP, 512
Actions in bpftrace, 769
Active listening in three-way handshakes, 511
Active pages in page caches, 318
Ad hoc checklist method, 43–44
Adaptive mutex locks, 198
Adaptive Replacement Cache (ARC), 381
Address space, 304
guests, 603
kernel, 90
processes, 95, 99–102, 319–322
Address space layout randomization (ASLR), 723
Advanced Format for magnetic rotational disks, 437
AF_NETLINK address family, 145–146
Agents
product monitoring, 79
AKS (Azure Kubernetes Service), 586
Alerts, 8
Algorithms
caching, 36
congestion control, 115, 118, 513–514
Allocation groups in XFS, 380
Allocators
memory, 309
multithreaded applications, 353
process virtual address space, 320–321
Amazon EKS (Elastic Kubernetes Service), 586
Amdahl’s Law of Scalability, 64–65
Analysis
benchmarking, 644–646, 665–666
latency, 56–57, 384–386, 454–455
Analysis step in scientific method, 44–45
Analysis strategy in case study, 784
annotate subcommand for perf, 673
Anonymous memory, 304
Anti-methods
blame-someone-else, 43
streetlight, 42
Apdex (application performance index), 174
Application calls, tuning, 415–416
Application instrumentation in off-CPU analysis, 189
Application internals, 213
Application layer, file system latency in, 384
Application performance index (Apdex), 174
Applications, 171
bpftrace for, 765
common case optimization, 174
internals, 213
latency documentation, 385
methodology. See Applications methodology
missing symbols, 214
observability, 174
observability tools. See Applications observability tools
performance techniques. See Applications performance techniques
programming languages. See Applications programming languages
Applications methodology
distributed tracing, 199
lock analysis, 198
static performance tuning, 198–199
syscall analysis, 192
thread state analysis, 193–197
USE method, 193
Applications observability tools
Applications performance techniques
buffers, 177
caching, 176
concurrency and parallelism, 177–181
I/O size selection, 176
non-blocking I/O, 181
Performance Mantras, 182
polling, 177
Applications programming languages, 182–183
virtual machines, 185
Appropriateness level in methodologies, 28–29
ARC (Adaptive Replacement Cache), 381
Architecture
CPUs. See CPUs architecture
disks. See Disks architecture
file systems. See File systems architecture
memory. See Memory architecture
networks. See Networks architecture
archive subcommand for perf, 673
arcstat.pl tool, 410
arg variables for bpftrace, 778
Arguments
kprobes, 152
networks, 507
uprobes, 154
Arithmetic mean, 74
Arrival process in queueing systems, 67
ASG (auto scaling group)
capacity planning, 72
ASLR (address space layout randomization), 723
Associativity in caches, 234
Asynchronous disk I/O, 434–435
Asynchronous interrupts, 96–97
Asynchronous writes, 366
atop tool, 285
Auto scaling group (ASG)
capacity planning, 72
available_filter_functions file, 710
Available swap, 309
available_tracers file, 710
avg function, 780
await metric, 461
Axes
scalability tests, 62
Azure Kubernetes Service (AKS), 586
Back-ends in instruction pipeline, 224
Background color in flame graphs, 291
Backlogs in network connections, 507, 519–520, 556–557, 569
Bad paging, 305
Balloon drivers, 597
Bandwidth
disks, 424
interconnects, 237
Bare-metal hypervisors, 587
Baseline statistics, 59
BATCH scheduling policy, 243
BBR (Bottleneck Bandwidth and RTT) algorithm, 118, 513
bcache technology, 117
BCC (BPF Compiler Collection), 12
vs. bpftrace, 760
disks, 450
installing, 754
multi-purpose tools, 757
multi-tool example, 759
networks, 526
slow disks case study, 17
system-wide tracing, 136
bcc-tools tool package, 132
BEGIN probes in bpftrace, 774
bench subcommand for perf, 673
Benchmarketing, 642
capacity planning, 70
CPUs, 254
exercises, 668
memory, 328
micro-benchmarking. See Micro-benchmarking
replay, 654
specials, 650
SysBench system, 294
Benchmarking methodology
custom benchmarks, 662
overview, 656
USE method, 661
workload characterization, 662
Berkeley Packet Filter (BPF), 751–752
BCC compiler. See BCC (BPF Compiler Collection)
bpftrace. See bpftrace tool
extended. See Extended BPF
iterator, 562
JIT compiler, 117
kernels, 92
OS virtualization tracing, 620, 624–625, 629
program, 90
Berkeley Software Distribution (BSD), 113
BFQ (Budget Fair Queueing) I/O schedulers, 119, 449
Big kernel lock (BKL) performance bottleneck, 116
Billing in cloud computing, 584
Bimodal performance, 76
Binary executable files, 183
Binary translations in hardware virtualization, 588, 590
Binding
NUMA, 353
bioerr tool, 487
biolatency tool
biopattern tool, 487
BIOS, tuning, 299
biosnoop tool
BCC, 755
event tracing, 58
hardware virtualization, 604–605
queued time, 472
system-wide tracing, 136
biotop tool
BCC, 755
Bit width in CPUs, 229
bitesize tool
BCC, 755
perf-tools, 743
blame command, 120
Blame-someone-else anti-method, 43
Blanco, Brenden, 753
Blind faith benchmarking, 645
blk tracer, 708
blkreplay tool, 493
blktrace tool
action filtering, 478
action identifiers, 477
description, 116
RWBS description, 477
visualizations, 479
Block-based file systems, 375–376
Block caches in disk I/O, 430
Block device interface, 109–110, 447
Block I/O state in delay accounting, 145
Block I/O times for disks, 427–428, 472
Block interleaving, 378
Block size
defined, 360
FFS, 378
Block stores in cloud computing, 584
Blue-green cloud computing deployments, 3–4
Bonnie and Bonnie++ benchmarking tools
Boolean expressions in bpftrace, 775–776
Boot options, security, 298–299
Boot-time tracing, 119
Borkmann, Daniel, 121
Borrowed virtual time (BVT) schedulers, 595
Bottleneck Bandwidth and RTT (BBR) algorithm, 118, 513
Bottlenecks
complexity, 6
defined, 22
USE method, 47–50, 245, 324, 450–451
BPF. See Berkeley Packet Filter (BPF)
application internals, 213
block I/O events, 625, 658–659
description, 282
disk I/O errors, 483
event sources, 558
file system internals, 408
hardware virtualization, 602
installing, 762
malloc() bytes flame graph, 346
one-liners for CPUs, 283, 803–804
one-liners for disks, 479–480, 806–807
one-liners for file systems, 402–403, 805–806
one-liners for memory, 343–344, 804–805
one-liners for networks, 550–552, 807–808
package contents, 132
packet inspection, 526
page fault flame graphs, 346
programming. See bpftrace tool programming
references, 782
stacks viewing, 450
system-wide tracing, 136
tracepoints, 149
user allocation stacks, 345
bpftrace tool programming
actions, 769
comments, 767
documentation, 781
example, 766
filters, 769
Hello, World! program, 770
probe arguments, 775
probe format, 768
program structure, 767
BQL (Byte Queue Limits)
driver queues, 524
tuning, 571
Branch prediction in instruction pipeline, 224
Breakpoints in perf, 680
brk system calls, 95
brkstack tool, 348
Broadcast network messages, 503
BSD (Berkeley Software Distribution), 113
btrfs file system, 381–382, 399
btrfsdist tool, 755
btrfsslower tool, 755
btt tool, 478
Buckets
hash tables, 180
Buddy allocators, 317
Budget Fair Queueing (BFQ) I/O schedulers, 119, 449
buf function, 778
Bufferbloat, 507
Buffers
applications, 177
networks, 507
ring, 522
bufgrow tool, 409
Bug database systems
applications, 172
buildid-cache subcommand for perf, 673
Built-in bpftrace variables, 770, 777–778
Bursting in cloud computing, 584, 614–615
BVT (borrowed virtual time) schedulers, 595
Bypass, kernel, 94
Byte Queue Limits (BQL)
driver queues, 524
tuning, 571
Bytecode, 185
C, C++
compiled languages, 183
symbols, 214
stacks, 215
C-states in CPUs, 231
c2c subcommand for perf, 673, 702
Cache Allocation Technology (CAT), 118, 596
Cache miss rate, 36
Cache warmth, 222
cachegrind tool, 135
Caches and caching
applications, 176
associativity, 234
cache line size, 234
CPUs, hardware virtualization, 596
CPUs, OS virtualization, 615–616
CPUs, vs. GPUs, 240
defined, 23
dentry, 375
disks, I/O, 430
disks, on-disk, 437
disks, tuning, 456
file systems, flushing, 414
file systems, OS virtualization, 613
file systems, overview, 361–363
file systems, tuning, 389, 414–416
file systems, usage, 309
inode, 375
micro-benchmarking test, 390
perf events, 680
RAID, 445
tuning, 60
write-back, 365
cachestat tool
memory, 348
perf-tools, 743
slow disks case study, 17
Canary testing, 3
Capacity-based utilization, 34
Capacity of file systems, 371
Capacity planning
benchmarking for, 642
defined, 4
micro-benchmarking, 70
overview, 69
resource analysis, 38
CAPI (Coherent Accelerator Processor Interface), 236
Carrier sense multiple access with collision detection (CSMA/CD) algorithm, 516
CAS (column address strobe) latency, 311
Cascading failures, 5
Case studies
analysis strategy, 784
conclusion, 792
references, 793
Casual benchmarking, 645
CAT (Cache Allocation Technology), 118, 596
cat function, 779
CAT (Intel Cache Allocation Technology), 118, 596
CFQ (completely fair queueing), 115, 449
CFS (completely fair scheduler), 116–117
CPU scheduling, 241
description, 243
cgroup file, 141
cgroup variable, 778
cgroupid function, 779
cgroups
block I/O, 494
Linux kernel, 116
OS virtualization, 606, 608–611, 613–620, 630
statistics, 139, 141, 620–622, 627–628
cgtop tool, 621
Characterizing memory usage, 325–326
Cheating in benchmarking, 650–651
Checklists
ad hoc checklist method, 43–44
benchmarking, 666
disks, 453
file systems, 387
Linux 60-second analysis, 15
memory, 325
Chip-level multiprocessing (CMP), 220
chrt command, 295
Circular buffers for applications, 177
CISCs (complex instruction set computers), 224
clang compiler, 122
Classes, scheduling
I/O, 493
priority, 295
Clean memory, 306
clear function in bpftrace, 780
clear subcommand in trace-cmd, 735
clock routine, 99
Clocks
CPUs vs. GPUs, 240
operating systems, 99
Cloud APIs, 580
vs. enterprise, 62
hardware virtualization. See Hardware virtualization
instance types, 581
lightweight virtualization, 630–633
orchestration, 586
OS virtualization. See OS virtualization
overview, 14
PMCs, 158
proof-of-concept testing, 3
scalable architecture, 581–582
types, 634
Cloud-native databases, 582
Clue-based approach in thread state analysis, 196
Clusters in cloud computing, 586
CMP (chip-level multiprocessing), 220
CNI (container network interface) software, 586
Co-routines in applications, 178
Coarse view in profiling, 35
Code changes in cloud computing, 583
Coefficient of variation (CoV), 76
Coherence
models, 63
Coherent Accelerator Processor Interface (CAPI), 236
Cold caches, 36
collectd agent, 138
Collisions
hash, 180
networks, 516
Colors in flame graphs, 291
Column address strobe (CAS) latency, 311
comm variable in bpftrace, 778
Comma-separated values (CSV) format for sar, 165
Comments in bpftrace, 767
Common case optimization in applications, 174
Communication in multiprocess vs. multithreading, 228
Community applications, 172–173
Comparing benchmarks, 648
Competition, benchmarking, 649
Compiled programming languages
overview, 183
Compilers
CPU optimization, 229
options, 295
Completely fair queueing (CFQ), 115, 449
Completely fair scheduler (CFS), 116–117
CPU scheduling, 241
description, 243
Completion target in workload analysis, 39
Complex benchmark tools, 646
Complex instruction set computers (CISCs), 224
Complexity, 5
Comprehension in flame graphs, 249
Compression
btrfs, 382
disks, 369
ZFS, 381
Compute kernel, 240
Compute Unified Device Architecture (CUDA), 240
Concurrency
CONFIG_TASK_DELAY_ACCT option, 145
Configuration
applications, 172
network options, 574
Congestion avoidance and control
Linux kernel, 115
networks, 508
tuning, 570
connect system calls, 95
Connections for networks, 509
backlogs, 507, 519–520, 556–557, 569
firewalls, 517
latency, 7, 24–25, 505–506, 528
life span, 507
local, 509
monitoring, 529
NICs, 109
QUIC, 515
UDP, 514
Container network interface (CNI) software, 586
Containers
lightweight virtualization, 631–632
orchestration, 586
resource controls, 52, 70, 613–617, 626
Contention
locks, 198
models, 63
Context switches
defined, 90
kernels, 93
Contributors to system performance technologies, 811–814
Control groups (cgroups). See cgroups
Control paths in hardware virtualization, 594
Control units in CPUs, 230
Controllers
caches, 430
disk, 426
mechanical disks, 439
micro-benchmarking, 457
Controls, resource. See Resource controls
Copy-on-write (COW) file systems, 376
btrfs, 382
ZFS, 380
Copy-on-write (COW) process strategy, 100
CoreLink Interconnects, 236
Cores
CPUs vs. GPUs, 240
defined, 220
Corrupted file system data, 365
count function in bpftrace, 780
CoV (coefficient of variation), 76
COW (copy-on-write) file systems, 376
btrfs, 382
ZFS, 380
COW (copy-on-write) process strategy, 100
CPCs (CPU performance counters), 156
CPI (cycles per instruction), 225
CPU affinity, 222
CPU-bound applications, 106
cpu control group, 610
CPU mode for applications, 172
CPU performance counters (CPCs), 156
CPU registers, perf-tools for, 746–747
cpu variable in bpftrace, 777
cpuacct control group, 610
cpudist tool
BCC, 755
cpufreq tool, 285
cpuinfo tool, 142
architecture. See CPUs architecture
clock rate, 223
compiler optimization, 229
cross calls, 110
feedback-directed optimization, 122
flame graphs. See Flame graphs
garbage collection, 185
hardware virtualization, 589–592, 596–597
I/O wait, 434
instructions, defined, 220
instructions, IPC, 225
instructions, pipeline, 224
instructions, size, 224
instructions, steps, 223
instructions, width, 224
memory tradeoffs with, 27
methodology. See CPUs methodology
multiprocess and multithreading, 227–229
observability tools. See CPUs observability tools
OS virtualization, 611, 614, 627, 630
preemption, 227
priority inversion, 227
profiling. See CPUs profiling
run queues, 222
scheduling classes, 115
simultaneous multithreading, 225
subsecond-offset heat maps, 289
terminology, 220
thread pools, 178
tuning. See CPUs tuning
user time, 226
utilization, 226
utilization heat maps, 288–289
virtualization support, 588
volumes and pools, 383
word size, 229
associativity, 234
idle threads, 244
memory management units, 235
NUMA grouping, 244
P-states and C-states, 231
processors, 230
CPUs methodology
CPU binding, 253
cycle analysis, 251
performance monitoring, 251
resource controls, 253
static performance tuning, 252
tools method, 245
workload characterization, 246–247
CPUs observability tools, 254–255
GPUs, 287
hardirqs, 282
mpstat, 259
pidstat, 262
sar, 260
showboost, 265
vmstat, 258
CPUs profiling
CPUs tuning
compiler options, 295
exclusive CPU sets, 298
power states, 297
processor options, 299
resource controls, 298
scaling governors, 297
scheduling priority and class, 295
security boot options, 298–299
Cpusets, 116
CPU binding, 253
exclusive, 298
cpusets control group, 610, 614, 627
cpuunclaimed tool, 755
Crash resilience, multiprocess vs. multithreading, 228
Credit-based schedulers, 595
critical-chain command, 120
Critical paths in systemd service manager, 120
criticalstat tool, 756
CSMA/CD (carrier sense multiple access with collision detection) algorithm, 516
CSV (comma-separated values) format for sar, 165
CUBIC algorithm for TCP congestion control, 513
CUDA (Compute Unified Device Architecture), 240
CUMASK values in MSRs, 238–239
current_tracer file, 710
curtask variable for bpftrace, 778
Custom benchmarks, 662
Custom load generators, 491
Cycle analysis
CPUs, 251
memory, 326
Cycles per instruction (CPI), 225
Cylinder groups in FFS, 378
Daily patterns, monitoring, 78
Data Center TCP (DCTCP) congestion control, 118, 513
Data deduplication in ZFS, 381
Data integrity in magnetic rotational disks, 438
Data paths in hardware virtualization, 594
Data Plane Development Kit (DPDK), 523
Data rate in throughput, 22
Databases
applications, 172
cloud computing, 582
Datagrams
OSI model, 502
UDP, 514
DAX (Direct Access), 118
dbslower tool, 756
dbstat tool, 756
Dcache (dentry cache), 375
dcsnoop tool, 409
dcstat tool, 409
DCTCP (Data Center TCP) congestion control, 118, 513
dd command
DDR SDRAM (double data rate synchronous dynamic random-access memory), 313
Deadline I/O schedulers, 243, 448
DEADLINE scheduling policy, 243
DebugFS interface, 116
Decayed average, 75
Deflated disk I/O, 369
Defragmentation in XFS, 380
Degradation in scalability, 31–32
Delay accounting
kernel, 116
off-CPU analysis, 197
overview, 145
Delayed ACKs algorithm, 513
Delayed allocation
ext4, 379
XFS, 380
delete function in bpftrace, 780
Demand paging
BSD kernel, 113
Dentry caches (dcaches), 375
Dependencies in perf-tools, 748
Development, benchmarking for, 642
Development attribute, multiprocess vs. multithreading, 228
Devices
backlog tuning, 569
disk I/O caches, 430
hardware virtualization, 588, 594, 597
devices control group, 610
df tool, 409
Dhrystone benchmark
CPUs, 254
simulations, 653
Diagnosis cycle, 46
diff subcommand for perf, 673
Differentiated Services Code Points (DSCPs), 509–510
Direct Access (DAX), 118
Direct buses, 313
Direct I/O, 366
Direct mapped caches, 234
Direct measurement approach in thread state analysis, 197
Direct-reclaim memory method, 318–319
Directories in file systems, 107
Directory indexes in ext3, 379
Directory name lookup cache (DNLC), 375
Dirty memory, 306
Disk commands, 424
Disk controllers
caches, 430
magnetic rotational disks, 439
USE method, 451
Disk I/O state in thread state analysis, 194–197
Disk request time, 428
Disk response time, 428
Disk wait time, 428
architecture. See Disks architecture
I/O. See Disks I/O
IOPS, 432
methodology. See Disks methodology
models. See Disks models
non-data-transfer disk commands, 432
observability tools. See Disks observability tools
read/write ratio, 431
resource controls, 494
saturation, 434
terminology, 424
tunable, 494
USE method, 451
utilization, 433
Disks architecture
magnetic rotational disks, 435–439
operating system disk I/O stack, 446–449
persistent memory, 441
Disks I/O
vs. application I/O, 435
caching, 430
errors, 483
latency, 428–430, 454–455, 467–472, 482–483
operating system stacks, 446–449
OS virtualization strategy, 630
random vs. sequential, 430–431
scatter plots, 488
simple disk, 425
synchronous vs. asynchronous, 434–435
wait, 434
Disks methodology
cache tuning, 456
performance monitoring, 452
resource controls, 456
static performance tuning, 455–456
tools method, 450
workload characterization, 452–454
Disks models
controllers, 426
simple disk, 425
Disks observability tools, 484–486
MegaCli, 484
miscellaneous, 487
PSI, 464
SCSI event logging, 486
Dispatcher-queue latency, 222
Distributed operating systems, 123–124
Distributed tracing, 199
Distributions
normal, 75
dmesg tool
CPUs, 245
description, 15
memory, 348
OS virtualization, 619
DNLC (directory name lookup cache), 375
Documentation
application latency, 385
bpftrace, 781
kprobes, 153
perf-tools, 748
PMCs, 158
trace-cmd, 740
uprobes, 155
USDT, 156
Domains
scheduling, 244
Xen, 589
Double data rate synchronous dynamic random-access memory (DDR SDRAM), 313
Double-pumped data transfer for CPUs, 237
DPDK (Data Plane Development Kit), 523
DRAM (dynamic random-access memory), 311
Drill-down analysis
slow disks case study, 17
Drivers
balloon, 597
drsnoop tool
BCC, 756
memory, 342
DSCPs (Differentiated Services Code Points), 509–510
DTrace tool
description, 12
Solaris kernel, 114
Duplex for networks, 508
Duplicate ACK detection, 512
Duration in RED method, 53
DWARF (debugging with attributed record formats) stack walking, 216, 267, 676, 696
Dynamic instrumentation
kprobes, 151
latency analysis, 385
overview, 12
Dynamic priority in scheduling classes, 242–243
Dynamic random-access memory (DRAM), 311
Dynamic sizing in cloud computing, 583–584
Dynamic tracers, 12
Dynamic tracing
DTrace, 114
tools, 12
Dynamic USDT, 156
DynTicks, 116
e2fsck tool, 418
Early Departure Time (EDT), 119, 524
eBPF. See Extended BPF
EBS (Elastic Block Store), 585
ECC (error-correcting code) for magnetic rotational disks, 438
ECN (Explicit Congestion Notification) field
TCP, 513
tuning, 570
EDT (Early Departure Time), 119, 524
EFS (Elastic File System), 585
EKS (Elastic Kubernetes Service), 586
elapsed variable in bpftrace, 777
Elastic Block Store (EBS), 585
Elastic File System (EFS), 585
Elastic Kubernetes Service (EKS), 586
Elevator seeking in magnetic rotational disks, 437–438
ELF (Executable and Linking Format) binaries
description, 183
missing symbols in, 214
Embedded caches, 232
eMLC (enterprise multi-level cell) flash memory, 440
Encapsulation for networks, 504
END probes in bpftrace, 774
End-to-end network arguments, 507
Enterprise models, 62
Enterprise multi-level cell (eMLC) flash memory, 440
Environment
benchmarking, 647
Ephemeral drives, 584
Ephemeral ports, 531
EPTs (extended page tables), 593
Erlang virtual machines, 185
Error-correcting code (ECC) for magnetic rotational disks, 438
Errors
applications, 193
benchmarking, 647
disk controllers, 451
disk devices, 451
kernels, 798
networks, 526–527, 529, 796–797
RED method, 53
storage, 797
task capacity, 799
USE method overview, 47–48, 51–53
user mutex, 799
Ethernet congestion avoidance, 508
Event-based concurrency, 178
Event-based tools, 133
Event-select MSRs, 238
Event sources for Wireshark, 559
Event tracing
disks, 454
file systems, 388
trace-cmd for, 737
Event worker threads, 178
Events
observability source, 159
perf. See perf tool events
SCSI logging, 486
trace, 148
events directory in tracefs, 710
Eviction policies for caching, 36
evlist subcommand for perf, 673
Exceptions
synchronous interrupts, 97
user mode, 93
Exclusive CPU sets, 298
exec system calls
kernel, 94
processes, 100
execsnoop tool
BCC, 756
CPUs, 285
perf-tools, 743
tracing, 136
Executable and Linking Format (ELF) binaries
description, 183
missing symbols in, 214
Executable data in process virtual address space, 319
Executable text in process virtual address space, 319
execve system call, 11
exit function in bpftrace, 770, 779
Experimentation-based performance gains, 73–74
Experiments
observability, 7
Experts for applications, 173
Explicit Congestion Notification (ECN) field
TCP, 513
tuning, 570
Explicit logical metadata in file systems, 368
Exporters for monitoring, 55, 79, 137
Express Data Path (XDP) technology
description, 118
event sources, 558
kernel bypass, 523
ext4 file system
features, 379
Extended BPF, 12
bpftrace, 752–753, 761–781, 803–808
description, 118
firewalls, 517
histograms, 744
kernel-mode applications, 92
tracing tools, 166
Extended page tables (EPTs), 593
Extent-based file systems, 375–376
btrfs, 382
ext4, 380
External caches, 232
FaaS (functions as a service), 634
FACK (forward acknowledgments) in TCP, 514
Factor analysis in capacity planning, 71–72
Failures, benchmarking, 645–651
Fair-share schedulers, 595
False sharing for hash tables, 181
Families of instance types, 581
Fast File System (FFS)
description, 113
Fast open in TCP, 510
Fast recovery in TCP, 510
Fast retransmits in TCP, 510, 512
Fast user-space mutex (Futex), 115
Fastpath state in Mutex locks, 179
Faults
in synchronous interrupts, 97
page faults. See page faults
faults tool, 348
FC (Fibre Channel) interface, 442–443
fd tool, 141
Feedback-directed optimization (FDO), 122
ffaults tool, 348
FFS (Fast File System)
description, 113
Fiber threads, 178
Fibre Channel (FC) interface, 442–443
Field-programmable gate arrays (FPGAs), 240–241
FIFO scheduling policy, 243
File descriptor capacity in USE method, 52
File offset pattern, micro-benchmarking for, 390
File stores in cloud computing, 584
File system internals, bpftrace for, 408
File systems
access timestamps, 371
architecture. See File systems architecture
caches. See File systems caches
capacity, OS virtualization, 616
capacity, performance issues, 371
hardware virtualization, 597
I/O, logical vs. physical, 368–370
I/O, random vs. sequential, 363–364
I/O, raw and direct, 366
interfaces, 361
memory-mapped files, 367
methodology. See File systems methodology
micro-benchmark tools, 412–414
observability tools. See File systems observability tools
paging, 306
read-ahead, 365
reads, micro-benchmarking for, 61
record size tradeoffs, 27
special, 371
synchronous writes, 366
terminology, 360
types. See File systems types
File systems architecture
defined, 360
flushing, 414
hit ratio, 17
OS virtualization, 616
OS virtualization strategy, 630
tuning, 389
usage, 309
write-back, 365
File systems methodology
cache tuning, 389
disk analysis, 384
performance monitoring, 388
static performance tuning, 389
workload characterization, 386–388
workload separation, 389
File systems observability tools
cachestat, 399
LatencyTOP, 396
mount, 392
opensnoop, 397
strace, 395
top, 393
vmstat, 393
File systems types
ext4, 379
FileBench tool, 414
fileslower tool, 409
filetype tool, 409
Filters
uprobes, 723
fio (Flexible IO Tester) tool
disks, 493
Firecracker project, 631
Firewalls, 503
misconfigured, 505
overview, 517
tuning, 574
Five Whys in drill-down analysis, 56
Flame graphs
automated, 201
colors, 291
CPU profiling, 10–11, 187–188, 278, 660–661
interactivity, 291
malloc() bytes, 346
missing stacks, 215
perf, 119
performance wins, 250
profiles, 278
scripts, 700
Flash-memory-based SSDs, 439–440
Flash translation layer (FTL) in solid-state drives, 440–441
Flent (FLExible Network Tester) tool, 567
Flexible IO Tester (fio) tool
disks, 493
FLExible Network Tester (Flent) tool, 567
Floating point events in perf, 680
Floating-point operations per second (FLOPS) in benchmarking, 655
Flow control in bpftrace, 775–777
Flusher threads, 374
fmapfault tool, 409
Format string for tracepoints, 148–149
Forward acknowledgments (FACK) in TCP, 514
4-wide processors, 224
FPGAs (field-programmable gate arrays), 240–241
Fragmentation
FFS, 377
file systems, 364
memory, 321
packets, 505
reducing, 380
Frames
defined, 500
networks, 515
OSI model, 502
free tool
description, 15
memory, 348
OS virtualization, 619
FreeBSD
jails, 606
jemalloc, 322
kernel, 113
TSA analysis, 217
network stack, 514
performance vs. Linux, 124
TCP LRO, 523
Frequency sampling for hardware events, 682–683
Front-ends in instruction pipeline, 224
fsck time in ext4, 379
fsrwstat tool, 409
FTL (flash translation layer) in solid-state drives, 440–441
ftrace subcommand for perf, 673
capabilities overview, 706–708
description, 166
hwlat, 726
options, 716
OS virtualization, 629
perf, 741
references, 749
trace_pipe file, 715
tracing, 136
Full I/O distributions disk latency, 454
Full stack in systems performance, 1
Fully associative caches, 234
Fully-preemptible kernels, 110, 114
func variable in bpftrace, 778
funccount tool
example, 747
funcgraph tool
funclatency tool, 757
funcslower tool
BCC, 757
perf-tools, 744
function_graph tracer
description, 708
options, 725
function_profile_enabled file, 710
Function profiling
observability source, 159
Function tracer. See Ftrace tool
Function tracing
profiling, 248
Functional block diagrams in USE method, 49–50
Functional units in CPUs, 223
Functions as a service (FaaS), 634
Functions in bpftrace, 770, 778–781
functrace tool, 744
Futex (fast user-space mutex), 115
futex system calls, 95
gcc compiler
PGO kernels, 122
gdb tool, 136
Generic segmentation offload (GSO) in networks, 520–521
Generic system performance methodologies, 40–41
Geometric mean, 74
getdelays.c tool, 286
github.com tool package, 132
GKE (Google Kubernetes Engine), 586
glibc allocator, 322
Golang
goroutines, 178
syscalls, 92
Good/fast/cheap trade-offs, 26–27
Google Kubernetes Engine (GKE), 586
Goroutines for applications, 178
gprof tool, 135
Graphics processing units (GPUs)
vs. CPUs, 240
tools, 287
GRO (Generic Receive Offload), 119
Growth
big O notation, 175
heap, 320
GSO (generic segmentation offload) in networks, 520–521
Guests
hardware virtualization, 590–593, 596–605
lightweight virtualization, 632–633
OS virtualization, 617, 627–629
gVisor project, 631
Hard disk drives (HDDs), 435–439
Hard interrupts, 282
Hardware
threads, 220
tracing, 276
Hardware-assisted virtualization, 590
Hardware counters. See Performance monitoring counters (PMCs)
Hardware events
Hardware instances in cloud computing, 580
Hardware interrupts, 91
Hardware latency detector (hwlat), 708, 726
Hardware latency tracer, 118
Hardware probes, 774
Hardware RAID, 444
Hardware resources in capacity planning, 70
Hardware virtualization
multi-tenant contention, 595
Harmonic mean, 74
Hash fields in hist triggers, 728
Hash tables in applications, 180–181
HBAs (host bus adapters), 426
HDDs (hard disk drives), 435–439
Head-based sampling in distributed tracing, 199
Heads in magnetic rotational disks, 436
Heap
anonymous paging, 306
description, 304
growth, 320
process virtual address space, 319
Heat maps
disk utilization, 490
subsecond-offset, 289
Hello, World! program, 770
hfaults tool, 348
hist function in bpftrace, 780
Hist triggers
modifiers, 729
multiple keys, 730
perf-tools, 748
usage, 727
hist triggers profiler, 707
Hold times for locks, 198
Holistic approach, 6
Horizontal pod autoscalers (HPAs), 73
Horizontal scaling and scalability
capacity planning, 72
Host bus adapters (HBAs), 426
Hosts
applications, 172
cloud computing, 580
hardware virtualization, 597–603
lightweight virtualization, 632
OS virtualization, 617, 619–627
Hot caches, 37
Hot/cold flame graphs, 191
Hourly patterns, monitoring, 78
HPAs (horizontal pod autoscalers), 73
HT (HyperTransport) for CPUs, 236
htop tool, 621
HTTP/3 protocol, 515
Hubs in networks, 516
Hue in flame graphs, 291
Huge pages, 115–116, 314, 352–353
hugetlb control group, 610
hwlat (hardware latency detector), 708, 726
Hybrid clouds, 580
Hyper-Threading Technology, 225
Hyper-V, 589
Hypercalls in paravirtualization, 588
Hyperthreading-aware scheduling classes, 243
HyperTransport (HT) for CPUs, 236
Hypervisors
cloud computing, 580
hardware virtualization, 587–588
kernels, 93
I/O. See Input/output (I/O)
IaaS (infrastructure as a service), 580
Icicle graphs, 250
icstat tool, 409
IDDs (isolated driver domains), 596
Identification in drill-down analysis, 55
Idle memory, 315
Idle scheduling class, 243
IDLE scheduling policy, 243
Idle state in thread state analysis, 194, 196–197
ieee80211scan tool, 561
If statements, 776
ifpps tool, 561
iftop tool, 562
Implicit disk I/O, 369
Implicit logical metadata, 368
Inactive pages in page caches, 318
Incast problem in networks, 524
Index nodes (inodes)
caches, 375
defined, 360
VFS, 373
Indirect disk I/O, 369
Individual synchronous writes, 366
Industry standards for benchmarking, 654–655
Inflated disk I/O, 369
Infrastructure as a service (IaaS), 580
init process, 100
Initial window in TCP, 514
inject subcommand for perf, 673
Inodes (index nodes)
caches, 375
defined, 360
VFS, 373
inotify framework, 116
inotify tool, 409
Input
event tracing, 58
solid-state drive controllers, 440
Input/output (I/O)
disks. See Disks I/O
file systems, 360
hardware virtualization, 593–595, 597
I/O-bound applications, 106
latency, 424
merging, 448
multiqueue schedulers, 119
OS virtualization, 611–612, 616–617
random vs. sequential, 363–364
raw and direct, 366
request time, 427
schedulers, 448
service time, 427
size, applications, 176
size, micro-benchmarking, 390
USE method, 798
wait time, 427
Input/output operations per second. See IOPS (input/output operations per second)
Input/output profiling
syscall analysis, 192
Installing
BCC, 754
bpftrace
, 762
instances directory in tracefs, 710
Instances in cloud computing
description, 14
types, 580
Instruction pointer for threads, 100
Instructions, CPU
defined, 220
IPC, 225
pipeline, 224
size, 224
steps, 223
text, 304
width, 224
Instructions per cycle (IPC), 225, 251, 326
Integrated caches, 232
Intel Cache Allocation Technology (CAT), 118, 596
Intel Clear Containers, 631
Intel processor cache sizes, 230–231
Intel VTune Amplifier XE tool, 135
Intelligent Platform Management Interface (IPMI), 98–99
Intelligent prefetch in ZFS, 381
Inter-processor interrupts (IPIs), 110
Inter-stack latency in networks, 529
Interactivity in flame graphs, 291
Interconnects
buses, 313
Interfaces
defined, 500
file systems, 361
kprobes, 153
network negotiation, 508
scheduling in NAPI, 522
Interleaving in FFS, 378
Internet Protocol (IP)
congestion avoidance, 508
sockets, 509
Interpretation of flame graphs, 291–292
Interpreted programming languages, 184–185
Interrupt coalescing mode for networks, 522
Interrupt-disabled mode, 98
Interrupt service requests (IRQs), 96–97
Interrupt service routines (ISRs), 96
Interrupts
defined, 91
hardware, 282
network latency, 529
overview, 96
synchronous, 97
interrupts tool, 142
interval probes in bpftrace, 774
Interval statistics, stat for, 693
IO accounting, 116
io_submit command, 181
io_uring_enter command, 181
io_uring interface, 119
ioctl system calls, 95
iolatency tool, 743
ioping tool, 492
ioprofile tool, 409
IOPS (input/output operations per second)
defined, 22
description, 7
performance metric, 32
resource analysis, 38
iosched tool, 487
iosnoop tool, 743
iostat tool
bonnie++ tool, 658
description, 15
fixed counters, 134
memory, 348
options, 460
percent busy metric, 33
slow disks case study, 17
IP (Internet Protocol)
congestion avoidance, 508
sockets, 509
ipc control group, 608
IPC (instructions per cycle), 225, 251, 326
ipecn tool, 561
iperf tool
network micro-benchmarking, 10
IPIs (inter-processor interrupts), 110
IPMI (Intelligent Platform Management Interface), 98–99
iproute2 tool package, 132
IRQs (interrupt service requests), 96–97
irqsoff tracer, 708
lscpu tool, 285
Isolated driver domains (IDDs), 596
Isolation in OS virtualization, 629
ISRs (interrupt service routines), 96
lstopo tool, 286
Java
analysis, 29
Java Flight Recorder, 135
stack traces, 215
symbols, 214
uprobes, 213
virtual machines, 185
Java Flight Recorder (JFR), 135
JavaScript Object Notation (JSON) format, 163–164
JBOD (just a bunch of disks), 443
jemalloc allocator, 322
JFR (Java Flight Recorder), 135
JIT (just-in-time) compilation
Linux kernel, 117
PGO kernels, 122
runtime missing symbols, 214
Jitter in operating systems, 99
jmaps tool, 214
join function, 778
Journaling
btrfs, 382
file systems, 376
XFS, 380
JSON (JavaScript Object Notation) format, 163–164
Jumbo frames
packets, 505
tuning, 574
Just a bunch of disks (JBOD), 443
Just-in-time (JIT) compilation
Linux kernel, 117
PGO kernels, 122
runtime missing symbols, 214
kaddr function, 779
Kata Containers, 631
KCM (Kernel Connection Multiplexor), 118
Keep-alive strategy in networks, 507
Kendall’s notation for queueing systems, 67–68
Kernel-based Virtual Machine (KVM) technology
CPU quotas, 595
description, 589
I/O path, 594
Linux kernel, 116
Kernel bypass for networks, 523
Kernel Connection Multiplexor (KCM), 118
Kernel mode, 93
Kernel page table isolation (KPTI) patches, 121
Kernel space, 90
Kernel state in thread state analysis, 194–197
Kernel statistics (Kstat) framework, 159–160
Kernel time
CPUs, 226
syscall analysis, 192
Kernels
bpftrace for, 765
BSD, 113
comparisons, 124
defined, 90
file systems, 107
filtering in OS virtualization, 629
microkernels, 123
monolithic, 123
PGO, 122
PMU events, 680
preemption, 110
Solaris, 114
stacks, 103
time analysis, 202
unikernels, 123
Unix, 112
USE method, 798
KernelShark software, 83–84, 739–740
kfunc probes, 774
killsnoop tool
BCC, 756
perf-tools, 743
klockstat tool, 756
kmem subcommand for perf, 673, 702
Knee points
scalability, 31
Known-knowns, 37
Known-unknowns, 37
kprobe_events file, 710
kprobe probes, 774
kprobe profiler, 707
kprobe tool, 744
profiling, 722
return values, 721
kprobes tracer, 708
KPTI (kernel page table isolation) patches, 121
kretfunc probes, 774
kstack function in bpftrace, 779
kstack variable in bpftrace, 778
Kstat (kernel statistics) framework, 159–160
ksym function, 779
kubectl command, 621
Kubernetes
node, 608
orchestration, 586
KVM. See Kernel-based Virtual Machine (KVM) technology
kvm_entry tool, 602
kvm_exit tool, 602
kvm subcommand for perf, 673, 702
kvm_vcpu_halt command, 592
Kyber multi-queue schedulers, 449
L2ARC cache in ZFS, 381
Label selectors in cloud computing, 586
Language virtual machines, 185
Large Receive Offload (LRO), 116
Large segment offload for packet size, 505
Last-level caches (LLCs), 232
Latency
applications, 173
defined, 22
disk I/O, 428–430, 454–455, 467–472, 482–483
file systems, 362–363, 384–386, 388
hardware, 118
hardware virtualization, 604
interrupts, 98
networks, connections, 7, 24–25, 505–506, 528
networks, defined, 500
outliers, 58, 186, 424, 471–472
performance metric, 32
run-queue, 222
solid-state drives, 441
ticks, 99
transaction costs analysis, 385–386
LatencyTOP tool for file systems, 396
latencytop tool for operating systems, 116
Lazy shootdowns, 367
LBR (last branch record), 216, 676, 696
Leak detection for memory, 326–327
Least frequently used (LFU) caching algorithm, 36
Least recently used (LRU) caching algorithm, 36
Level 1 caches
data, 232
instructions, 232
memory, 314
Level 2 ARC, 381
Level 2 caches
embedded, 232
memory, 314
Level 3 caches
LLC, 232
memory, 314
Level of appropriateness in methodologies, 28–29
LFU (least frequently used) caching algorithm, 36
lhist function, 780
libpcap library as observability source, 159
Life cycle for processes, 100–101
Life span
network connections, 507
solid-state drives, 441
Lightweight threads, 178
Lightweight virtualization
overhead, 632
overview, 630
resource controls, 632
Limit investigations, benchmarking for, 642
Limitations of averages, 75
Limits for OS virtualization resources, 613
limits tool, 141
Line charts
baseline statistics, 59
Linear scalability
methodologies, 32
models, 63
Link aggregation tuning, 574
Link-time optimization (LTO), 122
Linux 60-second analysis, 15–16
Linux operating system
KPTI patches, 121
observability sources, 138–146
observability tools, 130
operating system disk I/O stack, 447–448
static performance tools, 130–131
systemd service manager, 120
thread state analysis, 195–197
linux-tools-common linux-tools tool package, 132
list subcommand
perf, 673
trace-cmd, 735
Listen backlogs in networks, 519
listen subcommand in trace-cmd, 735
Listing events
trace-cmd for, 736
Little’s Law, 66
Live reporting in sar, 165
LLCs (last-level caches), 232
llcstat tool
BCC, 756
CPUs, 285
Load averages for uptime, 255–257
Load balancers
capacity planning, 72
schedulers, 241
Load generation
capacity planning, 70
custom load generators, 491
micro-benchmarking, 61
Load vs. architecture in methodologies, 30–31
loadavg tool, 142
Local memory, 312
Local network connections, 509
Localhost network connections, 509
Lock state in thread state analysis, 194–197
lock subcommand for perf, 673, 702
Locks
analysis, 198
Logging
applications, 172
SCSI events, 486
ZFS, 381
Logical CPUs
defined, 220
hardware threads, 221
Logical I/O
defined, 360
Logical metadata in file systems, 368
Logical operations in file systems, 361
Longest-latency caches, 232
Loopbacks in networks, 509
LRO (Large Receive Offload), 116
LRU (least recently used) caching algorithm, 36
lsof tool, 561
LTO (link-time optimization), 122
LTTng tool, 166
M/G/1 queueing systems, 68
M/M/1 queueing systems, 68
M/M/c queueing systems, 68
MADV_COLD option, 119
MADV_PAGEOUT option, 119
madvise system call, 367, 415–416
Magnetic rotational disks, 435–439
Main memory
latency, 26
malloc() bytes flame graphs, 346
Map functions in bpftrace, 771–772, 780–781
Map variables in bpftrace, 771
Mapping memory. See Memory mappings
maps tool, 141
Marketing, benchmarking for, 642
Markov model, 654
Markovian arrivals in queueing systems, 68–69
max function in bpftrace, 780
Maximum controller operation rate, 457
Maximum controller throughput, 457
Maximum disk operation rate, 457
Maximum disk random reads, 457
Maximum disk throughput
magnetic rotational disks, 436–437
micro-benchmarking, 457
Maximum transmission unit (MTU) size for packets, 504–505
MCS locks, 117
mdflush tool, 487
Mean, 74
"A Measure of Transaction Processing Power," 655
Medians, 75
MegaCli tool, 484
Melo, Arnaldo Carvalho de, 671
Meltdown vulnerability, 121
mem subcommand for perf, 673
meminfo tool, 142
memleak tool
BCC, 756
memory, 348
Memory
architecture. See Memory architecture
bpftrace for, 763–764, 804–805
BSD kernel, 113
CPU tradeoffs with, 27
file system cache usage, 309
garbage collection, 185
hardware virtualization, 596–597
mappings. See Memory mappings
methodology. See Memory methodology
multiprocess vs. multithreading, 228
NUMA binding, 353
observability tools. See Memory observability tools
OS virtualization, 611, 613, 615–616
OS virtualization strategy, 630
overcommit, 308
overprovisioning in solid-state drives, 441
persistent, 441
shared, 310
shrinking method, 328
terminology, 304
utilization and saturation, 309
word size, 310
working set size, 310
Memory architecture, 311
CPU caches, 314
latency, 311
MMU, 314
process virtual address space, 319–322
TLB, 314
memory control group, 610, 616
Memory locality, 222
Memory management units (MMUs), 235, 314
Memory mappings
files, 367
hardware virtualization, 592–593
heap growth, 320
kernel, 94
micro-benchmarking, 390
OS virtualization, 611
Memory methodology
cycle analysis, 326
memory shrinking, 328
micro-benchmarking, 328
overview, 323
performance monitoring, 326
resource controls, 328
static performance tuning, 327–328
usage characterization, 325–326
Memory observability tools
drsnoop, 342
swapon, 331
Memory reclaim state in delay accounting, 145
Metadata
ext3, 378
Method R, 57
Methodologies
ad hoc checklist method, 43–44
applications. See Applications methodology
baseline statistics, 59
benchmarking. See Benchmarking methodology
cache tuning, 60
CPUs. See CPUs methodology
diagnosis cycle, 46
disks. See Disks methodology
file systems. See File systems methodology
known-unknowns, 37
level of appropriateness, 28–29
Linux 60-second analysis checklist, 15–16
memory. See Memory methodology
Method R, 57
modeling. See Methodologies modeling
networks. See Networks methodology
performance mantras, 61
point-in-time recommendations, 29–30
problem statement, 44
profiling, 35
RED method, 53
static performance tuning, 59–60
stop indicators, 29
tools method, 46
visualizations. See Methodologies visualizations
workload characterization, 54
Methodologies modeling, 62
Amdahl’s Law of Scalability, 64–65
enterprise vs. cloud, 62
Universal Scalability Law, 65–66
Methodologies visualizations, 79
tools, 85
applications, 172
resource analysis, 38
workload analysis, 40
MFU (most frequently used) caching algorithm, 36
Micro-benchmarking
capacity planning, 70
description, 13
file systems, 390–391, 412–414
memory, 328
networks, 533
Micro-operations (uOps), 224
Microcode ROM in CPUs, 230
Microservices
USE method, 53
Midpath state for Mutex locks, 179
Migration types for free lists, 317
min function in bpftrace, 780
MINIX operating system, 114
Minor faults, 307
MIPS (millions of instructions per second) in benchmarking, 655
Misleading benchmarks, 650
Missing symbols, 214
Mixed-mode CPU profiles, 187
Mixed-mode flame graphs, 187
MLC (multi-level cell) flash memory, 440
mmap system call
description, 95
mmapfiles tool, 409
mmapsnoop tool, 348
mmiotrace tracer, 708
MMUs (memory management units), 235, 314
mnt control group, 609
Mode switches
defined, 90
kernels, 93
Model-specific registers (MSRs)
CPUs, 238
observability source, 159
Models
Amdahl’s Law of Scalability, 64–65
enterprise vs. cloud, 62
overview, 62
Universal Scalability Law, 65–66
Modular I/O scheduling, 116
CPUs, 251
disks, 452
drill-down analysis, 55
file systems, 388
memory, 326
products, 79
summary-since-boot values, 79
Most frequently used (MFU) caching algorithm, 36
Most recently used (MRU) caching algorithm, 36
Mount points in file systems, 106
mount tool
file systems, 392
Mounting file systems, 106, 392
mountsnoop tool, 409
mpstat tool
description, 15
fixed counters, 134
lightweight virtualization, 633
OS virtualization, 619
mq-deadline multi-queue schedulers, 449
MR-IOV (multiroot I/O virtualization), 593–594
MRU (most recently used) caching algorithm, 36
MSG_ZEROCOPY flag, 119
msr-tools tool package, 132
MSRs (model-specific registers)
CPUs, 238
observability source, 159
mtr tool, 567
Multi-level cell (MLC) flash memory, 440
Multi-queue schedulers
description, 119
operating system disk I/O stack, 449
Multiblock allocators in ext4, 379
Multicalls in paravirtualization, 588
Multicast network transmissions, 503
Multichannel memory buses, 313
Multics (Multiplexed Information and Computer Services) operating system, 112
Multimodal distributions, 76–77
MultiPath TCP, 119
Multiple causes as performance challenge, 6
Multiple performance issues, 6
Multiple prefetch streams in ZFS, 381
Multiple-zone disk recording, 437
Multiplexed Information and Computer Services (Multics) operating system, 112
Multiprocessors
overview, 110
Solaris kernel support, 114
Multiqueue block I/O, 117
Multiqueue I/O schedulers, 119
Multiroot I/O virtualization (MR-IOV), 593–594
Multitenancy in cloud computing, 580
contention in hardware virtualization, 595
contention in OS virtualization, 612–613
Multithreading
SMT, 225
Mutex (MUTually EXclusive) locks
contention, 198
USE method, 52
MySQL database
CPU profiling, 200, 203, 269–270, 277, 283–284, 697–700
disk I/O tracing, 466–467, 470–471, 488
file tracing, 397–398, 401–402
memory allocation, 345
Off-CPU analysis, 204–205, 275–276
Off-CPU Time flame graphs, 190–192
query latency analysis, 56
scheduler latency, 272, 279–280
shards, 582
slow query log, 172
stack traces, 215
working set size, 342
mysqld_qslower tool, 756
Nagle algorithm for TCP congestion control, 513
Name resolution latency, 505, 528
Namespaces in OS virtualization, 606–609, 620, 623–624
NAPI (New API) framework, 522
NAS (network-attached storage), 446
Native Command Queueing (NCQ), 437
Native hypervisors, 587
Negative caching in Dcache, 375
Nested page tables (NPTs), 593
net control group, 609
net_cls control group, 610
Net I/O state in thread state analysis, 194–197
net_prio control group, 610
net tool
description, 562
socket information, 142
Netfilter conntrack as observability source, 159
Netflix cloud performance team, 2–3
netlink observability tools, 145–146, 536
netsize tool, 561
nettxlat tool, 561
Network-attached storage (NAS), 446
Network interface cards (NICs)
network connections, 109
sent and received packets, 522
Networks
architecture. See Networks architecture
benchmark questions, 668
bpftrace for, 764–765, 807–808
congestion avoidance, 508
connection backlogs, 507
encapsulation, 504
hardware virtualization, 597
interface negotiation, 508
interfaces, 501
local connections, 509
methodology. See Networks methodology
micro-benchmarking for, 61
observability tools. See Networks observability tools
on-chip interfaces, 230
operating systems, 109
OS virtualization, 611–613, 617, 630
protocol stacks, 502
protocols, 504
routing, 503
sniffing, 159
terminology, 500
tuning. See Networks tuning
Networks architecture
Networks methodology
micro-benchmarking, 533
performance monitoring, 529
static performance tuning, 531–532
TCP analysis, 531
tools method, 525
workload characterization, 527–528
Networks observability tools
tcplife, 548
tcptop, 549
Wireshark, 560
Networks tuning, 567
configuration, 574
socket options, 573
New API (NAPI) framework, 522
New Vegas (NV) congestion control algorithm, 118
nfsdist tool
BCC, 756
file systems, 399
nfsslower tool, 756
nfsstat tool, 561
NFU (not frequently used) caching algorithm, 36
nice command
CPU priorities, 252
resource management, 111
scheduling priorities, 295
NICs (network interface cards)
network connections, 109
sent and received packets, 522
nicstat tool, 132, 525, 545–546
"A Nine Year Study of File System and Storage Benchmarking," 643
Nitro hardware virtualization
description, 589
NMIs (non-maskable interrupts), 98
NO_HZ_FULL option, 117
Node taints in cloud computing, 586
Node.js
dynamic USDT, 156
event-based concurrency, 178
non-blocking I/O, 181
symbols, 214
Nodes
cloud computing, 586
free lists, 317
main memory, 312
Noisy neighbors
multitenancy, 585
OS virtualization, 617
Non-blocking I/O
applications, 181
Non-data-transfer disk commands, 432
Non-idle time, 34
Non-maskable interrupts (NMIs), 98
Non-regression testing
benchmarking for, 642
software change case study, 18
Non-uniform memory access (NUMA)
CPUs, 244
main memory, 312
memory balancing, 117
memory binding, 353
multiprocessors, 110
Non-uniform random distributions, 413
Non-Volatile Memory express (NVMe) interface, 443
Noop I/O schedulers, 448
nop tracer, 708
Normal distribution, 75
NORMAL scheduling policy, 243
Not frequently used (NFU) caching algorithm, 36
NPTs (nested page tables), 593
nsecs variable in bpftrace, 777
nsenter command, 624
ntop function, 779
NUMA. See Non-uniform memory access (NUMA)
numactl tool package, 132
Number of service centers in queueing systems, 67
NV (New Vegas) congestion control algorithm, 118
nvmelatency tool, 487
O(1) scheduling class, 243
Object stores in cloud computing, 584
Observability
allocators, 321
applications, 174
benchmarks, 643
counters, statistics, and metrics, 8–9
hardware virtualization, 597–605
operating systems, 111
OS virtualization. See OS virtualization observability
RAID, 445
volumes and pools, 383
Observability tools, 129
applications. See Applications observability tools
coverage, 130
CPUs. See CPUs observability tools
disks. See Disks observability tools
exercises, 168
file system. See File systems observability tools
memory. See Memory observability tools
network. See Networks observability tools
profiling, 135
types, 133
Observability tools sources, 138–140
delay accounting, 145
Observation-based performance gains, 73
Observational tests in scientific method, 44–45
Observer effect in metrics, 33
off-CPU
thread state analysis, 197
time flame graphs, 205
offcputime tool
BCC, 756
description, 285
networks, 561
scheduler tracing, 190
slow disks case study, 17
time flame graphs, 205
Offset heat maps, 289, 489–490
offwaketime tool, 756
On-chip caches, 231
On-die caches, 231
On-disk caches, 425–426, 430, 437
Online balancing, 382
Online defragmentation, 380
OOM killer (out-of-memory killer), 316–317, 324
OOM (out of memory), defined, 304
oomkill tool
BCC, 756
description, 348
open command
description, 94
non-blocking I/O, 181
Open Container Interface, 586
openat syscalls, 404
opensnoop tool
BCC, 756
file systems, 397
perf-tools, 743
Operating systems, 89
clocks and idle, 99
defined, 90
hybrid kernels, 123
jitter, 99
Linux. See Linux operating system
microkernels, 123
multiprocessors, 110
networking, 109
observability, 111
PGO kernels, 122
preemption, 110
unikernels, 123
virtualization. See OS virtualization
Operation rate
defined, 22
Operations
applications, 172
defined, 360
micro-benchmarking, 390
Operators for bpftrace, 776–777
OProfile system profiler, 115
oprofile tool, 285
Optimistic spinning in Mutex locks, 179
Optimizations
applications, 174
feedback-directed, 122
networks, 524
Orchestration in cloud computing, 586
Ordered mode in ext3, 378
Orlov block allocator, 379
OS instances in cloud computing, 580
OS virtualization
OS virtualization observability
tracing tools, 629
OS X syscall tracing, 205
OS wait time for disks, 472
OSI model, 502
Out-of-memory killer (OOM killer), 316–317, 324
Out of memory (OOM), defined, 304
Out-of-order packets, 529
Outliers
heat maps, 82
normal distributions, 77
Output formats in sar, 163–165
Output with solid-state drive controllers, 440
Overcommit strategy, 115
Overcommitted main memory, 305, 308
Overflow sampling
hardware events, 683
Overhead
hardware virtualization, 589–595
kprobes, 153
lightweight virtualization, 632
metrics, 33
multiprocess vs. multithreading, 228
strace, 207
ticks, 99
tracepoints, 150
volumes and pools, 383
Overlayfs file system, 118
Overprovisioning cloud computing, 583
override function, 779
Oversize arenas, 322
P-caches in CPUs, 230
P-states in CPUs, 231
Pacing in networks, 524
Packages, CPUs vs. GPUs, 240
Packets
defined, 500
networks, 504
OSI model, 502
out-of-order, 529
throttling, 522
Padding locks for hash tables, 181
Page caches
file systems, 374
memory, 315
Page faults
defined, 304
Page-outs
daemons, 317
working with, 306
Page scanning, 318–319, 323, 374
Page tables, 235
Paged virtual memory, 113
Pages
defined, 304
kernel, 115
Paging
file system, 306
overview, 306
PAPI (performance application programming interface), 158
Parallelism in applications, 177–181
Paravirtualization (PV), 588, 590
Paravirtualized I/O drivers, 593–595
Parity in RAID, 445
Partitions in Hyper-V, 589
Passive listening in three-way handshakes, 511
pathchar tool, 564
Pathologies in solid-state drives, 441
Patrol reads in RAID, 445
Pause frames in congestion avoidance, 508
pchar tool, 564
PCI pass-through in hardware virtualization, 593
PCP (Performance Co-Pilot), 138
PE (Portable Executable) format, 183
PEBS (precise event-based sampling), 158
Per-I/O latency values, 454
Per-interval I/O averages latency values, 454
Per-interval statistics with stat, 693
Per-process observability tools, 133
profiling, 135
tracing, 136
Percent busy metric, 33
Percentiles
description, 75
perf c2c command, 118
perf_event control group, 610
perf-stat-hist tool, 744
perf tool, 13
CPU flame graphs, 201
CPU profiling, 200–201, 245, 268–270
description, 116
documentation, 276
events. See perf tool events
hardware tracing, 276
hardware virtualization, 601–602, 604
kernel time analysis, 202
memory, 324
one-liners for counting events, 675
one-liners for disks, 467
one-liners for dynamic tracing, 677–678
one-liners for listing events, 674–675
one-liners for memory, 338–339
one-liners for profiling, 675–676
one-liners for reporting, 678–679
one-liners for static tracing, 676–677
page fault flame graphs, 340–342
profiling overview, 135
subcommands. See perf tool subcommands
thread state analysis, 196
tools collection. See perf-tools collection
perf tool events
perf tool subcommands
documentation, 703
ftrace, 741
perf-tools collection
coverage, 742
documentation, 748
example, 747
perf-tools-unstable tool package, 132
Performance and performance monitoring
applications, 172
CPUs, 251
disks, 452
file systems, 388
memory, 326
networks, 529
OS virtualization, 620
resource analysis investments, 38
Performance application programming interface (PAPI), 158
Performance Co-Pilot (PCP), 138
Performance instrumentation counters (PICs), 156
Performance Mantras
applications, 182
list of, 61
Performance monitoring counters (PMCs), 156
challenges, 158
cycle analysis, 251
documentation, 158
memory, 326
Performance monitoring unit (PMU) events, 156, 680
perftrace tool, 136
Periods in OS virtualization, 615
Persistent memory, 441
Personalities in FileBench, 414
Perspectives
Perturbations
benchmarks, 648
system tests, 23
pfm-events, 681
PGO (profile-guided optimization) kernels, 122
Physical I/O
defined, 360
Physical metadata in file systems, 368
Physical operations in file systems, 361
Physical resources in USE method, 795–798
PICs (performance instrumentation counters), 156
pid control group, 609
pid variable in bpftrace, 777
pids control group, 610
PIDs (process IDs)
process environment, 101
pidstat tool
description, 15
OS virtualization, 619
thread state analysis, 196
Pipelines in ZFS, 381
pktgen tool, 567
Platters in magnetic rotational disks, 435–436
Plugins for monitoring software, 137
pmcarch tool
memory, 348
PMCs. See Performance monitoring counters (PMCs)
pmlock tool, 212
PMU (performance monitoring unit) events, 156, 680
Pods in cloud computing, 586
Point-in-time recommendations in methodologies, 29–30
Policies for scheduling classes, 106, 242–243
poll system call, 177
Polling applications, 177
Pooled storage
btrfs, 382
ZFS, 380
Portability of benchmarks, 643
Portable Executable (PE) format, 183
Ports
ephemeral, 531
network, 501
posix_fadvise call, 415
Power states in processors, 297
Preallocation in ext4, 379
Precise event-based sampling (PEBS), 158
Prediction step in scientific method, 44–45
Preemption
CPUs, 227
Linux kernel, 116
operating systems, 110
schedulers, 241
Solaris kernel, 114
preemptirqsoff tracer, 708
preemptoff tracer, 708
Prefetch caches, 230
Prefetch for file systems
ZFS, 381
Presentability of benchmarks, 643
Pressure stall information (PSI)
description, 119
disks, 464
pressure tool, 142
Price/performance ratio
applications, 173
benchmarking for, 643
print function, 780
Priority
OS virtualization resources, 613
scheduling classes, 242–243, 295
Priority inheritance scheme, 227
Priority inversion, 227
Priority pause frames in congestion avoidance, 508
Private clouds, 580
Privilege rings in kernels, 93
probe subcommand for perf, 673
probe variable in bpftrace, 778
Probes and probe events
perf, 685
Problem statement
determining, 44
/proc file system observability tools, 140–143
Process-context IDs (PCIDs), 119
Process IDs (PIDs)
process environment, 101
Processes
accounting, 159
creating, 100
defined, 90
syscall analysis, 192
USE method, 52
virtual address space, 319–322
Processors
power states, 297
tuning, 299
procps tool package, 131
Products, monitoring, 79
Profile-guided optimization (PGO) kernels, 122
profile probes, 774
profile tool
BCC, 756
profiling, 135
trace-cmd, 735
Profilers
Ftrace, 707
perf-tools for, 745
Profiling
CPUs. See CPUs profiling
kprobes, 722
methodologies, 35
observability tools, 135
uprobes, 723
Program counter threads, 100
Programming languages
bpftrace. See bpftrace tool programming
virtual machines, 185
Prometheus monitoring software, 138
Proofs of concept
benchmarking for, 642
testing, 3
Proportional set size (PSS) in shared memory, 310
Protection rings in kernels, 93
Protocols
HTTP/3, 515
QUIC, 515
UDP, 514
ps tool
fixed counters, 134
OS virtualization, 619
PSI. See Pressure stall information (PSI)
PSS (proportional set size) in shared memory, 310
Pterodactyl latency heat maps, 488–489
ptrace tool, 159
Public clouds, 580
qdisc-fq tool, 561
QEMU (Quick Emulator)
hardware virtualization, 589
lightweight virtualization, 631
qemu-system-x86 process, 600
QLC (quad-level cell) flash memory, 440
QoS (quality of service) for networks, 532–533
QPI (Quick Path Interconnect), 236–237
Quad-level cell (QLC) flash memory, 440
Quality of service (QoS) for networks, 532–533
Quantifying issues, 6
Quantifying performance gains, 73–74
Quarterly patterns, monitoring, 79
Question step in scientific method, 44–45
Queued time for disks, 472
Queueing disciplines
networks, 521
OS virtualization, 617
tuning, 571
Queues
interrupts, 98
run. See Run queues
QUIC protocol, 515
Quick Emulator (QEMU)
hardware virtualization, 589
lightweight virtualization, 631
Quick Path Interconnect (QPI), 236–237
Quotas in OS virtualization, 615
RACK (recent acknowledgments) in TCP, 514
RAID (redundant array of independent disks) architecture, 444–445
Ramping load benchmarking, 662–664
Random-access pattern in micro-benchmarking, 390
Random change anti-method, 42–43
Random I/O
latency profile, micro-benchmarking, 457
Rate transitions in networks, 517
Raw hardware event descriptors, 680
Raw tracepoints, 150
RCU (read-copy update), 115
RCU-walk (read-copy-update-walk) algorithm, 375
rdma control group, 610
Re-exec method in heap growth, 320
Read-ahead in file systems, 365
Read-copy update (RCU), 115
Read-copy-update-walk (RCU-walk) algorithm, 375
Read latency profile in micro-benchmarking, 457
Read-modify-write operation in RAID, 445
read syscalls
description, 94
Read/write ratio in disks, 431
readahead tool, 409
Reader/writer (RW) locks, 179
Real-time scheduling classes, 106, 253
Real-time systems, interrupt masking in, 98
Realism in benchmarks, 643
Rebuilding volumes and pools, 383
Receive Flow Steering (RFS) in networks, 523
Receive Packet Steering (RPS) in networks, 523
Receive packets in NICs, 522
Receive Side Scaling (RSS) in networks, 522–523
Recent acknowledgments (RACK) in TCP, 514
Reclaimed pages, 317
Record size, defined, 360
record subcommand for perf
example, 672
options, 695
stack walking, 696
record subcommand for trace-cmd, 735
RED method, 53
Reduced instruction set computers (RISCs), 224
Redundant array of independent disks (RAID) architecture, 444–445
reg function, 779
Regression testing, 18
Remote memory, 312
Reno algorithm for TCP congestion control, 513
Repeatability of benchmarks, 643
Replay benchmarking, 654
report subcommand for perf
example, 672
TUI interface, 697
report subcommand for trace-cmd, 735
Reporting
trace-cmd, 737
Request latency, 7
Request rate in RED method, 53
Request time in I/O, 427
Requests in workload analysis, 39
Resident memory, defined, 304
Resident set size (RSS), 308
Resilvering volumes and pools, 383
Resource analysis perspectives, 4–5, 38–39
Resource controls
cloud computing, 586
hardware virtualization, 595–597
lightweight virtualization, 632
OS virtualization, 613–617, 626–627
tuning, 571
USE method, 52
Resource isolation in cloud computing, 586
Resource limits in capacity planning, 70–71
Resource lists in USE method, 49
Resource utilization in applications, 173
Resources in USE method, 47
Response time
defined, 22
disks, 452
latency, 24
restart subcommand in trace-cmd, 735
Results in event tracing, 58
Retention policy for caching, 36
Retransmits
latency, 528
UDP, 514
Retrospectives, 4
Return values
kprobes, 721
kretprobes, 152
uretprobes, 154
uprobes, 723
retval variable in bpftrace, 778
RFS (Receive Flow Steering) in networks, 523
Ring buffers
applications, 177
networks, 522
RISCs (reduced instruction set computers), 224
Robertson, Alastair, 761
Root level in file systems, 106
Rostedt, Steven, 705, 711, 734, 739–740
Rotation time in magnetic rotational disks, 436
Round-trip time (RTT) in networks, 507, 528
Route tables, 537
Routing networks, 503
RPS (Receive Packet Steering) in networks, 523
RR scheduling policy, 243
RSS (Receive Side Scaling) in networks, 522–523
RSS (resident set size), 308
RTT (round-trip time) in networks, 507, 528
Run queues
CPUs, 222
defined, 220
latency, 222
Runnability of benchmarks, 643
Runnable state in thread state analysis, 194–197
runqlat tool
description, 756
runqlen tool
description, 756
runqslower tool
CPUs, 285
description, 756
RW (reader/writer) locks, 179
S3 (Simple Storage Service), 585
SaaS (software as a service), 634
SACK (selective acknowledgment) algorithm, 514
SACKs (selective acknowledgments), 510
Sampling
CPU profiling, 35, 135, 187, 200–201, 247–248
distributed tracing, 199
Sanity checks in benchmarking, 664–665
sar (system activity reporter)
configuration, 162
coverage, 161
CPUs, 260
description, 15
fixed counters, 134
live reporting, 165
OS virtualization, 619
overview, 160
reporting, 163
thread state analysis, 196
SAS (Serial Attached SCSI) disk interface, 442
SATA (Serial ATA) disk interface, 442
Saturation
applications, 193
CPUs, 226–227, 245–246, 251, 795, 797
defined, 22
disk controllers, 451
flame graphs, 291
I/O, 798
kernels, 798
resource analysis, 38
storage, 797
task capacity, 799
user mutex, 799
Saturation points in scalability, 31
Scalability and scaling
Amdahl’s Law of Scalability, 64–65
CPUs vs. GPUs, 240
multithreading, 227
Universal Scalability Law, 65–66
Scalability ceiling, 64
Scalable Vector Graphics (SVG) files, 164
Scaling governors, 297
Scanning pages, 318–319, 323, 374
Scatter plots
I/O latency, 488
sched command, 141
SCHED_DEADLINE policy, 117
sched subcommand for perf, 272–273, 673, 702
Scheduler latency
delay accounting, 145
run queues, 222
Scheduler tracing off-CPU analysis, 189–190
Schedulers
defined, 220
hardware virtualization, 596–597
multiqueue I/O, 119
Scheduling classes
kernel, 106
priority, 295
Scheduling in Kubernetes, 586
Scratch variables in bpftrace, 770–771
scread tool, 409
script subcommand
flame graphs, 700
script subcommand for perf, 673
Scrubbing file systems, 376
SCSI (Small Computer System Interface)
disks, 442
event logging, 486
scsilatency tool, 487
scsiresult tool, 487
SDT events, 681
Second-level caches in file systems, 362
Sectors in disks
defined, 424
size, 437
zoning, 437
Security boot options, 298–299
SEDA (staged event-driven architecture), 178
SEDF (simple earliest deadline first) schedulers, 595
Seek time in magnetic rotational disks, 436
seeksize tool, 487
seekwatcher tool, 487
Segments
defined, 304
OSI model, 502
process virtual address space, 319
Selective acknowledgment (SACK) algorithm, 514
Selective acknowledgments (SACKs), 510
Self-Monitoring, Analysis and Reporting Technology (SMART) data, 485
self tool, 142
Semaphores for applications, 179
Send packets in NICs, 522
sendfile command, 181
Sequential I/O
Serial ATA (SATA) disk interface, 442
Serial Attached SCSI (SAS) disk interface, 442
Server instances in cloud computing, 580
Service consoles in hardware virtualization, 589
Service thread pools for applications, 178
Service time
defined, 22
Set associative caches, 234
set_ftrace_filter file, 710
Shadow page tables, 593
Shadow statistics, 694
Shards
capacity planning, 73
cloud computing, 582
Shared memory, 310
Shared system buses, 312
Shares in OS virtualization, 614–615, 626
Shell scripting, 184
Shingled Magnetic Recording (SMR) drives, 439
shmsnoop tool, 348
Short-lived processes, 12, 207–208
Short-stroking in magnetic rotational disks, 437
signal function, 779
Simple disk model, 425
Simple earliest deadline first (SEDF) schedulers, 595
Simple Network Management Protocol (SNMP), 55, 137
Simple Storage Service (S3), 585
Simulation benchmarking, 653–654
Simultaneous multithreading (SMT), 220, 225
Single-level cell (SLC) flash memory, 440
Single root I/O virtualization (SR-IOV), 593
Site reliability engineers (SREs), 4
Size
disk sectors, 437
free lists, 317
instruction, 224
virtual memory, 308
working set. See Working set size (WSS)
sizeof function, 779
skbdrop tool, 561
skblife tool, 561
Slab
allocator, 114
process virtual address space, 321–322
slabinfo tool, 142
slabtop tool, 333–334, 394–395
SLC (single-level cell) flash memory, 440
Sleeping state in thread state analysis, 194–197
Sliding windows in TCP, 510
SLOG log in ZFS, 381
Sloth disks, 438
Slow-start in TCP, 510
Slowpath state in Mutex locks, 179
Small Computer System Interface (SCSI)
disks, 442
event logging, 486
smaps tool, 141
SMART (Self-Monitoring, Analysis and Reporting Technology) data, 485
SMP (symmetric multiprocessing), 110
smpcalls tool, 285
SMR (Shingled Magnetic Recording) drives, 439
SMs (streaming multiprocessors), 240
SMT (simultaneous multithreading), 220, 225
Snapshots
btrfs, 382
ZFS, 381
SNMP (Simple Network Management Protocol), 55, 137
SO_BUSY_POLL socket option, 522
SO_REUSEPORT socket option, 117
SO_TIMESTAMP socket option, 529
SO_TIMESTAMPING socket option, 529
so1stbyte tool, 561
soaccept tool, 561
socketio tool, 561
Sockets
BSD, 113
defined, 500
description, 109
local connections, 509
options, 573
tuning, 569
socksize tool, 561
sockstat tool, 561
soconnect tool, 561
soconnlat tool, 561
sofamily tool, 561
Software
Software as a service (SaaS), 634
Software change case study, 18–19
Software events
observability source, 159
recording and tracing, 275–276
software probes, 774
Software resources
capacity planning, 70
Solaris
kernel, 114
Kstat, 160
syscall tracing, 205
top tool Solaris mode, 262
Solid-state disks (SSDs)
cache devices, 117
soprotocol tool, 561
sormem tool, 561
Source code for applications, 172
SPEC (Standard Performance Evaluation Corporation) benchmarks, 655–656
Special file systems, 371
Speedup with latency, 7
Spin locks
applications, 179
contention, 198
queued, 118
splice call, 116
SPs (streaming processors), 240
SR-IOV (single root I/O virtualization), 593
SREs (site reliability engineers), 4
ss tool, 145–146, 525, 534–536
SSDs (solid-state disks)
cache devices, 117
Stack helpers, 214
Stack traces
description, 102
Stacks
JIT symbols, 214
operating system disk I/O, 446–449
overview, 102
process virtual address space, 319
protocol, 502
user and kernel, 103
Staged event-driven architecture (SEDA), 178
Stall cycles in CPUs, 223
Standard deviation, 75
Standard Performance Evaluation Corporation (SPEC) benchmarks, 655–656
Starovoitov, Alexei, 121
start subcommand in trace-cmd, 735
Starvation in deadline I/O schedulers, 448
stat subcommand in perf
description, 635
interval statistics, 693
per-CPU balance, 693
shadow statistics, 694
stat subcommand in trace-cmd, 735
Stateful workload simulation, 654
Stateless workload simulation, 653
Statelessness of UDP, 514
States
thread state analysis, 193–197
Static instrumentation
perf events, 681
Static performance tuning
applications methodology, 198–199
CPUs, 252
file systems, 389
Static priority of threads, 242–243
Static probes, 116
Static tracing in perf, 676–677
Statistical analysis in benchmarking, 665–666
baseline, 59
coefficient of variation, 76
multimodal distributions, 76–77
outliers, 77
quantifying performance gains, 73–74
standard deviation, percentiles, and median, 75
statm tool, 141
stats function, 780
statsnoop tool, 409
status tool, 141
stop subcommand in trace-cmd, 735
Storage
benchmark questions, 668
disks. See Disks
Storage array caches, 430
Storage arrays, 446
strace tool
bonnie++ tool, 660
file system latency, 395
limitations, 202
networks, 561
overhead, 207
tracing, 136
stream subcommand in trace-cmd, 735
Streaming multiprocessors (SMs), 240
Streaming processors (SPs), 240
Streaming workloads in disks, 430–431
Streetlight effect, 42
Stress testing in software change case study, 18
Stripe width of volumes and pools, 383
Striped allocation in XFS, 380
strncmp function, 778
Stub domains in hardware virtualization, 596
Subjectivity, 5
Subsecond-offset heat maps, 289
sum function in bpftrace, 780
Summary-since-boot values monitoring, 79
Superblocks in VFS, 373
superping tool, 561
Superscalar architectures for CPUs, 224
SUT (system under test) models, 23
SVG (Scalable Vector Graphics) files, 164
Swap areas, defined, 304
Swap capacity in OS virtualization, 613, 616
swapin tool, 348
swapon tool
disks, 487
memory, 331
Swapping
defined, 304
Swapping state
delay accounting, 145
thread state analysis, 194–197
Symbol churn, 214
Symbols, missing, 214
Symmetric multiprocessing (SMP), 110
SYN backlogs, 519
Synchronization primitives for applications, 179
Synchronous interrupts, 97
Synchronous writes, 366
syncsnoop tool
BCC, 756
file systems, 409
Synthetic events in hist triggers, 731–733
SysBench system benchmark, 294
syscount tool
BCC, 756
CPUs, 285
file systems, 409
perf-tools, 744
sysctl tool
congestion control, 570
schedulers, 296
SCSI logging, 486
sysstat tool package, 131
System activity reporter. See sar (system activity reporter)
System calls
analysis, 192
connect latency, 528
defined, 90
file system latency, 385
micro-benchmarking for, 61
observability source, 159
send/receive latency, 528
System design, benchmarking for, 642
system function in bpftrace, 770, 779
System statistics, monitoring, 138
System under test (SUT) models, 23
System-wide CPU profiling, 268–270
System-wide observability tools, 133
fixed counters, 134
profiling, 135
tracing, 136
System-wide tunable parameters
byte queue limits, 571
device backlog, 569
ECN, 570
production example, 568
queueing disciplines, 571
resource controls, 571
sockets and TCP buffers, 569
TCP backlog, 569
TCP congestion control, 570
Tuned Project, 572
systemd-analyze command, 120
systemd service manager, 120
Systems performance overview, 1–2
cascading failures, 5
cloud computing, 14
complexity, 5
counters, statistics, and metrics, 8–9
multiple performance issues, 6
SystemTap tool, 166
Tagged Command Queueing (TCQ), 437
Tahoe algorithm for TCP congestion control, 513
Tail-based sampling in distributed tracing, 199
Tail Loss Probe (TLP), 117, 512
Task capacity in USE method, 799
task tool, 141
Tasklets with interrupts, 98
Tasks
defined, 90
idle, 99
taskset command, 297
tc tool, 566
tcpdump tool, 136
TCMalloc
allocator, 322
TCP. See Transmission Control Protocol (TCP)
TCP Fast Open (TFO), 117
TCP/IP stack
BSD, 113
kernels, 109
protocol, 502
stack bypassing, 509
TCP segmentation offload (TSO), 521
TCP Small Queues (TSQ), 524
TCP Tail Loss Probe (TLP), 117
TCP TIME_WAIT latency, 528
tcpaccept tool, 561
tcpconnect tool, 561
tcpdump tool
BPF for, 12
description, 526
tcplife tool
BCC, 756
description, 525
overview, 548
tcpnagle tool, 561
tcpreplay tool, 567
tcpretrans tool
BCC, 756
perf-tools, 743
tcptop tool
BCC, 756
description, 526
top processes, 549
tcpwin tool, 561
TCQ (Tagged Command Queueing), 437
Temperature-aware scheduling classes, 243
Temperature sensors for CPUs, 230
Tenancy in cloud computing, 580
contention in hardware virtualization, 595
contention in OS virtualization, 612–613
Tensor processing units (TPUs), 241
Test errors in benchmarking, 646–647
Test step in scientific method, 44–45
Text user interface (TUI), 697
TFO (TCP Fast Open), 117
Theoretical maximum disk throughput, 436–437
Thermal pressure in Linux kernel, 119
THP (transparent huge pages)
Linux kernel, 116
memory, 353
Thread blocks in GPUs, 240
Thread pools in USE method, 52
Thread state analysis, 193–194
software change case study, 19
Threads
CPUs vs. GPUs, 240
defined, 90
flusher, 374
hardware, 221
lightweight, 178
micro-benchmarking, 653
processes, 100
SMT, 225
USE method, 52
3-wide processors, 224
3D NAND flash memory, 440
3D XPoint persistent memory, 441
Three-way handshakes in TCP, 511
Throttling
benchmarks, 661
hardware virtualization, 597
OS virtualization, 626
packets, 522
Throughput
applications, 173
defined, 22
disks, 424
file systems, 360
magnetic rotational disks, 436–437
networks, defined, 500
networks, monitoring, 529
performance metric, 32
resource analysis, 38
solid-state drives, 441
workload analysis, 40
Ticks, clock, 99
tid variable in bpftrace, 777
Time
averages over, 74
event tracing, 58
kernel analysis, 202
Time-based patterns in monitoring, 77–78
time control group, 609
time function in bpftrace, 778
Time scales
Time-series metrics, 8
Time sharing for schedulers, 241
Time slices for schedulers, 242
Time to first byte (TTFB) in networks, 506
TIME_WAIT latency, 528
TIME_WAIT state, 512
timechart subcommand for perf, 673
Timer-based profile sampling, 247–248
Timer-based retransmits, 512
Timerless multitasking, 117
Timestamps
CPU counters, 230
file systems, 371
TCP, 511
tiptop tool, 348
tiptop tool package, 132
TLBs. See Translation lookaside buffers (TLBs)
tlbstat tool
memory, 348
TLC (tri-level cell) flash memory, 440
TLP (Tail Loss Probe), 117, 512
TLS (transport layer security), 113
Tools method
CPUs, 245
disks, 450
networks, 525
overview, 46
Top-level directories, 107
Top of file system layer, file system latency in, 385
top subcommand for perf, 673
top tool
description, 15
file systems, 393
fixed counters, 135
hardware virtualization, 600
lightweight virtualization, 632–633
TPC (Transaction Processing Performance Council) benchmarks, 655
tpoint tool, 744
TPUs (tensor processing units), 241
trace-cmd front end, 132
documentation, 740
function_graph, 739
overview, 734
trace_options file, 710
trace_stat directory, 710
trace subcommand for perf, 673, 701–702
tracepoint probes, 774
Tracepoints
arguments and format string, 148–149
description, 11
Linux kernel, 116
overhead, 150
overview, 146
triggers, 718
tracepoints tracer, 707
Tracing
bpftrace. See bpftrace tool
distributed, 199
dynamic instrumentation, 12
events. See Event tracing
Ftrace. See Ftrace tool
observability tools, 136
OS virtualization, 620, 624–625, 629
perf-tools for, 745
tools, 166
trace-cmd. See trace-cmd front end
tracing_on file, 710
Trade-offs in methodologies, 26–27
Traffic control utility in networks, 566
Transaction costs of latency, 385–386
Transaction groups (TXGs) in ZFS, 381
Transaction Processing Performance Council (TPC) benchmarks, 655
Translation lookaside buffers (TLBs)
CPUs, 232
flushing, 121
MMU, 235
shootdowns, 367
Translation storage buffers (TSBs), 235
Transmission Control Protocol (TCP)
analysis, 531
anti-bufferbloat, 117
autocorking, 117
backlog, tuning, 569
congestion algorithms, 115
congestion avoidance, 508
congestion control, 118, 513, 570
connection latency, 24, 506, 528
duplicate ACK detection, 512
first-byte latency, 528
friends, 509
initial window, 514
Large Receive Offload, 116
lockless listener, 118
New Vegas, 118
offload in packet size, 505
out-of-order packets, 529
retransmits, 117, 512, 528–529
SACK, FACK, and RACK, 514
three-way handshakes, 511
Transmit Packet Steering (XPS) in networks, 523
Transparent huge pages (THP)
Linux kernel, 116
memory, 353
Transport, defined, 424
Transport layer security (TLS), 113
Traps
defined, 90
synchronous interrupts, 97
Tri-level cell (TLC) flash memory, 440
Triggers
hist. See Hist triggers
tracepoints, 718
uprobes, 723
Troubleshooting, benchmarking for, 642
TSBs (translation storage buffers), 235
tshark tool, 559
TSO (TCP segmentation offload), 521
TSQ (TCP Small Queues), 524
TTFB (time to first byte) in networks, 506
TUI (text user interface), 697
Tunable parameters
disks, 494
micro-benchmarking, 390
networks, 567
point-in-time recommendations, 29–30
tradeoffs with, 27
Tuned Project, 572
Tuning
benchmarking for, 642
caches, 60
CPUs. See CPUs tuning
disk caches, 456
file system caches, 389
static performance. See Static performance tuning
turboboost tool, 245
TXGs (transaction groups) in ZFS, 381
Type 1 hypervisors, 587
Type 2 hypervisors, 587
uaddr function, 779
Ubuntu Linux distribution
sar configuration, 162
UDP Generic Receive Offload (GRO), 119
UDP (User Datagram Protocol), 514
udpconnect tool, 561
UDS (Unix domain sockets), 509
uid variable in bpftrace, 777
UIDs (user IDs) for processes, 101
UIO (user space I/O) in kernel bypass, 523
ulimit command, 111
Ultra Path Interconnect (UPI), 236–237
UMA (uniform memory access) memory system, 311–312
UMA (universal memory allocator), 322
Unicast network transmissions, 503
UNICS (UNiplexed Information and Computing Service), 112
Unified buffer caches, 374
Uniform memory access (UMA) memory system, 311–312
UNiplexed Information and Computing Service (UNICS), 112
Units of time for latency, 25
Universal memory allocator (UMA), 322
Universal Scalability Law (USL), 65–66
Unix domain sockets (UDS), 509
Unix kernels, 112
UnixBench benchmarks, 254
Unknown-unknowns, 37
Unrelated disk I/O, 368
unroll function, 776
UPI (Ultra Path Interconnect), 236–237
uprobe_events file, 710
uprobe profiler, 707
uprobe tool, 744
bpftrace, 774
documentation, 155
example, 154
filters, 723
Ftrace, 708
interface and overhead, 154–155
Linux kernel, 117
overview, 153
profiling, 723
return values, 723
triggers, 723
uptime tool
CPUs, 245
description, 15
OS virtualization, 619
uretprobes, 154
usdt probes, 774
USDT (user-level static instrumentation events)
perf, 681
USDT (user-level statically defined tracing), 11, 155–156
USE method. See Utilization, saturation, and errors (USE) method
User address space in processes, 102
User allocation stacks, 345
user control group, 609
User Datagram Protocol (UDP), 514
User IDs (UIDs) for processes, 101
User land, 90
User-level static instrumentation events (USDT)
perf, 681
User-level statically defined tracing (USDT), 11, 155–156
User mutex in USE method, 799
User space, defined, 90
User space I/O (UIO) in kernel bypass, 523
User stacks, 103
User state in thread state analysis, 194–197
User time in CPUs, 226
username variable in bpftrace, 777
USL (Universal Scalability Law), 65–66
ustack function in bpftrace, 779
ustack variable in bpftrace, 778
usym function, 779
util-linux tool package, 131
Utilization
CPUs, 226, 245–246, 251, 795, 797
defined, 22
disk controllers, 451
disk devices, 451
I/O, 798
kernels, 798
networks, 508–509, 526–527, 796–797
performance metric, 32
resource analysis, 38
task capacity, 799
user mutex, 799
Utilization, saturation, and errors (USE) method
applications, 193
benchmarking, 661
functional block diagrams, 49–50
microservices, 53
overview, 47
references, 799
resource controls, 52
resource lists, 49
slow disks case study, 17
software resources, 52, 798–799
uts control group, 609
V-NAND (vertical NAND) flash memory, 440
valgrind tool
CPUs, 286
memory, 348
Variable block sizes in file systems, 375
Variables in bpftrace, 770–771, 777–778
Variance
benchmarks, 647
description, 75
Variation, coefficient of, 76
vCPUs (virtual CPUs), 595
Verification of observability tool results, 167–168
Versions
applications, 172
Vertical NAND (V-NAND) flash memory, 440
Vertical scaling
capacity planning, 72
cloud computing, 581
VFIO (virtual function I/O) drivers, 523
VFS. See Virtual file system (VFS)
VFS layer, file system latency analysis in, 385
vfs_read function in bpftrace, 772–773
vfs_read tool in Ftrace, 706–707
vfscount tool, 409
vfssize tool, 409
vfsstat tool, 409
Vibration in magnetic rotational disks, 438
Virtual CPUs (vCPUs), 595
Virtual disks
defined, 424
utilization, 433
Virtual file system (VFS)
defined, 360
description, 107
interface, 373
Solaris kernel, 114
Virtual function I/O (VFIO) drivers, 523
Virtual machine managers (VMMs)
cloud computing, 580
hardware virtualization, 587–605
Virtual machines (VMs)
cloud computing, 580
hardware virtualization, 587–605
programming languages, 185
Virtual memory
BSD kernel, 113
overview, 305
size, 308
Virtual processors, 220
Virtual-to-guest physical translation, 593
Virtualization
hardware. See Hardware virtualization
OS. See OS virtualization
Visual identification of models, 62–64
Visualizations, 79
blktrace, 479
flame graphs. See Flame graphs
heat maps. See Heat maps
tools, 85
VMMs (virtual machine managers)
cloud computing, 580
hardware virtualization, 587–588
VMs (virtual machines)
cloud computing, 580
hardware virtualization, 587–588
programming languages, 185
vmscan tool, 348
vmstat tool, 8
description, 15
disks, 487
file systems, 393
fixed counters, 134
hardware virtualization, 604
OS virtualization, 619
thread state analysis, 196
VMware ESX, 589
Volume managers, 360
Volumes
defined, 360
W-caches in CPUs, 230
Wait time
disks, 434
I/O, 427
wakeup tracer, 708
wakeup_rt tracer, 708
wakeuptime tool, 756
Warm caches, 37
Warmth of caches, 37
watchpoint probes, 774
Wear leveling in solid-state drives, 441
Weekly patterns, monitoring, 79
Whys in drill-down analysis, 56
Width
instruction, 224
Windows
DiskMon, 493
fibers, 178
hybrid kernel, 92
Hyper-V, 589
LTO and PGO, 122
microkernel, 123
portable executable format, 183
ProcMon, 207
syscall tracing, 205
TIME_WAIT, 512
word size, 310
Wireshark tool, 560
Word size
CPUs, 229
memory, 310
Work queues with interrupts, 98
Working set size (WSS)
benchmarking, 664
micro-benchmarking, 390–391, 653
Workload analysis perspectives, 4–5, 39–40
Workload characterization
benchmarking, 662
methodologies, 54
workload analysis, 39
Workload separation in file systems, 389
Workloads, defined, 22
Write amplification in solid-state drives, 440
Write-back caches
file systems, 365
on-disk, 425
virtual disks, 433
write system calls, 94
Write-through caches, 425
Write type, micro-benchmarking for, 390
writeback tool, 409
Writes starving reads, 448
writesync tool, 409
WSS (working set size)
benchmarking, 664
XDP (Express Data Path) technology
description, 118
event sources, 558
kernel bypass, 523
Xen hardware virtualization
CPU usage, 595
description, 589
I/O path, 594
network performance, 597
observability, 599
xentop tool, 599
xfsdist tool
BCC, 756
file systems, 399
xfsslower tool, 757
XPS (Transmit Packet Steering) in networks, 523
Yearly patterns, monitoring, 79
zero function, 780
ZFS file system
pool statistics, 410
Solaris kernel, 114
zfsdist tool
BCC, 757
file systems, 399
zfsslower tool, 757
ZIO pipeline in ZFS, 381
zoneinfo tool, 142
Zones
free lists, 317
magnetic rotational disks, 437
Solaris kernel, 114
zpool tool, 410