










Von Neumann Processor Journey (Thus Far)

1st-, 2nd-, 3rd-, and 4th-level caches

































Host Coupling

























 

 Outline

 • History of Consumer Level Graphics

 • Motivation

 • The Modern Graphics Pipeline

 • Shader Programs

 • GPU Architectural Features

 • Other uses for GPUs

 Based on "From Shader Code to a Teraflop: How GPU Shader Cores Work" by Kayvon Fatahalian, Stanford University















**Graphics Engines Conceptual Model**

• Apply simple sequential programs to all items in a set
  • E.g., vertices, faces, fragments, pixels
• Many programs (called shaders) are connected in series to form a graphics pipeline
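The conceptual model above can be sketched in a few lines of Python. This is only an illustration, not how a real graphics API is structured: the stage functions `scale_shader` and `flatten_shader`, and the helper `run_pipeline`, are hypothetical names invented for this example. Each "shader" is a small sequential program that sees one item at a time, and the pipeline applies each stage to every item in the set before handing the results to the next stage.

```python
from typing import Callable, List, Tuple

Vertex = Tuple[float, float, float]
Shader = Callable[[Vertex], Vertex]


def scale_shader(v: Vertex) -> Vertex:
    """Hypothetical stage: scale one vertex by a factor of 2."""
    x, y, z = v
    return (2.0 * x, 2.0 * y, 2.0 * z)


def flatten_shader(v: Vertex) -> Vertex:
    """Hypothetical stage: project one vertex onto the z = 0 plane."""
    x, y, _ = v
    return (x, y, 0.0)


def run_pipeline(items: List[Vertex], stages: List[Shader]) -> List[Vertex]:
    """Apply each stage, in series, to every item in the set."""
    for stage in stages:
        # Every item is processed independently by the same small program.
        items = [stage(item) for item in items]
    return items


vertices = [(1.0, 2.0, 3.0), (0.5, -1.0, 4.0)]
result = run_pipeline(vertices, [scale_shader, flatten_shader])
print(result)
```

Because no stage reads or writes any item other than the one it was given, the inner loop over items could run in any order or all at once, which is the property the hardware exploits.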







Flynn's Taxonomy

Proposed by Michael Flynn in 1966

• SISD – single instruction, single data
  • Traditional uniprocessor
• SIMD – single instruction, multiple data
  • Execute the same instruction on many data elements
  • Vector machines, graphics engines
• MIMD – multiple instruction, multiple data
  • Each processor executes its own instructions
  • Multicores are all built this way
• SPMD – single program, multiple data (extension proposed by Frederica Darema)
  • MIMD machine, each node is executing the same code
• MISD – multiple instruction, single data
  • Systolic array
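As a rough illustration of the SISD/SIMD distinction, the sketch below adds two arrays in both styles and counts how many "instructions" are issued. It is a toy model in plain Python, not real vector hardware: the function names, the `lanes` parameter, and the `issued` counter are assumptions made for this example. In the SISD version each add touches one pair of elements; in the SIMD version one issued add covers up to `lanes` elements.

```python
from typing import List, Tuple


def sisd_add(a: List[int], b: List[int]) -> Tuple[List[int], int]:
    """One add per data element: instructions issued == number of elements."""
    out = []
    issued = 0
    for x, y in zip(a, b):
        out.append(x + y)  # one instruction, one data element
        issued += 1
    return out, issued


def simd_add(a: List[int], b: List[int], lanes: int = 4) -> Tuple[List[int], int]:
    """One 'vector add' per group of `lanes` elements (toy model)."""
    out = []
    issued = 0
    for i in range(0, len(a), lanes):
        # A single issued instruction operates on every lane in the group.
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        issued += 1
    return out, issued


a = list(range(8))
b = [10] * 8
sisd_out, sisd_issued = sisd_add(a, b)
simd_out, simd_issued = simd_add(a, b, lanes=4)
print(sisd_issued, simd_issued)
```

Both versions compute the same result; the difference is how much instruction fetch and decode work is spent per element, which is why graphics engines favour the SIMD form.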













Feeding Cores with Data

• Recall that we removed the hardware that lets a CPU avoid stalls
  • Out-of-order execution (OoOE), branch predictors, and prefetching are all gone
• The question remains: how do we avoid execution stalls?

Managing In-Flight Instruction Stream









