记录一下上课+做实验时产生的问题、思考与收获
[1] Lecture1 Introduction
[1-1] what is Static Analysis?
Static Analysis is different with dynamic analysis, analyzers need to construct static bird’s view of the whole workflows. Besides, static analyzers usually focus on the consequence of the analysis rather than the process.
[1-2] What is a good Static Analysis?
Before introducing what is a good static analysis, we need to find out what is ‘sound’ and what is ‘complete’.
sound: We describe a result is sound when it’s reasonable. When a result is sound, its false negative is low.
complete: We describe a result is complete because it’s definitely correct. When a result is complete, its false positive is low.
And in static analysis, we usually focus on soundness
rather than. I think it’s probably because we’d rather manually check the result than leave any posibility away.
[1-3] Rice’s Theory
Unfortunately, Rice has proved (maybe? I forgot) that we can’t give a perfect
static analysis to prove non-trivial properties on a re programming language.
[1-4] Example(Bird’s view of techniques)
[1-4-1] Abstract
1, 0, -1 respectively represent [+] ,[0] and [-]
x < y ? 1 : -1
represents [top] which means unknown
1 / 0
represents [down] which means undefined
[1-4-2] Over-Approximation
considering of condition
considering of control flow
[2] Lecture2 Intermediate Representation
[2-1] What’s the difference between compiler & static analyzer?
Compiler: source code -(Lexical analysis)-> Tokens -(Syntax analysis)-> AST -(Semantic analysis)-> Decorated AST -(Translator)-> IR -(Code generator)-> Binary
And static analyzers are usually based on IR layer.
[2-2] 3AC: 3-Address-Code
A common form of IR is 3AC. There is only one operator at the right side of the equation. For example:
1 | p = a + c |
[2-3] Control Flow Graph(CFG)
Control Flow Graph is constructed with nodes and edges. Nodes in graph is called Basic blocks(BB), and edges in graph is just called edge.
[2-3-1] Basic Blocks(BB)
properties:
- BB can be entered only at the start instructions.
- BB can be ended only by jump instructions
construct algorithm:
- Input: list of IR instructions
- Output: list of BBs
- Algo:
- find the leader of a block
- startpoint of the instruction list
- endpoint of a jump instruction
- sequent instruction of a jump instrcution
- break at the jump instructions
- find the leader of a block
[2-3-2] When should we add edges?
- 2 edges follow branch instruction
- 1 edge follows jump instr without condition
[3] Lecture3 Data Flow Analysis I (Overview & Applications)
[3-1] Overview of Data flow analysis
[3-1-1] what is data flow analysis?
How _ Data(Abstraction) Flow(Safe-approximation) through the _ of CFG
Node: Transfer function
Edge: Control flow function(union the signs at merges)
[3-1-2] what’s the difference between may-analysis & must-analysis?
may:
must:
[3-2] Preliminaries
[3-2-1] Input & Output state
property 1: Every execution of an IR statement transforms
property 2: The Input/output state is associated with a program point(from view of state-machine)
[3-2-2] Symbols
meet operator: ∩