静态程序分析#
- StartAt:
2022-7-16
南京大学《软件分析》课程。
1. Course Introduction#
- Sound and Complete#
- Sound by default
在静态分析中,几乎所有的分析方法都追求 Sound 而非 Complete。
- 静态分析的思维
Symbol evaluation,避免使用编程语言本身的思维。 TODO
- 静态分析的流程
抽象
近似
Transfer function <= 问题 + 语义
控制流 <= flow merging
- Top#
\(\top\),用来表示一个符号是未知的(Unknown)
- Bottom#
\(\bot\), 用来表示一个符号是未定义的(Undefined )
2. Intermediate Representation#
Intermediate Representation (IR) 是介于高级编程语言和字节码之间的一种中间表现形式,通常是 语言无关 + 机器无关的。用于编译优化和程序分析。
AST 和 IR 的异同:
常见的 IR 形式:
常见的 IR 语言:
C - 天然的汇编抽象语言
LLVM IR
Soot (3AC)
Jimple (Typed 3AC)
- Three-Addrress Code#
- Static Single-Assignment#
- Control Flow Graph#
- Basic Block#
Entry of BB must be dest of a JUMP instr. Exit of BB must be a JUMP instr.
Some Soot Stuffs#
$x
: temp var of soot.
Java invoke type:
- special:
constructor, super class mentod, private method
- virtual:
instance method call (virtual dispatch)
- interface:
can not optmizetion, check interface implementation
- static:
call static method
- dynamic:
for lambda
3. Data Flow Analysis I#
- Data Flow Analysis#
How Data is Flow on
#️⃣CFG
?- Data:
is application-specific data, an abstraction (such as
#️⃣Top
,#️⃣Bottom
)- Flow:
through the nodes(
#️⃣Basic Block
s, statements) and edges (control flows) of CFG (program). Safe-approximation?
Different data flow analysis application has:
different data abstraction
different flow safe-approximation strategies (策略)
different transfer functions and control-flow handlings
- May Analysis::#
Output information may be true (over-approximation, sound?).
Most static analyses is May Analysis.
一般情况下 state 初始化为空集。
- Must Analysis::#
Output information must be must (under-approximation, complete?).
一般情况下 state 初始化为全集。
Over- and under-approximation are both safty of analysis.
- Input/Ouput States
The set of possible data flow values is the domain for this application.
- Forward/Backward Analysis
Nothing special.
Application#
Reaching Definitions Analysis#
Intra-producedural CFG
No alias (alias: Pointer Analysis)
Dummy definition.
Bit Vectors
不动点计算
AB => A but exclude B
模版
停机:IN[S] 不变时,OUT[S] 不变
OUT[S] never shrinks
4. Data Flow Analysis II#
Live Variable Analysis#
- Abstraction
Use a bit vector, one bit for one variable assignment (
v = x
). 1 for alive and 0 for dead.- Transfer function: (no formal)
Do backward iteratiom, find variable use statements in every basic block, once found, set corrspoind bit according its value.
Merge:
1 + ? -> 1
(It is a may analysis)How to determine value?
- Initial State
All variables are dead, bit vector:
000...00
如果你有任何意见,请在此评论。 如果你留下了电子邮箱,我可能会通过 回复你。