JavaScriptCore Engine Deep Dive 5: Bytecode Generation (Part 3)

Preface

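I originally expected two posts to be enough to cover bytecode generation, but the material ran too long, so I split it into three parts. This post is the last of them.
Note: where an explanation feels unclear, refer to the comments in the code.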

ExpressionNode

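Picking up from Part 2, this post continues the detailed bytecode analysis of several ExpressionNode subclasses. Compared with StatementNode, ExpressionNode exposes many more low-level instructions.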

ConstantNode

Bytecode generation for a constant node simply calls emitLoad to load the constant:

RegisterID* ConstantNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
if (dst == generator.ignoredResult()) return 0;
return generator.emitLoad(dst, jsValue(generator));
}

emitLoad is called here without a third argument because that parameter defaults to SourceCodeRepresentation::Other:

enum class SourceCodeRepresentation {Other,Integer,Double};

RegisterID* BytecodeGenerator::emitLoad(RegisterID* dst, JSValue v, SourceCodeRepresentation sourceCodeRepresentation)
{
RegisterID* constantID = addConstantValue(v, sourceCodeRepresentation);
if (dst) return emitMove(dst, constantID);
return constantID;
}

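emitLoad first calls addConstantValue to add the constant to the constant pool. If a dst register was supplied, it then calls emitMove to copy the constant into dst; otherwise it returns the constant's register directly.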

RegisterID* BytecodeGenerator::addConstantValue(JSValue v, SourceCodeRepresentation sourceCodeRepresentation)
{
if (!v) return addConstantEmptyValue();
// Current offset, i.e. the index this constant gets if it is new
int index = m_nextConstantOffset;
...

EncodedJSValueWithRepresentation valueMapKey { JSValue::encode(v), sourceCodeRepresentation };
JSValueMap::AddResult result = m_jsValueMap.add(valueMapKey, m_nextConstantOffset);

if (result.isNewEntry)
{
// First time this value is added: append a register for it to m_constantPoolRegisters
m_constantPoolRegisters.append(FirstConstantRegisterIndex + m_nextConstantOffset);
// Advance the offset
++m_nextConstantOffset;
// Record the constant in the code block
m_codeBlock->addConstant(v, sourceCodeRepresentation);
}
else
{
// Already in the pool: reuse its existing index
index = result.iterator->value;
}
return &m_constantPoolRegisters[index];
}
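
The deduplication performed by addConstantValue can be sketched with a small model. This is a simplified JavaScript illustration, not JavaScriptCore code: the ConstantPool class, its field names, and the string key encoding are assumptions made for the example. Only the base index 0x40000000 and the idea of keying on the value together with its source representation come from the listing above.

```javascript
// Toy model of the constant pool (hypothetical; not JavaScriptCore code).
const FirstConstantRegisterIndex = 0x40000000;

class ConstantPool {
  constructor() {
    this.indexByKey = new Map(); // plays the role of m_jsValueMap
    this.registers = [];         // plays the role of m_constantPoolRegisters
    this.constants = [];         // plays the role of the CodeBlock's constant list
  }

  // Returns the register index for a constant, adding the constant only once.
  add(value, representation = "Other") {
    // Assumed key encoding: type, printed value, and source representation.
    const key = `${typeof value}:${String(value)}:${representation}`;
    let index = this.indexByKey.get(key);
    if (index === undefined) {
      index = this.constants.length; // the next free offset
      this.indexByKey.set(key, index);
      this.registers.push(FirstConstantRegisterIndex + index);
      this.constants.push(value);
    }
    return this.registers[index];
  }
}

const pool = new ConstantPool();
const first = pool.add("hello");
const second = pool.add(42, "Integer");
const again = pool.add("hello");
const asDouble = pool.add(42, "Double");
```

In this sketch, adding "hello" twice yields the same register, while 42 as Integer and 42 as Double occupy separate slots because the representation is part of the key.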

The implementation of emitMove is short and clear. It emits op_mov dst src:

RegisterID* BytecodeGenerator::emitMove(RegisterID* dst, RegisterID* src)
{
emitOpcode(op_mov);
instructions().append(dst->index());
instructions().append(src->index());
return dst;
}

A note on FirstConstantRegisterIndex:
static const int FirstConstantRegisterIndex = 0x40000000;
Register numbers used in bytecode operations have different meaning according to their ranges:

  • 0x80000000-0xFFFFFFFF Negative indices from the CallFrame pointer are entries in the call frame, see JSStack.h.
  • 0x00000000-0x3FFFFFFF Forwards indices from the CallFrame pointer are local vars and temporaries with the function’s callframe.
  • 0x40000000-0x7FFFFFFF Positive indices from 0x40000000 specify entries in the constant pool on the CodeBlock.

Constants live in the constant pool of the CodeBlock, so the register indices produced here fall in the third range.
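
To make the three ranges concrete, the following sketch classifies a 32-bit register operand. The helper names classifyRegister and constantPoolSlot are hypothetical and exist only for this illustration; the boundaries are the ones quoted above.

```javascript
// Classify a 32-bit register operand by the ranges quoted above (illustrative helpers).
const FirstConstantRegisterIndex = 0x40000000;

function classifyRegister(operand) {
  const index = operand | 0; // interpret the operand as a signed 32-bit integer
  if (index < 0) return "call frame entry";
  if (index >= FirstConstantRegisterIndex) return "constant pool entry";
  return "local or temporary";
}

// For a constant, the slot in the pool is the distance from 0x40000000.
function constantPoolSlot(operand) {
  return (operand | 0) - FirstConstantRegisterIndex;
}
```

For example, an operand of 0x40000002 would be read as the third entry (slot 2) of the constant pool, while 0xFFFFFFFF is a negative index and therefore a call frame entry.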

ThisNode

The this keyword in JavaScript is peculiar: it means different things in different contexts. Do you still remember the first time it tripped you up?

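  • In the global context, this always refers to the global object (window in a browser), in both strict and non-strict mode;
  • Inside an ordinary function called as a plain function there are two cases: (1) in non-strict mode, this defaults to the global object; (2) in strict mode, this is undefined;
  • Inside a method called on an object, this refers to the object the method was called on;
  • In a constructor, this is bound to the newly created object;
  • An arrow function does not bind its own this; it captures the this value of the context in which it is defined and uses that as its own.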
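
Several of these rules can be checked directly. The script below is only an illustration; the names strictThis, counter, Point, and holder are made up for the example, and the global-object cases are left out because they depend on how the script is loaded.

```javascript
// Plain function call in strict mode: this is undefined.
function strictThis() {
  "use strict";
  return this;
}

// Method call: this is the object the method was called on.
const counter = {
  count: 1,
  read() { return this.count; },
};

// Constructor call: this is the newly created object.
function Point(x) {
  this.x = x;
}

// Arrow function: this is captured from the enclosing method call.
const holder = {
  name: "holder",
  makeGetter() { return () => this.name; },
};

const results = {
  strict: strictThis(),
  method: counter.read(),
  constructed: new Point(7).x,
  // call() cannot rebind this for an arrow function
  arrow: holder.makeGetter().call({ name: "other" }),
};
```

The last entry is the one relevant to the bytecode generator: because the arrow function has no this of its own, the value has to be loaded from the enclosing context.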
RegisterID* ThisNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
generator.ensureThis();
if (dst == generator.ignoredResult()) return 0;
RegisterID* result = generator.moveToDestinationIfNeeded(dst, generator.thisRegister());
return result;
}
RegisterID* BytecodeGenerator::ensureThis()
{
// Note: if super is used inside an arrow function, this has to be loaded from the arrow function's lexical environment
if (constructorKind() == ConstructorKind::Derived && needsToUpdateArrowFunctionContext() && isSuperCallUsedInInnerArrowFunction())
emitLoadThisFromArrowFunctionLexicalEnvironment();

// Temporal Dead Zone check
if (constructorKind() == ConstructorKind::Derived || isDerivedConstructorContext())
emitTDZCheck(thisRegister());

return thisRegister();// m_thisRegister
}

Two things need explaining here:

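  • Because arrow functions do not bind this, this and super have to be loaded explicitly for some arrow functions, especially inside constructors.
    In the code below, this must be loaded before eval runs in the constructor. Although super refers to the constructor of the parent class A, it returns an instance of the subclass B; that is, the this inside super() is the B instance. super() here therefore behaves roughly like A.prototype.constructor.call(this), which is why this has to be loaded beforehand.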
var A = class A {
constructor () { this.id = 'A'; }
}

var B = class B extends A {
constructor () {
var arrow = () => super();
arrow();
eval("this.id = 'B'");
}
}
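  • Temporal Dead Zone: ES6 specifies that when a block contains let or const declarations, the declared variables are bound to that block from its very start. Using such a variable before its declaration throws an error. In short, within the block a variable declared with let is unavailable until the declaration has executed. This region is called the temporal dead zone (TDZ). In the same way, before this is used in a derived constructor, the generator has to check whether this is still in the TDZ.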
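
Both points can be observed from JavaScript. The sketch below is a variation on the example above (the errorName bookkeeping is added only for the demonstration): touching this before super() has run raises a ReferenceError, and a super() call made inside an arrow function initializes this for the enclosing constructor.

```javascript
class A {
  constructor() { this.id = "A"; }
}

class B extends A {
  constructor() {
    let errorName = null;
    try {
      this.id = "too early"; // this is still in its TDZ here
    } catch (e) {
      errorName = e.name;
    }
    const callSuper = () => super(); // super() inside an arrow function
    callSuper();                     // initializes this for the constructor
    this.errorName = errorName;
    this.id = "B";
  }
}

const b = new B();
```

After construction, b.errorName holds the name of the error raised by the early access, and b.id is "B" because the assignment after super() succeeded.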

moveToDestinationIfNeeded is equally straightforward and needs no explanation:

// Moves src to dst if dst is not null and is different from src, otherwise just returns src.
RegisterID* moveToDestinationIfNeeded(RegisterID* dst, RegisterID* src)
{
return dst == ignoredResult() ? 0 : (dst && dst != src) ? emitMove(dst, src) : src;
}

SuperNode

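Now let's look at how bytecode is generated for the super keyword. super can only be used inside a class (more precisely, inside class and object-literal methods); using it anywhere else is an error, because there is no way to tell which parent constructor is meant. Classes are only syntactic sugar: JavaScript inheritance is still prototype-based, and super() is essentially a form of constructor borrowing.
A subclass must call super() in its constructor, otherwise creating an instance throws. The subclass has no this object of its own; it receives the this object produced by the parent and then extends it. Without the super() call, the subclass never gets a this object. So the first job of SuperNode is to work out what super refers to.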

RegisterID* SuperNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
RegisterID* result = emitSuperBaseForCallee(generator);
return generator.moveToDestinationIfNeeded(generator.finalDestination(dst), result);
}

static RegisterID* emitSuperBaseForCallee(BytecodeGenerator& generator)
{
RefPtr<RegisterID> homeObject = emitHomeObjectForCallee(generator);
return generator.emitGetById(generator.newTemporary(), homeObject.get(), generator.propertyNames().underscoreProto);
}

The key part is emitHomeObjectForCallee:

static RegisterID* emitHomeObjectForCallee(BytecodeGenerator& generator)
{
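// In a derived-class context or a derived-constructor context, load the home object from the derived constructor
// In this case super is being used as a function: super()
// Although super refers to the constructor of the parent class A, it returns an instance of the subclass B, i.e. this inside super() is the B instance,
// so super() here behaves roughly like A.prototype.constructor.call(this)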
if (generator.isDerivedClassContext() || generator.isDerivedConstructorContext()) {
RegisterID* derivedConstructor = generator.emitLoadDerivedConstructorFromArrowFunctionLexicalEnvironment();
return generator.emitGetById(generator.newTemporary(), derivedConstructor, generator.propertyNames().homeObjectPrivateName);
}

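// Otherwise read it from the callee slot of the call frame. Here super is used as an object; in an ordinary method it points to the parent class's prototype
// super.xxx() is roughly equivalent to A.prototype.xxx()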
RegisterID callee;
callee.setIndex(dd::Callee);
return generator.emitGetById(generator.newTemporary(), &callee, generator.propertyNames().homeObjectPrivateName);
}
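
Seen from the JavaScript side, resolving super through the home object (rather than through this) has a visible effect. The following script is an illustration only, not engine code, and the object names are made up. A method that uses super keeps resolving it against the object it was defined in, even when the method is borrowed by another object.

```javascript
const base = {
  greet() { return "base:" + this.name; },
};

const derived = {
  __proto__: base,
  name: "derived",
  greet() { return "derived>" + super.greet(); },
};

// Borrow the method: its home object is still `derived`, so super still
// resolves to `base`, while this is the object the method was called on.
const other = { name: "other", greet: derived.greet };

const direct = derived.greet();
const borrowed = other.greet();
```

In both calls super.greet is found on base, the prototype of the home object; only the this value passed to it differs.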

BinaryOpNode

First, the definition of BinaryOpNode. A binary operation consists of a left expression, a right expression, and an operator, in this form:

m_expr1 m_opcodeID m_expr2

class BinaryOpNode : public ExpressionNode {
public:
RegisterID* emitStrcat(BytecodeGenerator& generator, RegisterID* destination, RegisterID* lhs = 0, ReadModifyResolveNode* emitExpressionInfoForMe = 0);
private:
RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

protected:
ExpressionNode* m_expr1;
ExpressionNode* m_expr2;
private:
OpcodeID m_opcodeID;
protected:
bool m_rightHasAssignments;
};
RegisterID* BinaryOpNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
OpcodeID opcodeID = this->opcodeID();

// If the operator is + and the left operand is itself an addition whose result is definitely a string, emit a string concatenation
if (opcodeID == op_add && m_expr1->isAdd() && m_expr1->resultDescriptor().definitelyIsString()) {
generator.emitExpressionInfo(position(), position(), position());
return emitStrcat(generator, dst);
}

// If the operator is != and at least one of the two expressions is null, use the op_neq_null instruction
// to test the other expression against null
if (opcodeID == op_neq)
{
if (m_expr1->isNull() || m_expr2->isNull()) {
RefPtr<RegisterID> src = generator.tempDestination(dst);
generator.emitNode(src.get(), m_expr1->isNull() ? m_expr2 : m_expr1);
return generator.emitUnaryOp(op_neq_null, generator.finalDestination(dst, src.get()), src.get());
}
}

ExpressionNode* left = m_expr1;
ExpressionNode* right = m_expr2;

// For != or !==, if the left expression is a string, swap the left and right expressions
if (opcodeID == op_neq || opcodeID == op_nstricteq) {
if (left->isString()) std::swap(left, right);
}

// Generate bytecode for the left expression
RefPtr<RegisterID> src1 = generator.emitNodeForLeftHandSide(left, m_rightHasAssignments, right->isPure(generator));

// Generate bytecode for the right expression
RefPtr<RegisterID> src2 = generator.emitNode(right);

...
// Generate the bytecode for the binary operation itself
RegisterID* result = generator.emitBinaryOp(opcodeID, generator.finalDestination(dst, src1.get()), src1.get(), src2.get(), OperandTypes(left->resultDescriptor(), right->resultDescriptor()));

if (opcodeID == op_urshift && dst != generator.ignoredResult())
return generator.emitUnaryOp(op_unsigned, result, result);
return result;
}

The instruction emitted for a binary operation is straightforward:

RegisterID* BytecodeGenerator::emitBinaryOp(OpcodeID opcodeID, RegisterID* dst, RegisterID* src1, RegisterID* src2, OperandTypes types)
{
emitOpcode(opcodeID);
instructions().append(dst->index());
instructions().append(src1->index());
instructions().append(src2->index());

if (opcodeID == op_bitor ||
opcodeID == op_bitand ||
opcodeID == op_bitxor ||
opcodeID == op_add ||
opcodeID == op_mul ||
opcodeID == op_sub ||
opcodeID == op_div)
instructions().append(types.toInt());

return dst;
}

Some of these bytecode instructions have length 5:

{ "name" : "op_bitand", "length" : 5 },
{ "name" : "op_bitxor", "length" : 5 },
{ "name" : "op_bitor", "length" : 5 },
{ "name" : "op_add", "length" : 5 },
{ "name" : "op_mul", "length" : 5 },
{ "name" : "op_div", "length" : 5 },
{ "name" : "op_sub", "length" : 5 },

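Instructions of length 5 carry one extra OperandTypes operand, which describes the types of the operands, for example whether each one is an integer, a boolean, or a string: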

static const Type TypeInt32 = 1;
static const Type TypeMaybeNumber = 0x04;
static const Type TypeMaybeString = 0x08;
static const Type TypeMaybeNull = 0x10;
static const Type TypeMaybeBool = 0x20;
static const Type TypeMaybeOther = 0x40;
...
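
The difference between the length-4 and length-5 layouts can be modelled in a few lines. This is a hypothetical toy, not JavaScriptCore code: the instruction stream is a plain array, opcodes are strings, and the value 0x0101 passed as the operand types is an arbitrary placeholder. The set of opcodes that receive the extra operand is taken from emitBinaryOp above.

```javascript
// Toy instruction stream (hypothetical model, not JavaScriptCore code).
const OPS_WITH_OPERAND_TYPES = new Set([
  "op_bitor", "op_bitand", "op_bitxor", "op_add", "op_mul", "op_sub", "op_div",
]);

function emitBinaryOp(instructions, opcode, dst, src1, src2, types) {
  const start = instructions.length;
  instructions.push(opcode, dst, src1, src2);
  // Only the arithmetic and bitwise opcodes carry the operand-type hint.
  if (OPS_WITH_OPERAND_TYPES.has(opcode)) instructions.push(types);
  return instructions.length - start; // number of slots the instruction occupies
}

const stream = [];
const addLength = emitBinaryOp(stream, "op_add", 2, 0, 1, 0x0101);
const lessLength = emitBinaryOp(stream, "op_less", 3, 0, 1, 0x0101);
```

Here op_add occupies five slots (opcode, dst, src1, src2, types), while a comparison such as op_less occupies four.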

A unary instruction needs only an opcode and two operands, the destination and the source:

RegisterID* BytecodeGenerator::emitUnaryOp(OpcodeID opcodeID, RegisterID* dst, RegisterID* src)
{
emitOpcode(opcodeID);
instructions().append(dst->index());
instructions().append(src->index());
return dst;
}

EqualNode

EqualNode is a subclass of BinaryOpNode, and the two generate bytecode in essentially the same way:

RegisterID* EqualNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
if (m_expr1->isNull() || m_expr2->isNull()) {
RefPtr<RegisterID> src = generator.tempDestination(dst);
generator.emitNode(src.get(), m_expr1->isNull() ? m_expr2 : m_expr1);
return generator.emitUnaryOp(op_eq_null, generator.finalDestination(dst, src.get()), src.get());
}

ExpressionNode* left = m_expr1;
ExpressionNode* right = m_expr2;
if (left->isString()) std::swap(left, right);

RefPtr<RegisterID> src1 = generator.emitNodeForLeftHandSide(left, m_rightHasAssignments, m_expr2->isPure(generator));
RefPtr<RegisterID> src2 = generator.emitNode(right);
return generator.emitEqualityOp(op_eq, generator.finalDestination(dst, src1.get()), src1.get(), src2.get());
}

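The key method here is emitEqualityOp, which handles two kinds of equality comparison:

  • type checks, where the result of typeof is compared against a string constant;
  • ordinary comparison of two values.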
RegisterID* BytecodeGenerator::emitEqualityOp(OpcodeID opcodeID, RegisterID* dst, RegisterID* src1, RegisterID* src2)
{
if (m_lastOpcodeID == op_typeof)
{
// Type comparison: the previous instruction was op_typeof
int dstIndex;
int srcIndex;

// Read the two operands of the last instruction in instructions() into dstIndex and srcIndex
retrieveLastUnaryOp(dstIndex, srcIndex);

if (src1->index() == dstIndex
&& src1->isTemporary()
&& m_codeBlock->isConstantRegisterIndex(src2->index())
&& m_codeBlock->constantRegister(src2->index()).get().isString())
{
const String& value = asString(m_codeBlock->constantRegister(src2->index()).get())->tryGetValue();
// Since this is a type comparison, replace the typeof with a dedicated type-check instruction
if (value == "undefined") {
rewindUnaryOp();
emitOpcode(op_is_undefined);
instructions().append(dst->index());
instructions().append(srcIndex);
return dst;
}
if (value == "boolean") {
rewindUnaryOp();
emitOpcode(op_is_boolean);
instructions().append(dst->index());
instructions().append(srcIndex);
return dst;
}
if (value == "number") {
rewindUnaryOp();
emitOpcode(op_is_number);
instructions().append(dst->index());
instructions().append(srcIndex);
return dst;
}
if (value == "string") {
rewindUnaryOp();
emitOpcode(op_is_string);
instructions().append(dst->index());
instructions().append(srcIndex);
return dst;
}
if (value == "object") {
rewindUnaryOp();
emitOpcode(op_is_object_or_null);
instructions().append(dst->index());
instructions().append(srcIndex);
return dst;
}
if (value == "function") {
rewindUnaryOp();
emitOpcode(op_is_function);
instructions().append(dst->index());
instructions().append(srcIndex);
return dst;
}
}
}

// Equality comparison instruction for the ordinary case
emitOpcode(opcodeID);
instructions().append(dst->index());
instructions().append(src1->index());
instructions().append(src2->index());
return dst;
}

retrieveLastUnaryOp reads the two operands of the last instruction in instructions() and assigns them to dstIndex and srcIndex:

void BytecodeGenerator::retrieveLastUnaryOp(int& dstIndex, int& srcIndex)
{
size_t size = instructions().size();
dstIndex = instructions().at(size - 2).u.operand;
srcIndex = instructions().at(size - 1).u.operand;
}
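
The rewrite in emitEqualityOp is a small peephole optimization: a typeof followed by a comparison with a known type name is folded into a single type-check instruction. The sketch below models that idea only. It is a hypothetical toy, not JavaScriptCore code: instructions are plain objects in an array, the second operand is passed as a string directly, and the isTemporary and constant-register checks of the original are left out.

```javascript
// Toy peephole (hypothetical model): fold "typeof x == '<type>'" into one instruction.
const TYPE_CHECK_OPCODES = new Map([
  ["undefined", "op_is_undefined"],
  ["boolean", "op_is_boolean"],
  ["number", "op_is_number"],
  ["string", "op_is_string"],
  ["object", "op_is_object_or_null"],
  ["function", "op_is_function"],
]);

// Each instruction is modelled as a plain object such as { op, dst, src }.
function emitEquality(instructions, dst, src1, constantString) {
  const last = instructions[instructions.length - 1];
  const replacement = TYPE_CHECK_OPCODES.get(constantString);
  if (last && last.op === "op_typeof" && last.dst === src1 && replacement) {
    instructions.pop(); // rewind the typeof
    instructions.push({ op: replacement, dst, src: last.src });
  } else {
    instructions.push({ op: "op_eq", dst, src1, src2: constantString });
  }
  return instructions;
}

const folded = emitEquality([{ op: "op_typeof", dst: 5, src: 1 }], 6, 5, "number");
const plain = emitEquality([{ op: "op_mov", dst: 5, src: 1 }], 6, 5, "number");
```

In the first call the typeof disappears and a single op_is_number reads directly from the original source register; in the second call nothing can be folded, so a regular op_eq is appended.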

DotAccessorNode

In JavaScript, methods and properties are normally accessed through the dot operator:

class DotAccessorNode : public ExpressionNode, public ThrowableExpressionData 
{
public:
DotAccessorNode(const JSTokenLocation&, ExpressionNode* base, const Identifier&);

ExpressionNode* base() const { return m_base; }

private:
RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

bool isLocation() const override { return true; }
bool isDotAccessorNode() const override { return true; }

ExpressionNode* m_base;
const Identifier& m_ident;
};

RegisterID* DotAccessorNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
// Check whether the dot access goes through super
bool baseIsSuper = m_base->isSuperNode();

// If it does, call emitSuperBaseForCallee to obtain the super base; otherwise generate bytecode for m_base
RefPtr<RegisterID> base = baseIsSuper ? emitSuperBaseForCallee(generator) : generator.emitNode(m_base);

generator.emitExpressionInfo(divot(), divotStart(), divotEnd());


RegisterID* finalDest = generator.finalDestination(dst);
RegisterID* ret;
if (baseIsSuper)
{
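// For a super access, emitGetById takes four arguments, and the object that this refers to must be determined before the call
// This is because super in JavaScript is only syntactic sugar, so this has to be prepared before the access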
RefPtr<RegisterID> thisValue = generator.ensureThis();
ret = generator.emitGetById(finalDest, base.get(), thisValue.get(), m_ident);
}
else
{
ret = generator.emitGetById(finalDest, base.get(), m_ident);
}

return ret;
}

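Now the bytecode details. Note the instruction op_get_by_id_with_this: a dot access ultimately loads a property or a method by name. One more point to note: the first time a name is used it may not yet be registered in the code block, so addConstant is called to add it and obtain its index.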


RegisterID* BytecodeGenerator::emitGetById(RegisterID* dst, RegisterID* base, RegisterID* thisVal, const Identifier& property)
{
emitOpcode(op_get_by_id_with_this);
instructions().append(kill(dst));
instructions().append(base->index());
instructions().append(thisVal->index());
instructions().append(addConstant(property));
return dst;
}
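
Why a super access needs a separate this operand can be seen from JavaScript itself. The following is an illustrative script with made-up class names: the property is looked up on the parent prototype, but a getter found there still runs with this bound to the current instance.

```javascript
class Shape {
  get label() { return "shape:" + this.kind; }
}

class Circle extends Shape {
  constructor() {
    super();
    this.kind = "circle";
  }
  describe() {
    // Looked up starting at Shape.prototype, but the getter runs with
    // this bound to the Circle instance.
    return super.label;
  }
}

const description = new Circle().describe();
```

The lookup base (Shape.prototype) and the receiver (the Circle instance) are two different objects, which matches the separate base and thisVal operands of the instruction above.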

BracketAccessorNode

class BracketAccessorNode : public ExpressionNode, public ThrowableExpressionData {
public:
BracketAccessorNode(const JSTokenLocation&, ExpressionNode* base, ExpressionNode* subscript, bool subscriptHasAssignments);

ExpressionNode* base() const { return m_base; }
ExpressionNode* subscript() const { return m_subscript; }

bool subscriptHasAssignments() const { return m_subscriptHasAssignments; }

private:
RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

bool isLocation() const override { return true; }
bool isBracketAccessorNode() const override { return true; }

ExpressionNode* m_base;
ExpressionNode* m_subscript;
bool m_subscriptHasAssignments;
};

Bracket refers to square brackets []. A BracketAccessorNode represents an expression of the form A[B], where m_base is A and m_subscript is B.
In JavaScript, brackets can be used on an object like this:

var obj= {};  
// Add a property name to obj; name is a valid identifier, so obj.name would work as well
obj['name'] = 'jack';
// Add a property 2a to obj; 2a is not a valid identifier (it starts with a digit), so obj.2a cannot be used
obj['2a'] = 'test';

RegisterID* BracketAccessorNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
// If the base is super, i.e. it refers to the prototype of the current object, this is a SuperNode access
if (m_base->isSuperNode())
{
// As before, this and the super base have to be prepared first
RefPtr<RegisterID> finalDest = generator.finalDestination(dst);
RefPtr<RegisterID> thisValue = generator.ensureThis();
RefPtr<RegisterID> superBase = emitSuperBaseForCallee(generator);

if (isNonIndexStringElement(*m_subscript)) {
// The subscript is a string that is not an array index, so use it as an Identifier key
const Identifier& id = static_cast<StringNode*>(m_subscript)->value();
generator.emitGetById(finalDest.get(), superBase.get(), thisValue.get(), id);
}
else
{
// Otherwise generate bytecode for m_subscript and use the resulting register as the key
RefPtr<RegisterID> subscript = generator.emitNode(m_subscript);
generator.emitGetByVal(finalDest.get(), superBase.get(), thisValue.get(), subscript.get());
}

generator.emitExpressionInfo(divot(), divotStart(), divotEnd());
return finalDest.get();
}

// Very similar to the code above; the difference is that base is used instead of superBase
RegisterID* ret;
RefPtr<RegisterID> finalDest = generator.finalDestination(dst);

if (isNonIndexStringElement(*m_subscript)) {
RefPtr<RegisterID> base = generator.emitNode(m_base);
ret = generator.emitGetById(finalDest.get(), base.get(), static_cast<StringNode*>(m_subscript)->value());
} else {
RefPtr<RegisterID> base = generator.emitNodeForLeftHandSide(m_base, m_subscriptHasAssignments, m_subscript->isPure(generator));
RegisterID* property = generator.emitNode(m_subscript);
ret = generator.emitGetByVal(finalDest.get(), base.get(), property);
}

generator.emitExpressionInfo(divot(), divotStart(), divotEnd());
return ret;
}

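In some cases BracketAccessorNode behaves much like DotAccessorNode, since the two forms are sometimes interchangeable; with a super base, both end up as op_get_by_id_with_this to load a property or a method. In other cases, though, BracketAccessorNode fetches a value by subscript, as with an array: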

var mycars=new Array()
mycars[0]="Saab"
mycars[1]="Volvo"
mycars[2]="BMW"

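In this case there is no identifier, only a value, so the access goes through emitGetByVal; the overload that takes a this value emits op_get_by_val_with_this: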

RegisterID* BytecodeGenerator::emitGetByVal(RegisterID* dst, RegisterID* base, RegisterID* thisValue, RegisterID* property)
{
emitOpcode(op_get_by_val_with_this);
instructions().append(kill(dst));
instructions().append(base->index());
instructions().append(thisValue->index());
instructions().append(property->index());
return dst;
}
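
The choice between the by-id and by-val paths depends only on the shape of the subscript. The sketch below models that decision. It is hypothetical: accessKindFor and isArrayIndexString are names invented for this illustration, the subscript is modelled as a small object, and the array-index test follows the ECMAScript definition of an array index, which is assumed (not verified here) to match what isNonIndexStringElement checks.

```javascript
// Hypothetical helpers mirroring the branch on isNonIndexStringElement.
function isArrayIndexString(text) {
  const n = Number(text) >>> 0; // ToUint32
  return String(n) === text && n !== 0xFFFFFFFF;
}

// subscript is modelled as { kind: "string", value } or { kind: "expression" }.
function accessKindFor(subscript) {
  if (subscript.kind === "string" && !isArrayIndexString(subscript.value)) {
    return "get_by_id"; // the string literal can serve as an Identifier
  }
  return "get_by_val"; // the key has to be evaluated into a register
}

const byName = accessKindFor({ kind: "string", value: "name" });
const byOddName = accessKindFor({ kind: "string", value: "2a" });
const byDigits = accessKindFor({ kind: "string", value: "0" });
const byExpression = accessKindFor({ kind: "expression" });
```

Under these assumptions obj['name'] and obj['2a'] take the identifier path, while mycars['0'] and mycars[i] take the value path.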

ClassExprNode

The class expression node is comparatively complex:

class ClassExprNode final : public ExpressionNode, public VariableEnvironmentNode
{
public:
ClassExprNode(const JSTokenLocation&, const Identifier&, const SourceCode& classSource,
VariableEnvironment& classEnvironment, ExpressionNode* constructorExpresssion,
ExpressionNode* parentClass, PropertyListNode* instanceMethods, PropertyListNode* staticMethods);

private:
RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

SourceCode m_classSource; // Source code of the class
const Identifier& m_name; // Class name
const Identifier* m_ecmaName;// ECMA name of the class
ExpressionNode* m_constructorExpression;// Constructor node
ExpressionNode* m_classHeritage; // Heritage (extends) expression of the class
PropertyListNode* m_instanceMethods; // List of instance methods
PropertyListNode* m_staticMethods; // List of static methods
};
RegisterID* ClassExprNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
if (!m_name.isNull())
generator.pushLexicalScope(this, BytecodeGenerator::TDZCheckOptimization::Optimize, BytecodeGenerator::NestedScopeType::IsNested);

RefPtr<RegisterID> superclass;
if (m_classHeritage) {
// Generate bytecode for the class heritage expression
superclass = generator.newTemporary();
generator.emitNode(superclass.get(), m_classHeritage);
}

RefPtr<RegisterID> constructor;

// FIXME: Make the prototype non-configurable & non-writable.
if (m_constructorExpression)
{
// Set up the function metadata
FunctionMetadataNode* metadata = static_cast<FuncExprNode*>(m_constructorExpression)->metadata();
metadata->setEcmaName(ecmaName());
metadata->setClassSource(m_classSource);
// Generate bytecode for the constructor
constructor = generator.emitNode(dst, m_constructorExpression);
}
else
{
// If no constructor was provided, generate bytecode for a default constructor
constructor = generator.emitNewDefaultConstructor(generator.finalDestination(dst),
m_classHeritage ? ConstructorKind::Derived : ConstructorKind::Base,
m_name, ecmaName(), m_classSource);
}

// Get the VM's CommonIdentifiers, i.e. the identifiers predefined in the virtual machine
const auto& propertyNames = generator.propertyNames();

// Generate bytecode that creates a new object, which becomes the prototype
RefPtr<RegisterID> prototype = generator.emitNewObject(generator.newTemporary());

if (superclass)
{
// The class inherits from a parent class
RefPtr<RegisterID> protoParent = generator.newTemporary();
generator.emitLoad(protoParent.get(), jsNull());

// If the superclass is undefined, emit a jump to the superclass-is-undefined label
RefPtr<RegisterID> tempRegister = generator.newTemporary();
RefPtr<Label> superclassIsUndefinedLabel = generator.newLabel();
generator.emitJumpIfTrue(generator.emitIsUndefined(tempRegister.get(), superclass.get()), superclassIsUndefinedLabel.get());

// If the superclass is null, emit a jump to the superclass-is-null label
RefPtr<Label> superclassIsNullLabel = generator.newLabel();
generator.emitJumpIfTrue(generator.emitUnaryOp(op_eq_null, tempRegister.get(), superclass.get()), superclassIsNullLabel.get());

// If the superclass is an object, emit a jump to the superclass-is-object label
RefPtr<Label> superclassIsObjectLabel = generator.newLabel();
generator.emitJumpIfTrue(generator.emitIsObject(tempRegister.get(), superclass.get()), superclassIsObjectLabel.get());

// Bind the labels at this position
generator.emitLabel(superclassIsUndefinedLabel.get());
generator.emitLabel(superclassIsObjectLabel.get());

// Get the prototype of the superclass
generator.emitGetById(protoParent.get(), superclass.get(), generator.propertyNames().prototype);

// If the superclass's prototype is an object, null, or a function, emit a jump to the corresponding label
RefPtr<Label> protoParentIsObjectOrNullLabel = generator.newLabel();
generator.emitJumpIfTrue(generator.emitUnaryOp(op_is_object_or_null, tempRegister.get(), protoParent.get()), protoParentIsObjectOrNullLabel.get());
generator.emitJumpIfTrue(generator.emitUnaryOp(op_is_function, tempRegister.get(), protoParent.get()), protoParentIsObjectOrNullLabel.get());

// Bind the label at this position
generator.emitLabel(protoParentIsObjectOrNullLabel.get());

generator.emitDirectPutById(constructor.get(), generator.propertyNames().underscoreProto, superclass.get(), PropertyNode::Unknown);
generator.emitLabel(superclassIsNullLabel.get());
generator.emitDirectPutById(prototype.get(), generator.propertyNames().underscoreProto, protoParent.get(), PropertyNode::Unknown);

emitPutHomeObject(generator, constructor.get(), prototype.get());
}

// Generate bytecode that defines the constructor property on the prototype
RefPtr<RegisterID> constructorNameRegister = generator.emitLoad(generator.newTemporary(), propertyNames.constructor);
generator.emitCallDefineProperty(prototype.get(), constructorNameRegister.get(), constructor.get(), nullptr, nullptr,
BytecodeGenerator::PropertyConfigurable | BytecodeGenerator::PropertyWritable, m_position);

// Generate bytecode that defines the prototype property on the constructor
RefPtr<RegisterID> prototypeNameRegister = generator.emitLoad(generator.newTemporary(), propertyNames.prototype);
generator.emitCallDefineProperty(constructor.get(), prototypeNameRegister.get(), prototype.get(), nullptr, nullptr, 0, m_position);

// Generate bytecode for the static methods
if (m_staticMethods)
generator.emitNode(constructor.get(), m_staticMethods);

// Generate bytecode for the instance methods
if (m_instanceMethods)
generator.emitNode(prototype.get(), m_instanceMethods);

if (!m_name.isNull())
{
Variable classNameVar = generator.variable(m_name);
RefPtr<RegisterID> scope = generator.emitResolveScope(nullptr, classNameVar);
generator.emitPutToScope(scope.get(), classNameVar, constructor.get(), ThrowIfNotFound, InitializationMode::Initialization);
generator.popLexicalScope(this);
}

// Return the constructor
return generator.moveToDestinationIfNeeded(dst, constructor.get());
}

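The classic ES5 approach to inheritance is combination inheritance: methods are inherited through the prototype chain, and properties are inherited by borrowing the parent constructor.
ES6 builds on the same idea but wraps it in a cleaner, more elegant API. It adjusts where a few properties point, standardizes the inheritance steps, and separates setting up inheritance from constructing instances.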

Note these two calls in the code above:

generator.emitCallDefineProperty(prototype.get(), constructorNameRegister.get(), constructor.get(), nullptr, nullptr,
BytecodeGenerator::PropertyConfigurable | BytecodeGenerator::PropertyWritable, m_position);

generator.emitCallDefineProperty(constructor.get(), prototypeNameRegister.get(), prototype.get(), nullptr, nullptr, 0, m_position);

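They link the constructor and its prototype object to each other, which corresponds to the wiring done in typical JavaScript combination inheritance (Sub inherits from Base). Note that the prototype chain itself is set earlier, by the emitDirectPutById calls on underscoreProto, rather than by creating a Base instance: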

Sub.prototype = new Base();
Sub.prototype.constructor = Sub;

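ES6 lets a subclass inherit the parent's properties so that a constructed instance holds them directly, with no need to look them up on the prototype. The new ES6 static methods and static properties (accessible only on the constructor) are likewise implemented through this direct inheritance between classes. Ordinary shared methods are still placed on the prototype chain, for the same reasons and in the same way as in ES5.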

// Generate bytecode for the static methods; note that the first argument is constructor
if (m_staticMethods)
generator.emitNode(constructor.get(), m_staticMethods);

// Generate bytecode for the instance methods; note that the first argument is prototype
if (m_instanceMethods)
generator.emitNode(prototype.get(), m_instanceMethods);
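
The result of all this wiring can be inspected from JavaScript. The script below is an illustration with made-up class names; the comments relating each check to a call in ClassExprNode::emitBytecode reflect my reading of the listing above rather than anything the engine reports.

```javascript
class Base {
  static create() { return new this(); }
  hello() { return "hello"; }
}
class Sub extends Base {}

const ctorDesc = Object.getOwnPropertyDescriptor(Sub.prototype, "constructor");
const protoDesc = Object.getOwnPropertyDescriptor(Sub, "prototype");

const wiring = {
  // constructor <-> prototype, as set up by the two emitCallDefineProperty calls
  ctorPointsBack: ctorDesc.value === Sub,
  ctorFlags: [ctorDesc.writable, ctorDesc.enumerable, ctorDesc.configurable],
  protoFlags: [protoDesc.writable, protoDesc.enumerable, protoDesc.configurable],
  // the two __proto__ links, as set up by the emitDirectPutById calls
  staticChain: Object.getPrototypeOf(Sub) === Base,
  instanceChain: Object.getPrototypeOf(Sub.prototype) === Base.prototype,
  // a static method inherited through the constructor chain
  staticInherited: Sub.create() instanceof Sub,
};
```

The constructor property comes out writable and configurable but not enumerable, matching PropertyConfigurable | PropertyWritable, while the prototype property has none of the three attributes, matching the flags value 0.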

BaseFuncExprNode

class BaseFuncExprNode : public ExpressionNode {
public:
FunctionMetadataNode* metadata() { return m_metadata; }

protected:
BaseFuncExprNode(const JSTokenLocation&, const Identifier&, FunctionMetadataNode*, const SourceCode&, FunctionMode);

FunctionMetadataNode* m_metadata;
};

BaseFuncExprNode has three subclasses: FuncExprNode, ArrowFuncExprNode, and MethodDefinitionNode. The emitBytecode of all three eventually calls emitNewFunctionExpressionCommon.

FuncExprNode

RegisterID* FuncExprNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
//finalDestination: Returns the place to write the final output of an operation.
return generator.emitNewFunctionExpression(generator.finalDestination(dst), this);
}

RegisterID* BytecodeGenerator::emitNewFunctionExpression(RegisterID* dst, FuncExprNode* func)
{
emitNewFunctionExpressionCommon(dst, func->metadata());
return dst;
}

The code for ArrowFuncExprNode and MethodDefinitionNode also ends up in emitNewFunctionExpressionCommon, so neither is analyzed in detail.

ArrowFuncExprNode

RegisterID* ArrowFuncExprNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
return generator.emitNewArrowFunctionExpression(generator.finalDestination(dst), this);
}

RegisterID* BytecodeGenerator::emitNewArrowFunctionExpression(RegisterID* dst, ArrowFuncExprNode* func)
{
emitNewFunctionExpressionCommon(dst, func->metadata());
return dst;
}

MethodDefinitionNode

RegisterID* MethodDefinitionNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
return generator.emitNewMethodDefinition(generator.finalDestination(dst), this);
}

RegisterID* BytecodeGenerator::emitNewMethodDefinition(RegisterID* dst, MethodDefinitionNode* func)
{
emitNewFunctionExpressionCommon(dst, func->metadata());
return dst;
}

The core is here:

void BytecodeGenerator::emitNewFunctionExpressionCommon(RegisterID* dst, FunctionMetadataNode* function)
{
// Build a function from the FunctionMetadataNode, store it in m_functionExprs (the function container of m_codeBlock), and return its index
unsigned index = m_codeBlock->addFunctionExpr(makeFunction(function));

// Default opcode
OpcodeID opcodeID = op_new_func_exp;
switch (function->parseMode())
{
// Generator function
case SourceParseMode::GeneratorWrapperFunctionMode: {
opcodeID = op_new_generator_func_exp;
break;
}
// Arrow function
case SourceParseMode::ArrowFunctionMode: {
opcodeID = op_new_arrow_func_exp;
break;
}
default: {
break;
}
}

// op_new_func_exp / op_new_generator_func_exp / op_new_arrow_func_exp all have instruction length 4
emitOpcode(opcodeID);
instructions().append(dst->index());
instructions().append(scopeRegister()->index());
instructions().append(index);
}

The logic is simple. It mainly uses three instructions: op_new_func_exp, op_new_generator_func_exp, and op_new_arrow_func_exp.
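
The three opcodes correspond to three kinds of function objects that behave differently at run time. The script below shows some of those differences from the JavaScript side. The opcode named in each comment is an assumption based on the switch above (in particular, that a function* expression is parsed in GeneratorWrapperFunctionMode); the script itself only checks standard language behavior.

```javascript
const plainFn = function () {};                 // assumed: op_new_func_exp
const generatorFn = function* () { yield 1; };  // assumed: op_new_generator_func_exp
const arrowFn = () => 1;                        // assumed: op_new_arrow_func_exp

const observed = {
  // An ordinary function expression gets a prototype object and can be used with new.
  plainHasPrototype: typeof plainFn.prototype === "object",
  // An arrow function has no prototype property at all.
  arrowHasPrototype: "prototype" in arrowFn,
  // A generator function does not inherit directly from Function.prototype.
  generatorIsPlainFunction: Object.getPrototypeOf(generatorFn) === Function.prototype,
  // Calling a generator function returns an iterator instead of running the body.
  firstYield: generatorFn().next().value,
};
```

These differences suggest why the generator picks a distinct opcode per parse mode while the operands (destination, scope, function index) stay the same.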

Summary

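The bytecode generation posts have been the hardest part of this series to analyze so far. The main difficulties were the sheer volume of code and the need for a solid understanding of ES5, ES6, and C++ all at once. My own ability is limited, so some of the analysis is bound to be incomplete or wrong. Corrections are very welcome, so that I can improve it over time.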

------------- End of article. Thank you for reading. -------------

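Title: JavaScriptCore Engine Deep Dive 5: Bytecode Generation (Part 3)

Author: lingyun

Published: 2018-09-15 00:09

Last updated: 2018-11-10 12:11

Original link: https://tsuijunxi.github.io/2018/09/15/JavaScriptCore引擎深度解析-5-字节码生成篇(下)/

License: Attribution-NonCommercial-NoDerivatives 4.0 International. Please keep the original link and the author's name when reposting.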
