JavaScriptCore Engine Deep Dive 5: Bytecode Generation (Part 3)

Preface

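I originally expected two posts to be enough to cover bytecode generation, but the material ran too long, so it was split into three parts. This is the last post on bytecode generation.
Note: where an explanation feels unclear, please refer to the comments in the code.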

ExpressionNode

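Continuing from Part 2, this post walks through bytecode generation for more subclasses of ExpressionNode. Compared with StatementNode, ExpressionNode exposes many more low-level instructions.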

ConstantNode

Bytecode generation for a constant node simply calls emitLoad to load the constant

RegisterID* ConstantNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    if (dst == generator.ignoredResult()) return 0;
    return generator.emitLoad(dst, jsValue(generator));
}

emitLoad is called without a third argument here because that parameter defaults to SourceCodeRepresentation::Other

enum class SourceCodeRepresentation {Other,Integer,Double};
RegisterID* BytecodeGenerator::emitLoad(RegisterID* dst, JSValue v, SourceCodeRepresentation sourceCodeRepresentation)
{
    RegisterID* constantID = addConstantValue(v, sourceCodeRepresentation);
    if (dst) return emitMove(dst, constantID);
    return constantID;
}

emitLoad first calls addConstantValue to register the constant, then calls emitMove to move the resulting register into dst (when dst is non-null)

RegisterID* BytecodeGenerator::addConstantValue(JSValue v, SourceCodeRepresentation sourceCodeRepresentation)
{
    if (!v) return addConstantEmptyValue();
    // Current offset
    int index = m_nextConstantOffset;
    ...

    EncodedJSValueWithRepresentation valueMapKey { JSValue::encode(v), sourceCodeRepresentation };
    JSValueMap::AddResult result = m_jsValueMap.add(valueMapKey, m_nextConstantOffset);

    if (result.isNewEntry)
    {
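        // First time this value is added: append a register for it to m_constantPoolRegisters
        m_constantPoolRegisters.append(FirstConstantRegisterIndex + m_nextConstantOffset);
        // Advance the offset
        ++m_nextConstantOffset;
        // Record the value in the CodeBlock's constant list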
        m_codeBlock->addConstant(v, sourceCodeRepresentation);
    } 
    else
    {
        // Already present: reuse its existing index
        index = result.iterator->value;
    }
    return &m_constantPoolRegisters[index];
}
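
The deduplication performed by addConstantValue can be sketched in isolation. The following is a minimal model, not WebKit code: the class ConstantPool and the constant kFirstConstantRegisterIndex are hypothetical names, values are reduced to plain 64-bit encodings, and the SourceCodeRepresentation part of the key is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for FirstConstantRegisterIndex.
constexpr int kFirstConstantRegisterIndex = 0x40000000;

// Simplified constant pool: each distinct encoded value gets exactly one slot.
class ConstantPool {
public:
    // Returns the register index for `encoded`, adding a slot on first use.
    int add(std::uint64_t encoded)
    {
        auto result = m_slotByValue.emplace(encoded, static_cast<int>(m_values.size()));
        if (result.second)
            m_values.push_back(encoded); // new entry: grow the pool
        return kFirstConstantRegisterIndex + result.first->second;
    }

    std::size_t size() const { return m_values.size(); }

private:
    std::unordered_map<std::uint64_t, int> m_slotByValue;
    std::vector<std::uint64_t> m_values;
};
```

Adding the same value twice yields the same register index, so repeated literals in a function share one constant slot.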

The implementation of emitMove is short and clear: op_mov dst src

RegisterID* BytecodeGenerator::emitMove(RegisterID* dst, RegisterID* src)
{
    emitOpcode(op_mov);
    instructions().append(dst->index());
    instructions().append(src->index());
    return dst;
}

A note on FirstConstantRegisterIndex:
static const int FirstConstantRegisterIndex = 0x40000000;
Register numbers used in bytecode operations have different meaning according to their ranges:

  • 0x80000000-0xFFFFFFFF Negative indices from the CallFrame pointer are entries in the call frame, see JSStack.h.
  • 0x00000000-0x3FFFFFFF Forwards indices from the CallFrame pointer are local vars and temporaries with the function’s callframe.
  • 0x40000000-0x7FFFFFFF Positive indices from 0x40000000 specify entries in the constant pool on the CodeBlock.

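Constants are stored in the CodeBlock's constant pool, so the register indices produced here fall into the third range (0x40000000-0x7FFFFFFF).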
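
The three ranges can be made concrete with a small classifier. This is an illustrative sketch only: RegisterKind, classifyRegister and constantPoolSlot are hypothetical names, and the boundaries are taken directly from the comment quoted above.

```cpp
#include <cassert>
#include <cstdint>

enum class RegisterKind { CallFrameEntry, LocalOrTemporary, Constant };

// Classifies a bytecode register index by the three ranges quoted above.
inline RegisterKind classifyRegister(std::int32_t index)
{
    if (index < 0)
        return RegisterKind::CallFrameEntry;   // 0x80000000-0xFFFFFFFF
    if (index >= 0x40000000)
        return RegisterKind::Constant;         // 0x40000000-0x7FFFFFFF
    return RegisterKind::LocalOrTemporary;     // 0x00000000-0x3FFFFFFF
}

// Position of a constant register inside the constant pool.
inline int constantPoolSlot(std::int32_t index)
{
    return index - 0x40000000;
}
```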

ThisNode

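The this keyword in JavaScript is peculiar: it means different things in different contexts. Do you remember the first time it tripped you up?

  • In the global context, this always refers to the global object (window in a browser), whether or not strict mode is on;
  • Inside an ordinary function there are two cases: (1) in non-strict mode, this defaults to the global object window; (2) in strict mode, this is undefined;
  • Inside an object's method, this refers to the object the method is called on;
  • In a constructor, this is bound to the newly created object;
  • An arrow function does not bind this; it captures the this of the context where it is defined and uses that value as its own.
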
RegisterID* ThisNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    generator.ensureThis();
    if (dst == generator.ignoredResult()) return 0;
    RegisterID* result = generator.moveToDestinationIfNeeded(dst, generator.thisRegister());
    return result;
}
RegisterID* BytecodeGenerator::ensureThis()
{
    // Note: if an inner arrow function calls super, this must be loaded from the arrow function's lexical environment
    if (constructorKind() == ConstructorKind::Derived && needsToUpdateArrowFunctionContext() && isSuperCallUsedInInnerArrowFunction())
        emitLoadThisFromArrowFunctionLexicalEnvironment();

    // Temporal Dead Zone check
    if (constructorKind() == ConstructorKind::Derived || isDerivedConstructorContext())
        emitTDZCheck(thisRegister());

    return thisRegister();// m_thisRegister
}

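Two things need explaining here:

  • Because arrow functions do not bind this, this and super have to be loaded for certain arrow functions, especially inside constructors.
    As the code below shows, this must be loaded before eval is called in the constructor. super stands for the constructor of parent class A, yet it yields an instance of subclass B, i.e. this inside super refers to B. super() here behaves roughly like A.prototype.constructor.call(this), so this has to be loaded beforehand.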
 var A = class A {
   constructor () { this.id = 'A'; }
 }

 var B = class B extends A {
    constructor () {
       var arrow = () => super();
       arrow();
       eval("this.id = 'B'");
    }
 }
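  • Temporal Dead Zone: ES6 specifies that if a block contains let or const declarations, the declared variables form a closed scope from the start of the block. Using such a variable before its declaration throws an error. In short, within a block a variable declared with let is unavailable until the declaration has run. This is called the "temporal dead zone" (TDZ). Likewise, before this is used here, the generator checks whether the object it denotes is still in the TDZ.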
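
The idea behind a TDZ check can be modeled as a slot that stays empty until it is initialized, with reads of an empty slot raising an error. This is a conceptual sketch under that assumption, not the engine's implementation; Binding and readThrows are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical model of a binding that starts out uninitialized.
class Binding {
public:
    void initialize(int value)
    {
        m_value = value;
        m_initialized = true;
    }

    // Mirrors the idea of a TDZ check: reading before initialization is an error.
    int read() const
    {
        if (!m_initialized)
            throw std::runtime_error("ReferenceError: binding is in its temporal dead zone");
        return m_value;
    }

private:
    int m_value = 0;
    bool m_initialized = false;
};

// Helper that reports whether reading the binding throws.
inline bool readThrows(const Binding& binding)
{
    try {
        binding.read();
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

In a derived constructor, this plays the role of such a binding: it only becomes readable once super() has returned.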

moveToDestinationIfNeeded is equally straightforward and needs no explanation

// Moves src to dst if dst is not null and is different from src, otherwise just returns src.
RegisterID* moveToDestinationIfNeeded(RegisterID* dst, RegisterID* src)
{
    return dst == ignoredResult() ? 0 : (dst && dst != src) ? emitMove(dst, src) : src;
}

SuperNode

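Now for how bytecode is generated for the super keyword. super may only be used inside a class method (or an object-literal method); using it anywhere else is an error, because there is no way to tell which constructor is the parent. Classes are only syntactic sugar: JavaScript inheritance is still prototype-based, and super is essentially a form of constructor borrowing.
A derived class must call super() in its constructor, otherwise creating an instance throws. The subclass has no this object of its own; it inherits the this object created by the parent and then refines it. Without the super call, the subclass never obtains a this object. So the first thing SuperNode has to settle is what super refers to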

RegisterID* SuperNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    RegisterID* result = emitSuperBaseForCallee(generator);
    return generator.moveToDestinationIfNeeded(generator.finalDestination(dst), result);
}

static RegisterID* emitSuperBaseForCallee(BytecodeGenerator& generator)
{
    RefPtr<RegisterID> homeObject = emitHomeObjectForCallee(generator);
    return generator.emitGetById(generator.newTemporary(), homeObject.get(), generator.propertyNames().underscoreProto);
}

The key part is in emitHomeObjectForCallee:

static RegisterID* emitHomeObjectForCallee(BytecodeGenerator& generator)
{
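    // In a derived-class context or a derived-constructor context, load the home object from the derived constructor
    // Here super is used as a function: super()
    // super stands for the constructor of parent class A but yields an instance of subclass B, i.e. this inside super is B,
    // so super() behaves roughly like A.prototype.constructor.call(this)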
    if (generator.isDerivedClassContext() || generator.isDerivedConstructorContext()) {
        RegisterID* derivedConstructor = generator.emitLoadDerivedConstructorFromArrowFunctionLexicalEnvironment();
        return generator.emitGetById(generator.newTemporary(), derivedConstructor, generator.propertyNames().homeObjectPrivateName);
    }

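    // Otherwise read it from the callee slot of the call frame. Here super is used as an object and, in an ordinary method, refers to the parent's prototype
    // super.xxx() behaves like A.prototype.xxx.call(this)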
    RegisterID callee;
    callee.setIndex(JSStack::Callee);
    return generator.emitGetById(generator.newTemporary(), &callee, generator.propertyNames().homeObjectPrivateName);
}

BinaryOpNode

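First, the definition of BinaryOpNode. A binary operation consists of a left expression, a right expression and an operator, in the following form: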

m_expr1 m_opcodeID m_expr2
class BinaryOpNode : public ExpressionNode {
    public:
        RegisterID* emitStrcat(BytecodeGenerator& generator, RegisterID* destination, RegisterID* lhs = 0, ReadModifyResolveNode* emitExpressionInfoForMe = 0);
    private:
        RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

    protected:
        ExpressionNode* m_expr1;
        ExpressionNode* m_expr2;
    private:
        OpcodeID m_opcodeID;
    protected:
        bool m_rightHasAssignments;
    };
RegisterID* BinaryOpNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    OpcodeID opcodeID = this->opcodeID();

    // If the operator is + and the left operand is itself an addition whose result is definitely a string, emit a string concatenation
    if (opcodeID == op_add && m_expr1->isAdd() && m_expr1->resultDescriptor().definitelyIsString()) {
        generator.emitExpressionInfo(position(), position(), position());
        return emitStrcat(generator, dst);
    }

    // If the operator is != and at least one operand is null, use the op_neq_null instruction
    // to test the other operand against null
    if (opcodeID == op_neq) 
    {
        if (m_expr1->isNull() || m_expr2->isNull()) {
            RefPtr<RegisterID> src = generator.tempDestination(dst);
            generator.emitNode(src.get(), m_expr1->isNull() ? m_expr2 : m_expr1);
            return generator.emitUnaryOp(op_neq_null, generator.finalDestination(dst, src.get()), src.get());
        }
    }

    ExpressionNode* left = m_expr1;
    ExpressionNode* right = m_expr2;

    // For != or !==, if the left operand is a string, swap the left and right operands
    if (opcodeID == op_neq || opcodeID == op_nstricteq) {
        if (left->isString()) std::swap(left, right);
    }

    // Emit bytecode for the left expression
    RefPtr<RegisterID> src1 = generator.emitNodeForLeftHandSide(left, m_rightHasAssignments, right->isPure(generator));

    // Emit bytecode for the right expression
    RefPtr<RegisterID> src2 = generator.emitNode(right);

    ...
    // Emit the binary operation itself
    RegisterID* result = generator.emitBinaryOp(opcodeID, generator.finalDestination(dst, src1.get()), src1.get(), src2.get(), OperandTypes(left->resultDescriptor(), right->resultDescriptor()));

    if (opcodeID == op_urshift && dst != generator.ignoredResult())
        return generator.emitUnaryOp(op_unsigned, result, result);
    return result;
}
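
The trailing op_unsigned after op_urshift reflects the fact that JavaScript's >>> operator produces an unsigned 32-bit result. A minimal sketch of that arithmetic follows; unsignedRightShift is a hypothetical helper that models the language semantics, not the engine's code.

```cpp
#include <cassert>
#include <cstdint>

// Models JavaScript's `lhs >>> rhs` on 32-bit operands: the left operand is
// reinterpreted as an unsigned 32-bit integer and the shift count is masked to 5 bits.
inline std::uint32_t unsignedRightShift(std::int32_t lhs, std::int32_t rhs)
{
    return static_cast<std::uint32_t>(lhs) >> (static_cast<std::uint32_t>(rhs) & 31u);
}
```

For example, -1 >>> 0 is 4294967295, a value that does not fit in a signed 32-bit integer, which is why the result needs to be treated as unsigned afterwards.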

The instruction emitted for a binary operation is straightforward:

RegisterID* BytecodeGenerator::emitBinaryOp(OpcodeID opcodeID, RegisterID* dst, RegisterID* src1, RegisterID* src2, OperandTypes types)
{
    emitOpcode(opcodeID);
    instructions().append(dst->index());
    instructions().append(src1->index());
    instructions().append(src2->index());

    if (opcodeID == op_bitor || 
        opcodeID == op_bitand || 
        opcodeID == op_bitxor ||
        opcodeID == op_add || 
        opcodeID == op_mul || 
        opcodeID == op_sub || 
        opcodeID == op_div)
        instructions().append(types.toInt());

    return dst;
}

Some of these bytecode instructions have length 5:

{ "name" : "op_bitand", "length" : 5 },
{ "name" : "op_bitxor", "length" : 5 },
{ "name" : "op_bitor", "length" : 5 },
{ "name" : "op_add", "length" : 5 },
{ "name" : "op_mul", "length" : 5 },
{ "name" : "op_div", "length" : 5 },
{ "name" : "op_sub", "length" : 5 },

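These length-5 instructions carry one extra operand, an OperandTypes value, which describes the types of the operands, for example whether an operand is an int32 or may be a number, a string, null or a boolean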

static const Type TypeInt32 = 1;
static const Type TypeMaybeNumber = 0x04;
static const Type TypeMaybeString = 0x08;
static const Type TypeMaybeNull   = 0x10;
static const Type TypeMaybeBool   = 0x20;
static const Type TypeMaybeOther  = 0x40;
...

A unary instruction needs only the opcode and two operands (dst and src):

RegisterID* BytecodeGenerator::emitUnaryOp(OpcodeID opcodeID, RegisterID* dst, RegisterID* src)
{
    emitOpcode(opcodeID);
    instructions().append(dst->index());
    instructions().append(src->index());
    return dst;
}
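
How the instruction stream grows by a different number of slots per instruction can be sketched as follows. This is a simplified model: the Opcode subset, its numeric values and the class InstructionStream are illustrative assumptions, and only op_add and op_sub are modeled as carrying the extra operand-types slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative opcode subset; the numeric values are arbitrary.
enum Opcode { op_add, op_sub, op_less, op_not };

// In this model only the arithmetic opcodes carry a packed operand-types slot.
inline bool carriesOperandTypes(Opcode opcode)
{
    return opcode == op_add || opcode == op_sub;
}

class InstructionStream {
public:
    // Layout: opcode, dst, src1, src2, and optionally the operand types.
    void emitBinaryOp(Opcode opcode, int dst, int src1, int src2, int operandTypes)
    {
        m_words.push_back(opcode);
        m_words.push_back(dst);
        m_words.push_back(src1);
        m_words.push_back(src2);
        if (carriesOperandTypes(opcode))
            m_words.push_back(operandTypes);
    }

    // Layout: opcode, dst, src.
    void emitUnaryOp(Opcode opcode, int dst, int src)
    {
        m_words.push_back(opcode);
        m_words.push_back(dst);
        m_words.push_back(src);
    }

    std::size_t size() const { return m_words.size(); }

private:
    std::vector<int> m_words;
};
```

Because instructions have different lengths, anything that walks or rewinds the stream has to know the length of each opcode, which is what the length table above records.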

EqualNode

EqualNode is a subclass of BinaryOpNode, and the two generate bytecode in essentially the same way

RegisterID* EqualNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    if (m_expr1->isNull() || m_expr2->isNull()) {
        RefPtr<RegisterID> src = generator.tempDestination(dst);
        generator.emitNode(src.get(), m_expr1->isNull() ? m_expr2 : m_expr1);
        return generator.emitUnaryOp(op_eq_null, generator.finalDestination(dst, src.get()), src.get());
    }

    ExpressionNode* left = m_expr1;
    ExpressionNode* right = m_expr2;
    if (left->isString()) std::swap(left, right);

    RefPtr<RegisterID> src1 = generator.emitNodeForLeftHandSide(left, m_rightHasAssignments, m_expr2->isPure(generator));
    RefPtr<RegisterID> src2 = generator.emitNode(right);
    return generator.emitEqualityOp(op_eq, generator.finalDestination(dst, src1.get()), src1.get(), src2.get());
}

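The key method here is emitEqualityOp, which handles two kinds of equality comparison:

  • a comparison of a typeof result against a string constant, i.e. a type check, which is rewritten into a dedicated instruction;
  • an ordinary comparison of two values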
RegisterID* BytecodeGenerator::emitEqualityOp(OpcodeID opcodeID, RegisterID* dst, RegisterID* src1, RegisterID* src2)
{
    if (m_lastOpcodeID == op_typeof) 
    {
        // Type comparison
        int dstIndex;
        int srcIndex;

        // Read the two operands of the last instruction in instructions() into dstIndex and srcIndex
        retrieveLastUnaryOp(dstIndex, srcIndex);

        if (src1->index() == dstIndex
            && src1->isTemporary()
            && m_codeBlock->isConstantRegisterIndex(src2->index())
            && m_codeBlock->constantRegister(src2->index()).get().isString()) 
        {
            const String& value = asString(m_codeBlock->constantRegister(src2->index()).get())->tryGetValue();
            // Since this is a type comparison, use the dedicated instruction for each type name
            if (value == "undefined") {
                rewindUnaryOp();
                emitOpcode(op_is_undefined);
                instructions().append(dst->index());
                instructions().append(srcIndex);
                return dst;
            }
            if (value == "boolean") {
                rewindUnaryOp();
                emitOpcode(op_is_boolean);
                instructions().append(dst->index());
                instructions().append(srcIndex);
                return dst;
            }
            if (value == "number") {
                rewindUnaryOp();
                emitOpcode(op_is_number);
                instructions().append(dst->index());
                instructions().append(srcIndex);
                return dst;
            }
            if (value == "string") {
                rewindUnaryOp();
                emitOpcode(op_is_string);
                instructions().append(dst->index());
                instructions().append(srcIndex);
                return dst;
            }
            if (value == "object") {
                rewindUnaryOp();
                emitOpcode(op_is_object_or_null);
                instructions().append(dst->index());
                instructions().append(srcIndex);
                return dst;
            }
            if (value == "function") {
                rewindUnaryOp();
                emitOpcode(op_is_function);
                instructions().append(dst->index());
                instructions().append(srcIndex);
                return dst;
            }
        }
    }

    // The ordinary equality instruction
    emitOpcode(opcodeID);
    instructions().append(dst->index());
    instructions().append(src1->index());
    instructions().append(src2->index());
    return dst;
}

retrieveLastUnaryOp reads the two operands of the last instruction in instructions() and assigns them to dstIndex and srcIndex

void BytecodeGenerator::retrieveLastUnaryOp(int& dstIndex, int& srcIndex)
{
    size_t size = instructions().size();
    dstIndex = instructions().at(size - 2).u.operand;
    srcIndex = instructions().at(size - 1).u.operand;
}
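
The typeof rewrite is a peephole optimization: a just-emitted op_typeof is removed and replaced by a specialized type check. The sketch below models it on a toy instruction vector. MiniGenerator and its members are hypothetical, the opcode values are arbitrary, and the checks that the left operand is a temporary and that the right operand is a constant register are omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum Opcode { op_typeof, op_eq, op_is_undefined, op_is_boolean, op_is_number,
              op_is_string, op_is_object_or_null, op_is_function };

// Maps a type name to its dedicated opcode, or -1 when there is none.
inline int specializedOpcode(const std::string& typeName)
{
    if (typeName == "undefined") return op_is_undefined;
    if (typeName == "boolean") return op_is_boolean;
    if (typeName == "number") return op_is_number;
    if (typeName == "string") return op_is_string;
    if (typeName == "object") return op_is_object_or_null;
    if (typeName == "function") return op_is_function;
    return -1;
}

struct MiniGenerator {
    std::vector<int> instructions;
    int lastOpcode = -1;

    void emitOpcode(int opcode)
    {
        instructions.push_back(opcode);
        lastOpcode = opcode;
    }

    void emitTypeOf(int dst, int src)
    {
        emitOpcode(op_typeof);
        instructions.push_back(dst);
        instructions.push_back(src);
    }

    // Emits `dst = (lhs == rhs)`, where register rhs holds the string constant `text`.
    void emitEqualsString(int dst, int lhs, int rhs, const std::string& text)
    {
        if (lastOpcode == op_typeof) {
            std::size_t size = instructions.size();
            int typeofDst = instructions[size - 2];
            int typeofSrc = instructions[size - 1];
            int specialized = specializedOpcode(text);
            if (lhs == typeofDst && specialized >= 0) {
                instructions.resize(size - 3); // rewind the typeof instruction
                emitOpcode(specialized);
                instructions.push_back(dst);
                instructions.push_back(typeofSrc);
                return;
            }
        }
        emitOpcode(op_eq);
        instructions.push_back(dst);
        instructions.push_back(lhs);
        instructions.push_back(rhs);
    }
};
```

With this model, typeof x == "string" ends up as a single three-slot instruction, while a type name outside the list keeps both the typeof and the generic comparison.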

DotAccessorNode

In JavaScript, properties and methods are most commonly accessed through the dot operator

class DotAccessorNode : public ExpressionNode, public ThrowableExpressionData 
{
    public:
        DotAccessorNode(const JSTokenLocation&, ExpressionNode* base, const Identifier&);

        ExpressionNode* base() const { return m_base; }

    private:
        RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

        bool isLocation() const override { return true; }
        bool isDotAccessorNode() const override { return true; }

        ExpressionNode* m_base;
        const Identifier& m_ident;
    };
RegisterID* DotAccessorNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    // Check whether the dot access targets the parent class, i.e. whether the base is super
    bool baseIsSuper = m_base->isSuperNode();

    // If so, call emitSuperBaseForCallee to prepare the super base; otherwise emit bytecode for m_base
    RefPtr<RegisterID> base = baseIsSuper ? emitSuperBaseForCallee(generator) : generator.emitNode(m_base);

    generator.emitExpressionInfo(divot(), divotStart(), divotEnd());


    RegisterID* finalDest = generator.finalDestination(dst);
    RegisterID* ret;
    if (baseIsSuper) 
    {
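        // For a super base, emitGetById takes four arguments, and the object this refers to must be settled before the call,
        // because super in JavaScript is only syntactic sugar: the lookup starts at the super base but runs with the current this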
        RefPtr<RegisterID> thisValue = generator.ensureThis();
        ret = generator.emitGetById(finalDest, base.get(), thisValue.get(), m_ident);
    }
    else
    {
        ret = generator.emitGetById(finalDest, base.get(), m_ident);
    }

    return ret;
}

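Now the bytecode details. Note the op_get_by_id_with_this instruction: a dot access ultimately loads a property or a method by name. Also note the call to addConstant: the property name is added to the identifier table the first time it is seen, and the returned index is what gets stored as the operand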


RegisterID* BytecodeGenerator::emitGetById(RegisterID* dst, RegisterID* base, RegisterID* thisVal, const Identifier& property)
{
    emitOpcode(op_get_by_id_with_this);
    instructions().append(kill(dst));
    instructions().append(base->index());
    instructions().append(thisVal->index());
    instructions().append(addConstant(property));
    return dst;
}
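
Why the instruction carries both a base and a this value can be illustrated with a toy object model: the lookup walks the prototype chain starting at the base, but a getter found along the way runs against the receiver. Everything below (Object, getByIdWithThis, integer-valued properties, throwing on a missing property) is a hypothetical simplification rather than engine code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical object model: integer fields, getters, and a prototype link.
struct Object {
    const Object* proto = nullptr;
    std::map<std::string, int> fields;
    std::map<std::string, std::function<int(const Object&)>> getters;
};

// Looks `name` up starting at `base`, but runs any getter against `receiver`.
// A missing property throws here; JavaScript would yield undefined instead.
inline int getByIdWithThis(const Object& base, const Object& receiver, const std::string& name)
{
    for (const Object* current = &base; current; current = current->proto) {
        auto getter = current->getters.find(name);
        if (getter != current->getters.end())
            return getter->second(receiver);
        auto field = current->fields.find(name);
        if (field != current->fields.end())
            return field->second;
    }
    throw std::out_of_range("property not found: " + name);
}
```

This corresponds to super.label inside a subclass method: the getter is found on the parent's prototype, yet it reads state from the subclass instance.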

BracketAccessorNode

class BracketAccessorNode : public ExpressionNode, public ThrowableExpressionData {
    public:
        BracketAccessorNode(const JSTokenLocation&, ExpressionNode* base, ExpressionNode* subscript, bool subscriptHasAssignments);

        ExpressionNode* base() const { return m_base; }
        ExpressionNode* subscript() const { return m_subscript; }

        bool subscriptHasAssignments() const { return m_subscriptHasAssignments; }

    private:
        RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

        bool isLocation() const override { return true; }
        bool isBracketAccessorNode() const override { return true; }

        ExpressionNode* m_base;
        ExpressionNode* m_subscript;
        bool m_subscriptHasAssignments;
    };

Bracket refers to the square brackets []. BracketAccessorNode represents expressions of the form A[B], where m_base is A and m_subscript is B.
In JavaScript, brackets can be used on an object like this:

var obj= {};  
// Add a property name to obj; name is a valid identifier, so obj.name works as well
obj['name'] = 'jack';   
// Add a property 2a to obj; 2a is not a valid identifier (it starts with a digit), so obj.2a cannot be used
obj['2a'] = 'test';

RegisterID* BracketAccessorNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    // If the base is super (i.e. super[expr]), handle it as a super access
    if (m_base->isSuperNode()) 
    {
        // As before, both this and the super base have to be prepared first
        RefPtr<RegisterID> finalDest = generator.finalDestination(dst);
        RefPtr<RegisterID> thisValue = generator.ensureThis();
        RefPtr<RegisterID> superBase = emitSuperBaseForCallee(generator);

        if (isNonIndexStringElement(*m_subscript)) {
            // The subscript is a string that is not an array index, so use it as an Identifier key
            const Identifier& id = static_cast<StringNode*>(m_subscript)->value();
            generator.emitGetById(finalDest.get(), superBase.get(), thisValue.get(), id);
        } 
        else  
        {
            // Otherwise emit bytecode for m_subscript and use the resulting register as the key
            RefPtr<RegisterID> subscript = generator.emitNode(m_subscript);
            generator.emitGetByVal(finalDest.get(), superBase.get(), thisValue.get(), subscript.get());
        }

        generator.emitExpressionInfo(divot(), divotStart(), divotEnd());
        return finalDest.get();
    }

    // Much like the code above; the difference is base instead of superBase
    RegisterID* ret;
    RefPtr<RegisterID> finalDest = generator.finalDestination(dst);

    if (isNonIndexStringElement(*m_subscript)) {
        RefPtr<RegisterID> base = generator.emitNode(m_base);
        ret = generator.emitGetById(finalDest.get(), base.get(), static_cast<StringNode*>(m_subscript)->value());
    } else {
        RefPtr<RegisterID> base = generator.emitNodeForLeftHandSide(m_base, m_subscriptHasAssignments, m_subscript->isPure(generator));
        RegisterID* property = generator.emitNode(m_subscript);
        ret = generator.emitGetByVal(finalDest.get(), base.get(), property);
    }

    generator.emitExpressionInfo(divot(), divotStart(), divotEnd());
    return ret;
}

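In some cases BracketAccessorNode behaves much like DotAccessorNode, since the two forms are often interchangeable: with a super base and a non-index string key, both end up as op_get_by_id_with_this, which loads a property or a method by name. In other cases BracketAccessorNode accesses an element by subscript, for example on an array: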

var mycars=new Array()
mycars[0]="Saab"
mycars[1]="Volvo"
mycars[2]="BMW"

In this case there is no id, only a val, so the instruction used in the super case is op_get_by_val_with_this

RegisterID* BytecodeGenerator::emitGetByVal(RegisterID* dst, RegisterID* base, RegisterID* thisValue, RegisterID* property)
{
    emitOpcode(op_get_by_val_with_this);
    instructions().append(kill(dst));
    instructions().append(base->index());
    instructions().append(thisValue->index());
    instructions().append(property->index());
    return dst;
}
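
The choice between a by-id and a by-val access for a string-literal subscript hinges on whether the string is an array index. The sketch below assumes the usual definition of an array index, a canonical decimal integer between 0 and 2^32 - 2; isArrayIndex, AccessKind and accessKindForStringSubscript are hypothetical names, not the engine's helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns true when `key` is a canonical array index: a decimal integer in
// [0, 2^32 - 2] with no sign and no leading zeros (except "0" itself).
inline bool isArrayIndex(const std::string& key)
{
    if (key.empty() || key.size() > 10)
        return false;
    if (key.size() > 1 && key[0] == '0')
        return false;
    std::uint64_t value = 0;
    for (char c : key) {
        if (c < '0' || c > '9')
            return false;
        value = value * 10 + static_cast<std::uint64_t>(c - '0');
    }
    return value < 0xFFFFFFFFull;
}

enum class AccessKind { ById, ByVal };

// A string-literal subscript that is not an index can be treated like a dot access.
inline AccessKind accessKindForStringSubscript(const std::string& key)
{
    return isArrayIndex(key) ? AccessKind::ByVal : AccessKind::ById;
}
```

Under this model obj['name'] and obj['2a'] take the by-id path, while obj['0'] is treated as an element access.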

ClassExprNode

The class expression node is one of the more complex nodes

class ClassExprNode final : public ExpressionNode, public VariableEnvironmentNode
{
    public:
        ClassExprNode(const JSTokenLocation&, const Identifier&, const SourceCode& classSource,
            VariableEnvironment& classEnvironment, ExpressionNode* constructorExpresssion,
            ExpressionNode* parentClass, PropertyListNode* instanceMethods, PropertyListNode* staticMethods);

    private:
        RegisterID* emitBytecode(BytecodeGenerator&, RegisterID* = 0) override;

        SourceCode m_classSource;   // Source code of the class
        const Identifier& m_name;   // Class name
        const Identifier* m_ecmaName;// ECMA name of the class
        ExpressionNode* m_constructorExpression;// Constructor node
        ExpressionNode* m_classHeritage;        // Heritage (extends) expression
        PropertyListNode* m_instanceMethods;    // List of instance methods
        PropertyListNode* m_staticMethods;      // List of static methods
    };
RegisterID* ClassExprNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    if (!m_name.isNull())
        generator.pushLexicalScope(this, BytecodeGenerator::TDZCheckOptimization::Optimize, BytecodeGenerator::NestedScopeType::IsNested);

    RefPtr<RegisterID> superclass;
    if (m_classHeritage) {
        // Emit bytecode for the class heritage expression
        superclass = generator.newTemporary();
        generator.emitNode(superclass.get(), m_classHeritage);
    }

    RefPtr<RegisterID> constructor;

    // FIXME: Make the prototype non-configurable & non-writable.
    if (m_constructorExpression) 
    {
        // Set the function metadata (ECMA name and class source)
        FunctionMetadataNode* metadata = static_cast<FuncExprNode*>(m_constructorExpression)->metadata();
        metadata->setEcmaName(ecmaName());
        metadata->setClassSource(m_classSource);
        // Emit bytecode for the constructor
        constructor = generator.emitNode(dst, m_constructorExpression);
    } 
    else 
    {
        // No constructor was provided, so emit bytecode for a default constructor
        constructor = generator.emitNewDefaultConstructor(generator.finalDestination(dst),
            m_classHeritage ? ConstructorKind::Derived : ConstructorKind::Base,
            m_name, ecmaName(), m_classSource);
    }

    // Fetch the VM's CommonIdentifiers, i.e. the well-known identifiers of the virtual machine
    const auto& propertyNames = generator.propertyNames();

    // Emit bytecode that creates a new object, which becomes the prototype
    RefPtr<RegisterID> prototype = generator.emitNewObject(generator.newTemporary());

    if (superclass)
    {
        // The class inherits from a superclass
        RefPtr<RegisterID> protoParent = generator.newTemporary();
        generator.emitLoad(protoParent.get(), jsNull());

        // If the superclass is undefined, emit a jump to superclassIsUndefinedLabel
        RefPtr<RegisterID> tempRegister = generator.newTemporary();
        RefPtr<Label> superclassIsUndefinedLabel = generator.newLabel();
        generator.emitJumpIfTrue(generator.emitIsUndefined(tempRegister.get(), superclass.get()), superclassIsUndefinedLabel.get());

        // If the superclass is null, emit a jump to superclassIsNullLabel
        RefPtr<Label> superclassIsNullLabel = generator.newLabel();
        generator.emitJumpIfTrue(generator.emitUnaryOp(op_eq_null, tempRegister.get(), superclass.get()), superclassIsNullLabel.get());

        // If the superclass is an object, emit a jump to superclassIsObjectLabel
        RefPtr<Label> superclassIsObjectLabel = generator.newLabel();
        generator.emitJumpIfTrue(generator.emitIsObject(tempRegister.get(), superclass.get()), superclassIsObjectLabel.get());

        // Bind the labels to the current position (backpatching)
        generator.emitLabel(superclassIsUndefinedLabel.get());
        generator.emitLabel(superclassIsObjectLabel.get());

        // Load the superclass's prototype
        generator.emitGetById(protoParent.get(), superclass.get(), generator.propertyNames().prototype);

        // If the superclass's prototype is an object, null, or a function, emit a jump to protoParentIsObjectOrNullLabel
        RefPtr<Label> protoParentIsObjectOrNullLabel = generator.newLabel();
        generator.emitJumpIfTrue(generator.emitUnaryOp(op_is_object_or_null, tempRegister.get(), protoParent.get()), protoParentIsObjectOrNullLabel.get());
        generator.emitJumpIfTrue(generator.emitUnaryOp(op_is_function, tempRegister.get(), protoParent.get()), protoParentIsObjectOrNullLabel.get());

        // Bind the label to the current position (backpatching)
        generator.emitLabel(protoParentIsObjectOrNullLabel.get());

        generator.emitDirectPutById(constructor.get(), generator.propertyNames().underscoreProto, superclass.get(), PropertyNode::Unknown);
        generator.emitLabel(superclassIsNullLabel.get());
        generator.emitDirectPutById(prototype.get(), generator.propertyNames().underscoreProto, protoParent.get(), PropertyNode::Unknown);

        emitPutHomeObject(generator, constructor.get(), prototype.get());
    }

    // Emit bytecode that defines the constructor property on the prototype
    RefPtr<RegisterID> constructorNameRegister = generator.emitLoad(generator.newTemporary(), propertyNames.constructor);
    generator.emitCallDefineProperty(prototype.get(), constructorNameRegister.get(), constructor.get(), nullptr, nullptr,
        BytecodeGenerator::PropertyConfigurable | BytecodeGenerator::PropertyWritable, m_position);

    // Emit bytecode that defines the prototype property on the constructor
    RefPtr<RegisterID> prototypeNameRegister = generator.emitLoad(generator.newTemporary(), propertyNames.prototype);
    generator.emitCallDefineProperty(constructor.get(), prototypeNameRegister.get(), prototype.get(), nullptr, nullptr, 0, m_position);

    // Emit bytecode for the static methods
    if (m_staticMethods)
        generator.emitNode(constructor.get(), m_staticMethods);

    // Emit bytecode for the instance methods
    if (m_instanceMethods)
        generator.emitNode(prototype.get(), m_instanceMethods);

    if (!m_name.isNull()) 
    {
        Variable classNameVar = generator.variable(m_name);
        RefPtr<RegisterID> scope = generator.emitResolveScope(nullptr, classNameVar);
        generator.emitPutToScope(scope.get(), classNameVar, constructor.get(), ThrowIfNotFound, InitializationMode::Initialization);
        generator.popLexicalScope(this);
    }

    // Return the register holding the constructor
    return generator.moveToDestinationIfNeeded(dst, constructor.get());
}

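The classic ES5 approach is combination inheritance: methods are inherited through the prototype chain, and properties by borrowing the parent constructor.
ES6 builds on the same idea but wraps it in a cleaner and simpler API: it adjusts some property links, standardizes how inheritance is set up, and separates the inheritance wiring from instance construction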

Note these two calls from the code above

generator.emitCallDefineProperty(prototype.get(), constructorNameRegister.get(), constructor.get(), nullptr, nullptr,
    BytecodeGenerator::PropertyConfigurable | BytecodeGenerator::PropertyWritable, m_position);

generator.emitCallDefineProperty(constructor.get(), prototypeNameRegister.get(), prototype.get(), nullptr, nullptr, 0, m_position);

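Together with the underscoreProto stores shown earlier, they produce the links that classic ES5 combination inheritance sets up by hand (Sub inherits from Base), except that the generated prototype is a fresh object linked to the parent's prototype rather than an instance of Base: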

Sub.prototype = new Base();
Sub.prototype.constructor = Sub;

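In ES6 the subclass constructor also inherits from the parent constructor directly, so such properties are found on the constructor chain without going through the instance prototype. ES6 static methods and static properties (accessible only on the constructor) are inherited through this direct link between the classes. Ordinary shared methods still live on the prototype chain, and the reasoning and implementation are the same as in ES5.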

// Emit bytecode for the static methods; note that the first argument is constructor
if (m_staticMethods)
    generator.emitNode(constructor.get(), m_staticMethods);

// Emit bytecode for the instance methods; note that the first argument is prototype
if (m_instanceMethods)
    generator.emitNode(prototype.get(), m_instanceMethods);
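
The two links that the generated bytecode sets up for a derived class, one between the constructors and one between the prototypes, can be modeled with a toy object graph. Object, hasProperty and linkDerivedClass are hypothetical names, and property values are reduced to strings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical object model: named properties plus a prototype link.
struct Object {
    Object* proto = nullptr;
    std::map<std::string, std::string> properties;
};

// Walks the prototype chain looking for `name`.
inline bool hasProperty(const Object* object, const std::string& name)
{
    for (; object; object = object->proto) {
        if (object->properties.count(name))
            return true;
    }
    return false;
}

// Wires a derived class in this model:
//   constructor.__proto__ = superclass            (static members are inherited)
//   prototype.__proto__   = superclass.prototype  (instance methods are inherited)
inline void linkDerivedClass(Object& constructor, Object& prototype,
                             Object& superclass, Object& superPrototype)
{
    constructor.proto = &superclass;
    prototype.proto = &superPrototype;
}
```

Static methods are reachable from the subclass constructor and instance methods from the subclass prototype, but a static method is not visible through the prototype.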

BaseFuncExprNode

class BaseFuncExprNode : public ExpressionNode {
    public:
        FunctionMetadataNode* metadata() { return m_metadata; }

    protected:
        BaseFuncExprNode(const JSTokenLocation&, const Identifier&, FunctionMetadataNode*, const SourceCode&, FunctionMode);

        FunctionMetadataNode* m_metadata;
    };

BaseFuncExprNode has three subclasses: FuncExprNode, ArrowFuncExprNode and MethodDefinitionNode. The emitBytecode code of all three eventually calls emitNewFunctionExpressionCommon

FuncExprNode

RegisterID* FuncExprNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    //finalDestination: Returns the place to write the final output of an operation.
    return generator.emitNewFunctionExpression(generator.finalDestination(dst), this);
}

RegisterID* BytecodeGenerator::emitNewFunctionExpression(RegisterID* dst, FuncExprNode* func)
{
    emitNewFunctionExpressionCommon(dst, func->metadata());
    return dst;
}

The code for ArrowFuncExprNode and MethodDefinitionNode also ends up in emitNewFunctionExpressionCommon, so it is listed without further analysis

ArrowFuncExprNode

RegisterID* ArrowFuncExprNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    return generator.emitNewArrowFunctionExpression(generator.finalDestination(dst), this);
}

RegisterID* BytecodeGenerator::emitNewArrowFunctionExpression(RegisterID* dst, ArrowFuncExprNode* func)
{
    emitNewFunctionExpressionCommon(dst, func->metadata());
    return dst;
}

MethodDefinitionNode

RegisterID* MethodDefinitionNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    return generator.emitNewMethodDefinition(generator.finalDestination(dst), this);
}

RegisterID* BytecodeGenerator::emitNewMethodDefinition(RegisterID* dst, MethodDefinitionNode* func)
{
    emitNewFunctionExpressionCommon(dst, func->metadata());
    return dst;
}

The core is here:

void BytecodeGenerator::emitNewFunctionExpressionCommon(RegisterID* dst, FunctionMetadataNode* function)
{
    // Build a function from the FunctionMetadataNode, store it in m_codeBlock's m_functionExprs container, and get back its index
    unsigned index = m_codeBlock->addFunctionExpr(makeFunction(function));

    // Default opcode
    OpcodeID opcodeID = op_new_func_exp;
    switch (function->parseMode()) 
    {
        // Generator function
        case SourceParseMode::GeneratorWrapperFunctionMode: {
            opcodeID = op_new_generator_func_exp;
            break;
        }
        // Arrow function
        case SourceParseMode::ArrowFunctionMode: {
            opcodeID = op_new_arrow_func_exp;
            break;
        }
        default: {
            break;
        }
    }

    // op_new_func_exp, op_new_generator_func_exp and op_new_arrow_func_exp all have instruction length 4
    emitOpcode(opcodeID);
    instructions().append(dst->index());
    instructions().append(scopeRegister()->index());
    instructions().append(index);
}

The logic is simple as well; it relies on a few instructions: op_new_func_exp, op_new_generator_func_exp and op_new_arrow_func_exp
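
The opcode selection boils down to a small mapping from parse mode to opcode. The sketch below restates it in isolation; the ParseMode enum and newFunctionOpcodeName are hypothetical and only stand in for the relevant SourceParseMode cases.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the relevant SourceParseMode values.
enum class ParseMode { NormalFunction, GeneratorWrapperFunction, ArrowFunction, Method };

// Picks the opcode name used to create the function object.
inline std::string newFunctionOpcodeName(ParseMode mode)
{
    switch (mode) {
    case ParseMode::GeneratorWrapperFunction:
        return "op_new_generator_func_exp";
    case ParseMode::ArrowFunction:
        return "op_new_arrow_func_exp";
    default:
        return "op_new_func_exp"; // every other mode uses the default opcode
    }
}
```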

Summary

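The bytecode generation posts have been the hardest part of this series to analyze so far. The main difficulties were the sheer amount of code and the need for a solid understanding of ES5, ES6 and C++ at the same time. My own ability is limited, so some of the analysis is bound to be incomplete or wrong. Corrections are very welcome, so that the posts can be improved over time.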

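Author: lingyun

Published: September 15, 2018 - 00:09

Last updated: November 10, 2018 - 12:11

Original link: https://tsuijunxi.github.io/2018/09/15/JavaScriptCore引擎深度解析-5-字节码生成篇(下)/

License: Attribution-NonCommercial-NoDerivatives 4.0 International. Please retain the original link and the author when reposting.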