How to Elegantly Hack User Code

Author: theanarkh 2022-05-24 06:07:48

This article introduces one way to hack user code at the JS level.

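Preface: When building infrastructure, a recurring problem is how to make the code you provide as non-intrusive and invisible to users as possible. For example, suppose I provide an SDK that collects the duration of HTTP requests in a Node.js process. The simplest approach is to give users a request method and have them call it everywhere, so that I can collect the data inside request. But this is often inconvenient, and then we need to hack the Node.js HTTP module or submit a PR to Node.js. At the operating system level there are many technologies for this kind of problem, such as eBPF, uprobe, and kprobe. But the application layer cannot use them to solve our problem, because these operating system technologies target low-level functions. If I want to know how long a JS function takes, I can only solve that at the V8 level or the JS level, and V8 does not seem to offer good support here, so for now we mostly consider pure JS or the Node.js core. This article introduces one way to hack user code at the JS level.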

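In Node.js, the usual way to measure the time spent in JS functions is a CPU profile, but a profile only covers a fixed period of time. If I want to collect timing data in real time, a CPU profile is awkward: the most direct approach is to collect CPU profile data periodically, then parse the profile data ourselves and report it. Besides that, this article introduces another approach: hacking the JS code. Suppose we have the following function.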
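For reference, the periodic CPU profile approach mentioned above can be sketched with the built-in inspector module. This is only a minimal sketch, not the approach this article goes on to use: collectCpuProfile and selfTimeByFunction are hypothetical helper names, and attributing each time delta to the sample at the same index is a rough approximation.

```javascript
const { Session } = require('inspector');

// Hypothetical helper: record one CPU profile covering `durationMs` milliseconds.
function collectCpuProfile(durationMs) {
    return new Promise((resolve, reject) => {
        const session = new Session();
        session.connect();
        session.post('Profiler.enable', (err) => {
            if (err) return reject(err);
            session.post('Profiler.start', (err) => {
                if (err) return reject(err);
                setTimeout(() => {
                    session.post('Profiler.stop', (err, result) => {
                        session.disconnect();
                        if (err) return reject(err);
                        resolve(result.profile);
                    });
                }, durationMs);
            });
        });
    });
}

// Hypothetical helper: rough self time per function name, in microseconds.
// Assumption: each time delta is attributed to the sample at the same index.
function selfTimeByFunction(profile) {
    const byId = new Map(profile.nodes.map((node) => [node.id, node]));
    const totals = new Map();
    (profile.samples || []).forEach((nodeId, i) => {
        const node = byId.get(nodeId);
        const name = (node && node.callFrame.functionName) || '(anonymous)';
        totals.set(name, (totals.get(name) || 0) + profile.timeDeltas[i]);
    });
    return totals;
}

// Profile the next 100 ms, then print the per-function totals.
const pending = collectCpuProfile(100);
pending.then((profile) => console.log(selfTimeByFunction(profile)));
```

An SDK would call collectCpuProfile in a loop and report each result, which is the extra parsing and reporting work that makes this route awkward for real-time data.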

function compute() {
    // do something
}

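If we want to measure the execution time of such a function, the most natural way is to insert some code at the start and end of the function. But we don't want users to do this by hand; we want something more elegant: parse the source to get the AST, then rewrite the AST. Let's see how.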

const acorn = require('acorn');
const escodegen = require('escodegen');
const b = require('ast-types').builders;
const walk = require("acorn-walk");
const fs = require('fs');

// Parse the source code to get the AST
const ast = acorn.parse(fs.readFileSync('./test.js', 'utf-8'), {
    ecmaVersion: 'latest',
});

function inject(node) {
    // Insert code at the start and end of the function
    const entryNode = b.variableDeclaration('const', [b.variableDeclarator(b.identifier('start'), b.callExpression(
        b.identifier('(() => { return Date.now(); })'), [],
    ))]);
    const exitNode = b.returnStatement(b.callExpression(
        b.identifier('((start) => {console.log(Date.now() - start);})'), [ 
            b.identifier('start')
        ],
    ));

    if (node.body.body) {
        node.body.body.unshift(entryNode);
        node.body.body.push(exitNode);
    }
}

// Walk the AST and modify it
walk.simple(ast, {
    ArrowFunctionExpression: inject,
    FunctionDeclaration: inject,
    FunctionExpression: inject
});

// Regenerate code from the modified AST
const newCode = escodegen.generate(ast);

fs.writeFileSync('test.js', newCode)

Running the code above produces the following result.

function compute() {
    const start = (() => { return Date.now(); })();
    return ((start) => {console.log(Date.now() - start);})(start);
}
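One caveat about this rewrite: the timing code is appended as the last statement, so a function that returns earlier never reaches it, and the injected start variable could clash with a user variable of the same name. A possible alternative shape for the generated code, shown here hand-written rather than produced by the script above, wraps the original body in try/finally. The report function is a hypothetical stand-in for whatever the SDK does with the measurement.

```javascript
// Hypothetical reporter; a real SDK would send the value to its backend.
const durations = [];
function report(ms) {
    durations.push(ms);
}

// Shape a rewritten function could take: timing runs on every exit path.
function compute(n) {
    const start = Date.now();
    try {
        if (n < 0) {
            return -1; // an early return still reaches the finally block
        }
        return n * 2;
    } finally {
        report(Date.now() - start);
    }
}

compute(-5);
compute(21);
console.log(durations.length); // 2
```

With this shape the original return values are preserved and exceptions are still timed, at the cost of a slightly more involved AST rewrite.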

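This way we can get the timing data of every function. But this approach is static analysis of the source; to put it into practice, users have to take action themselves, which is not very friendly. So, building on this, we use the Debugger domain of the V8 debugging protocol to implement dynamic rewriting, which can also rewrite Node.js's internal JS code. First, change the test code a bit.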

function compute() {
    // do something
}

setInterval(compute, 1000)

Then let's look at the rewriting logic.

const { Session } = require('inspector');
const acorn = require('acorn');
const escodegen = require('escodegen');
const b = require('ast-types').builders;
const walk = require("acorn-walk");
const session = new Session();
session.connect();

require('./test_ast');
// Listen for script-parsed events to get every JS script
session.on('Debugger.scriptParsed', (message) => {
    // Only process this file
    if (message.params.url.indexOf('test_ast') === -1) {
        return;
    }
    // Get the source code
    session.post('Debugger.getScriptSource', {scriptId: message.params.scriptId}, (err, ret) => {
        // Leave the script untouched if its source cannot be read
        if (err) {
            return;
        }
        const ast = acorn.parse(ret.scriptSource, {
            ecmaVersion: 'latest',
        });
        function inject(node) {
            const entry = b.variableDeclaration('const', [b.variableDeclarator(b.identifier('start'), b.callExpression(
                b.identifier('(() => { return Date.now(); })'), [],
            ))]);
            const exit = b.returnStatement(b.callExpression(
                b.identifier('((start) => {console.log(Date.now() - start);})'), [ 
                    b.identifier('start')
                ],
            ));

            if (node.body.body) {
                node.body.body.unshift(entry);
                node.body.body.push(exit);
            }
        }
        walk.simple(ast, {
            ArrowFunctionExpression: inject,
            FunctionDeclaration: inject,
            FunctionExpression: inject
        });
        const newCode = escodegen.generate(ast);
        // After rewriting the AST, generate the new code and replace the script source
        session.post('Debugger.setScriptSource', {
            scriptId: message.params.scriptId,
            scriptSource: newCode,
            dryRun: false
        });
    })
});

session.post('Debugger.enable', () => {});

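Normally the function run by setInterval prints nothing, but we find that 0 is printed continuously, which is the elapsed time. It is 0 because the measurement has millisecond resolution, but we don't need to worry about that here. With this we have hacked the user's code while remaining invisible to the user; the only thing they need to do is import the SDK we provide. The hard part of this approach is the code-rewriting logic, and the risk is also fairly high, but once we solve that problem we can hack user code freely and do whatever we want. For legitimate purposes, of course.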
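If the constant 0 does matter for a real measurement, one option is to have the injected code use a higher-resolution clock than Date.now(). The following is a minimal sketch using Node.js's process.hrtime.bigint(), which returns nanoseconds as a BigInt. The timeCall wrapper is a hypothetical helper for illustration, not part of the rewrite shown above.

```javascript
// Hypothetical helper: time a synchronous call with nanosecond resolution.
function timeCall(fn) {
    const start = process.hrtime.bigint();
    const result = fn();
    const elapsedNs = process.hrtime.bigint() - start;
    return { result, elapsedNs };
}

const { result, elapsedNs } = timeCall(() => 1 + 1);
console.log(result, `${elapsedNs} ns`);
```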

Editor: 姜華  Source: 編程雜技
