Architecture Governance Research: Rules, Expressions, and Languages

2022-05-07 12:21:14

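Earlier we discussed a series of architecture-governance and standards-governance problems in "distributed" scenarios, and mentioned a number of tools, such as the API linter Spectral and the database linter SQLFluff. To improve ArchGuard's ability to enforce standards across distributed systems, I analyzed several existing tools.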

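For us, building a similar tool means considering a few factors:

Pluggability. Developers should be able to build new architecture guard rules on top of the existing ones, for example rules for APIs or for database call chains.
Testability. If we adopt a full or partial DSL, how do we let the subsequent ...
Language independence. How do we avoid binding to one language's syntax tree and still support multiple languages?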

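With that goal in mind, the only option was to read through existing code. I looked at four tools: KtLint for Kotlin, Spectral for OpenAPI, SQLFluff for multiple SQL dialects, and OGNL, the expression language used by MyBatis among others.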

Governing Kotlin code: KtLint

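KtLint differs slightly from ordinary lint tools in that it ships with automatic formatting. Its overall logic is fairly simple: it builds an AST for a single file and then matches rules against that AST. KtLint organizes rules into a hierarchy of Rule, RuleSet, and RuleSetProvider, and uses the Visitor pattern (through VisitorProvider) to analyze the AST. Here is the visit method of KtLint's abstract Rule: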

/**
 * This method is going to be executed for each node in AST (in DFS fashion).
 *
 * @param node AST node
 * @param autoCorrect indicates whether rule should attempt auto-correction
 * @param emit a way for rule to notify about a violation (lint error)
 */
abstract fun visit(
    node: ASTNode,
    autoCorrect: Boolean,
    emit: (offset: Int, errorMessage: String, canBeAutoCorrected: Boolean) -> Unit
)

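As the comment says, each of the three parameters has its own purpose. The ASTNode here comes from Kotlin's own AST (the kotlin-compiler-embeddable package). The flow is the usual one: load the configuration, then run the rules: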

val ruleSets = ruleSetProviders.map { it.value.get() }
val visitorProvider = VisitorProvider(ruleSets, debug)

The corresponding visit call:

visitorProvider
    .visitor(
        params.ruleSets,
        preparedCode.rootNode,
        concurrent = false
    ).invoke { node, rule, fqRuleId ->    }

Inside VisitorProvider, the rules are filtered down to the enabled ones:

val enabledRuleReferences =
    ruleReferences
        .filter { ruleReference -> isNotDisabled(rootNode, ruleReference.toQualifiedRuleId()) }
val enabledQualifiedRuleIds = enabledRuleReferences.map { it.toQualifiedRuleId() }
val enabledRules = ruleSets
    .flatMap { ruleSet ->
        ruleSet
            .rules
            .filter { rule -> toQualifiedRuleId(ruleSet.id, rule.id) in enabledQualifiedRuleIds }
            .filter { rule -> isNotDisabled(rootNode, toQualifiedRuleId(ruleSet.id, rule.id)) }
            .map { rule -> "${ruleSet.id}:${rule.id}" to rule }
    }.toMap()
....

After that, the visit method of each Rule is run, either concurrently or sequentially.
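The flow described above (collect rule sets, drop disabled rules, then call visit on every node in depth-first order) can be sketched in a few lines. This is only an illustration in Python, not KtLint's implementation: Node, NoWildcardImport, and lint are hypothetical names, and the AST is a hand-built stand-in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Emit callback, mirroring (offset, errorMessage, canBeAutoCorrected).
Emit = Callable[[int, str, bool], None]


@dataclass
class Node:
    """Hypothetical stand-in for an AST node."""
    type: str
    text: str = ""
    offset: int = 0
    children: List["Node"] = field(default_factory=list)


class Rule:
    """Same shape as the abstract Rule: visit() is called once per node."""

    def __init__(self, rule_id: str) -> None:
        self.id = rule_id

    def visit(self, node: Node, auto_correct: bool, emit: Emit) -> None:
        raise NotImplementedError


class NoWildcardImport(Rule):
    """Hypothetical example rule: flag imports that end in '.*'."""

    def __init__(self) -> None:
        super().__init__("no-wildcard-import")

    def visit(self, node: Node, auto_correct: bool, emit: Emit) -> None:
        if node.type == "import" and node.text.endswith(".*"):
            emit(node.offset, "Wildcard import", False)


@dataclass
class RuleSet:
    id: str
    rules: List[Rule]


def walk(node: Node) -> Iterator[Node]:
    """Yield nodes depth-first, parent before children."""
    yield node
    for child in node.children:
        yield from walk(child)


def lint(root: Node, rule_sets: List[RuleSet],
         disabled: Iterable[str] = ()) -> List[Tuple[str, int, str]]:
    """Run every enabled rule on every node and collect the violations."""
    disabled_ids = set(disabled)
    enabled: Dict[str, Rule] = {
        f"{rule_set.id}:{rule.id}": rule
        for rule_set in rule_sets
        for rule in rule_set.rules
        if f"{rule_set.id}:{rule.id}" not in disabled_ids
    }
    errors: List[Tuple[str, int, str]] = []
    for node in walk(root):
        for qualified_id, rule in enabled.items():
            def emit(offset: int, message: str, fixable: bool,
                     qualified_id: str = qualified_id) -> None:
                errors.append((qualified_id, offset, message))
            rule.visit(node, False, emit)
    return errors


tree = Node("file", children=[
    Node("import", "kotlin.collections.*", offset=0),
    Node("import", "kotlin.io.println", offset=29),
])
rule_sets = [RuleSet("standard", [NoWildcardImport()])]
print(lint(tree, rule_sets))
```

Passing the qualified id "standard:no-wildcard-import" in disabled filters the rule out before any node is visited, which is the same ordering the VisitorProvider excerpt uses.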

Rules themselves are made pluggable through Java's ServiceLoader:

private fun getRuleSetProvidersByUrl(
    url: URL?,
    debug: Boolean
): Pair<URL?, List<RuleSetProvider>> {
    if (url != null && debug) {
        logger.debug { "JAR ruleset provided with path \"${url.path}\"" }
    }
    val ruleSetProviders = ServiceLoader.load(
        RuleSetProvider::class.java,
        URLClassLoader(listOfNotNull(url).toTypedArray())
    ).toList()
    return url to ruleSetProviders.toList()
}

At a coarser granularity, would Java 9 modules be more convenient?
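To make the plugin idea concrete: ServiceLoader discovers provider classes that JARs declare under META-INF/services. The closest packaging-level counterpart in Python would be entry points read through importlib.metadata, but to stay self-contained the sketch below swaps in a simpler in-process registry. The class names and rule ids are hypothetical.

```python
from typing import Dict, List, Type


class RuleSetProvider:
    """Hypothetical plugin contract: subclasses register themselves when defined."""

    registry: Dict[str, Type["RuleSetProvider"]] = {}
    id: str = ""

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        RuleSetProvider.registry[cls.id] = cls

    def get(self) -> List[str]:
        """Return the ids of the rules this provider contributes."""
        raise NotImplementedError


class ApiRules(RuleSetProvider):
    id = "api"

    def get(self) -> List[str]:
        return ["no-verb-in-path"]


class SqlRules(RuleSetProvider):
    id = "sql"

    def get(self) -> List[str]:
        return ["no-select-star"]


def load_rule_sets() -> Dict[str, List[str]]:
    """Instantiate every registered provider, like iterating a ServiceLoader."""
    return {
        provider_id: provider().get()
        for provider_id, provider in sorted(RuleSetProvider.registry.items())
    }


print(load_rule_sets())
```

The host only depends on the RuleSetProvider contract; a new rule set is added by defining another subclass, without touching the loader.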

Spectral: linting based on API data

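Unlike KtLint, Spectral is a linter for JSON/YAML, especially OpenAPI documents (that is, Swagger YAML/JSON files). Compared with KtLint, the most interesting part of Spectral is its support for JSON Path (similar to XPath), which lets a rule target specific parts of an object. Here is an example Spectral rule: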

'oas3-valid-schema-example': {
  description: 'Examples must be valid against their defined schema.',
  message: '{{error}}',
  severity: 0,
  formats: [oas3],
  recommended: true,
  type: 'validation',
  given: [
    "$.components.schemas..[?(@property !== 'properties' && @ && (@ && @.example !== void 0 || @.default !== void 0) && (@.enum || @.type || @.format || @.$ref || @.properties || @.items))]",
    "$..content..[?(@property !== 'properties' && @ && (@ && @.example !== void 0 || @.default !== void 0) && (@.enum || @.type || @.format || @.$ref || @.properties || @.items))]",
    "$..headers..[?(@property !== 'properties' && @ && (@ && @.example !== void 0 || @.default !== void 0) && (@.enum || @.type || @.format || @.$ref || @.properties || @.items))]",
    "$..parameters..[?(@property !== 'properties' && @ && (@ && @.example !== void 0 || @.default !== void 0) && (@.enum || @.type || @.format || @.$ref || @.properties || @.items))]",
  ],
  then: {
    function: oasExample,
    functionOptions: {
      schemaField: '$',
      oasVersion: 3,
      type: 'schema',
    },
  },
}

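The given property in the object above selects the relevant parts of the document as the condition, and the function under then is executed against each match. For details, see the official documentation, "Custom Rulesets". Incidentally, Spectral uses nimma as its JSON Path expression engine.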
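The given/then split can be illustrated with a small sketch. It is not Spectral's engine: the select function below is a hypothetical helper that understands only '.key' and '.*' steps (no recursive descent and no filter expressions), and truthy and run_rule are likewise illustrative names.

```python
from typing import Any, Callable, Dict, List, Tuple


def select(document: Any, path: str) -> List[Tuple[str, Any]]:
    """Resolve a simplified path such as '$.paths.*.*' to (location, value) pairs."""
    matches: List[Tuple[str, Any]] = [("$", document)]
    for step in path.split(".")[1:]:
        next_matches: List[Tuple[str, Any]] = []
        for location, value in matches:
            if not isinstance(value, dict):
                continue
            if step == "*":
                keys = list(value)
            elif step in value:
                keys = [step]
            else:
                keys = []
            next_matches.extend((f"{location}.{key}", value[key]) for key in keys)
        matches = next_matches
    return matches


def truthy(value: Any) -> bool:
    """A 'then' function: passes when the value is present and non-empty."""
    return bool(value)


def run_rule(name: str, rule: Dict[str, Any], document: Any) -> List[str]:
    """Apply the rule's 'then' function to every target selected by 'given'."""
    field_name = rule["then"].get("field")
    check: Callable[[Any], bool] = rule["then"]["function"]
    problems: List[str] = []
    for location, target in select(document, rule["given"]):
        if field_name is not None:
            value = target.get(field_name) if isinstance(target, dict) else None
            where = f"{location}.{field_name}"
        else:
            value, where = target, location
        if not check(value):
            problems.append(f"{name}: {rule['message']} at {where}")
    return problems


rule = {
    "message": "Operation must have a description",
    "given": "$.paths.*.*",
    "then": {"field": "description", "function": truthy},
}
document = {
    "paths": {
        "/users": {
            "get": {"description": "List users"},
            "post": {},
        }
    }
}
print(run_rule("operation-description", rule, document))
```

Because the selector and the check are separate, the same function can be reused with a different given, which is what makes this rule format declarative.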

Spectral's model

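Compared with KtLint, Spectral is tied to OpenAPI/AsyncAPI and adds its own rule expressions, so its data model is somewhat more complex. The model covers a description, a message severity, given/then, and a context, as follows: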

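recommended. Whether the rule is part of the recommended configuration.
enabled. Whether the rule is enabled.
description. A description of the rule.
message. The error message.
documentationUrl. The URL of the rule's documentation.
severity. The severity level: `error`, `warn`, `info`, or `hint`.
formats. The document formats the rule applies to, such as OpenAPI 2.0 or OpenAPI 3.0.
resolved. Whether the rule runs against the resolved document (with $ref references replaced) rather than the original one.
given. Similar to a CSS selector; written in JSONPath, an XPath-like syntax for JSON.
then. What to apply to each selected target:
  field. The field to check.
  function. The function (for example, a pattern check) to run.
  functionOptions. The options passed to the function.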

In addition, it has a simple type system and matching expression checks, such as:

Casing. flat, camel, pascal, kebab, cobol, snake, macro.
Length. Maximum and minimum.
Numbers.
Boolean checks.
Type system. Enumerations.
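As an illustration of how small these checks can be, here is a sketch of a casing check and a length check. The regular expressions are simplified assumptions of my own, not Spectral's actual patterns, and the function signatures are hypothetical.

```python
import re
from typing import Optional, Sized, Union

# Simplified, assumed patterns for each casing style.
CASES = {
    "flat": r"[a-z][a-z0-9]*",
    "camel": r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)*",
    "pascal": r"[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)*",
    "kebab": r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*",
    "cobol": r"[A-Z][A-Z0-9]*(?:-[A-Z0-9]+)*",
    "snake": r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*",
    "macro": r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*",
}


def casing(value: str, case_type: str) -> bool:
    """Return True when the whole value matches the named casing style."""
    return re.fullmatch(CASES[case_type], value) is not None


def length(value: Union[int, float, Sized],
           minimum: Optional[int] = None,
           maximum: Optional[int] = None) -> bool:
    """Check a number's value, or a string/list/dict's size, against the bounds."""
    size = value if isinstance(value, (int, float)) else len(value)
    if minimum is not None and size < minimum:
        return False
    if maximum is not None and size > maximum:
        return False
    return True


print(casing("userId", "camel"), casing("user_id", "camel"))
print(length("abc", minimum=1, maximum=3), length([1, 2, 3, 4], maximum=3))
```

Each check is a pure function of a value plus options, which is the shape the function and functionOptions fields of a rule expect.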

Overall, Spectral's implementation is flexible and interesting.

SQLFluff

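Compared with KtLint and Spectral, which build on an existing data model, SQLFluff is more challenging: it builds its rules across many different database dialects. SQLFluff analyzes the source text directly, converting each dialect into basic elements (segments). It then processes those segments based on segment type plus rules. In short, a more abstract segment context is used to build the corresponding rule context. The rule context includes:

segment. At its core is BaseSegment, which defines the three basic operations of lexing, parsing, and linting, and yields segments such as groupby_clause, orderby_clause, and select_clause.
parent_stack.
siblings_pre.
siblings_post.
raw_stack.
memory.
dialect. The basis for parsing the grammar at runtime.
path. The file path.
templated_file. The templated file.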

An example of the parsed structure:

{
    "file": {
        "statement": {
            "select_statement": {
                "select_clause": {
                    "keyword": "SELECT",
                    "whitespace": " ",
                    "select_clause_element": {
                        "column_reference": {
                            "identifier": "foo"
                        }
                    }
                },
                "whitespace": " ",
                "from_clause": {
                    "keyword": "FROM",
                    "whitespace": " ",
                    "from_expression": {
                        "from_expression_element": {
                            "table_expression": {
                                "table_reference": {
                                    "identifier": "bar"
                                }
                            }
                        }
                    }
                }
            }
        },
        "statement_terminator": ";",
        "newline": "\n"
    }
}

The rules are then evaluated (eval) against these segments, as in the following example:

from typing import Optional

# Import paths as of the SQLFluff release current when this was written
# (an assumption; they may differ in other releases).
from sqlfluff.core.rules.base import BaseRule, LintResult, RuleContext
import sqlfluff.core.rules.functional.segment_predicates as sp


class Rule_L021(BaseRule):
    def _eval(self, context: RuleContext) -> Optional[LintResult]:
        """Ambiguous use of DISTINCT in select statement with GROUP BY."""
        segment = context.functional.segment
        if (
            segment.all(sp.is_type("select_statement"))
            # Do we have a group by clause
            and segment.children(sp.is_type("groupby_clause"))
        ):
            # Do we have the "DISTINCT" keyword in the select clause
            distinct = (
                segment.children(sp.is_type("select_clause"))
                .children(sp.is_type("select_clause_modifier"))
                .children(sp.is_type("keyword"))
                .select(sp.is_name("distinct"))
            )
            if distinct:
                return LintResult(anchor=distinct[0])
        return None

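Every rule check here is based on this abstract syntax tree. In a sense, it establishes a unified abstraction. I had planned to dig further, but the SQL dialects turned out to be full of regular expressions, so I chose a temporary retreat.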
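To show why rules written against segment types carry over between dialects, here is a sketch of the same DISTINCT plus GROUP BY check on a hand-built tree, without SQLFluff. Segment and ambiguous_distinct are hypothetical stand-ins, not SQLFluff's BaseSegment or its rule API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical parse-tree node: a type name, raw text, and children."""
    type: str
    raw: str = ""
    children: List["Segment"] = field(default_factory=list)

    def child(self, type_: str) -> Optional["Segment"]:
        """Return the first direct child of the given type, if any."""
        return next((c for c in self.children if c.type == type_), None)


def ambiguous_distinct(segment: Segment) -> Optional[Segment]:
    """Return the DISTINCT keyword when the select statement also has GROUP BY."""
    if segment.type != "select_statement":
        return None
    if segment.child("groupby_clause") is None:
        return None
    select_clause = segment.child("select_clause")
    modifier = select_clause.child("select_clause_modifier") if select_clause else None
    if modifier is None:
        return None
    for keyword in modifier.children:
        if keyword.type == "keyword" and keyword.raw.upper() == "DISTINCT":
            return keyword
    return None


# SELECT DISTINCT a FROM t GROUP BY a
query = Segment("select_statement", children=[
    Segment("select_clause", children=[
        Segment("keyword", "SELECT"),
        Segment("select_clause_modifier", children=[Segment("keyword", "DISTINCT")]),
        Segment("select_clause_element", "a"),
    ]),
    Segment("from_clause", "FROM t"),
    Segment("groupby_clause", "GROUP BY a"),
])

violation = ambiguous_distinct(query)
print(violation.raw if violation else None)
```

The check never looks at dialect-specific text, only at segment types, so any dialect that parses into the same segment names can reuse it unchanged.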

Expression language: OGNL

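I first came across OGNL while implementing SQL generation support for MyBatis in ArchGuard Scanner, where I saw OGNL expressions nested in the XML. In terms of implementation, it integrates with data more completely than the S-expressions in TreeSitter that I had previously considered. It could also serve the rule checks here, using expressions to match against data.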

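The Object Graph Navigation Language (OGNL) is an open-source expression language for Java. It is used to get and set properties of Java objects, and it offers additional features such as list projection and selection and lambda expressions. The same expression can be used both to get and to set a property's value. The Ognl class provides convenience methods for evaluating OGNL expressions. It can work in two stages, first parsing the expression into an internal form and then using that form to set or get the property's value, or in a single stage, getting or setting the property directly from the expression string.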

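// oc is assumed to be an OgnlContext that was set up earlier (not shown in the original).
// Assign 'jerry' to the name property of the root object.
Ognl.getValue("name='jerry'", oc, oc.getRoot());
// Assign 'jack' to the context variable #user1's name, then read it back.
String name2 = (String) Ognl.getValue("#user1.name='jack',#user1.name", oc, oc.getRoot());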
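The two-stage evaluation described above (parse once, then get or set with the parsed form) can be sketched for the simplest case, dotted property navigation. This is a hypothetical illustration and not OGNL's API: it has no projection, selection, lambdas, or #variables, and parse, get_value, and set_value are names made up for the example.

```python
from typing import Any, List


def parse(expression: str) -> List[str]:
    """Stage 1: turn 'profile.city' into the path ['profile', 'city']."""
    parts = expression.split(".")
    if not all(part.isidentifier() for part in parts):
        raise ValueError(f"unsupported expression: {expression!r}")
    return parts


def _step(target: Any, name: str) -> Any:
    """Navigate one property: dict key lookup or attribute access."""
    return target[name] if isinstance(target, dict) else getattr(target, name)


def get_value(path: List[str], root: Any) -> Any:
    """Stage 2 (read): walk the parsed path from the root object."""
    for name in path:
        root = _step(root, name)
    return root


def set_value(path: List[str], root: Any, value: Any) -> None:
    """Stage 2 (write): walk to the parent, then assign the last property."""
    target = get_value(path[:-1], root)
    if isinstance(target, dict):
        target[path[-1]] = value
    else:
        setattr(target, path[-1], value)


class User:
    def __init__(self, name: str) -> None:
        self.name = name
        self.profile = {"city": "Chengdu"}


user = User("jerry")
city_path = parse("profile.city")  # parsed once, reused for get and set
print(get_value(city_path, user))
set_value(city_path, user, "Beijing")
print(get_value(city_path, user))
```

Reusing the parsed path for both reading and writing mirrors the point that one expression serves as getter and setter.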

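I originally wanted to write an expression language modeled on OGNL, but found that its grammar is written with JavaCC and that there is no ANTLR implementation. So I am still looking for a more reasonable approach.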

Conclusion

As an analysis of the related tools, this is only a starting point.

Editor: 張燕妮. Source: Phodal全棧工程師

