Benchmarking Go Generics: Worse or Better Performance?

Author: 程序員ug 2022-03-29 11:48:40

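In this generics benchmark, every use case tests the subtraction functions for both int and float32. I also added a third option, an inferred data type, because I wanted to see how the generic function behaves when we let it infer the type as int.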

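Go 1.18 has been released, and generics are finally an official part of the Go language. So how will generics affect performance? Let's find out by benchmarking a few use cases.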

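There are plenty of articles and discussions about the new features in Go 1.18. One of those discussions is a topic I wanted to write about: what impact do generics have on performance? Many readers worry that generics will hurt performance, but my view is that generics will improve it. My reasoning is that generics let us skip type conversions, assertions, and reflection at runtime and rely on the compiler to resolve the types at compile time instead.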

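In my article on learning generics[1], I explained how generics are used. The two main benefits are fewer duplicated functions that differ only by data type, and avoiding interface{}. These are the use cases we will benchmark in this article to find out how performance changes.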

A note up front: I am not a benchmarking expert. I am a benchmarking novice, and in my opinion benchmarking is very hard.

To make the comparison fair, we will set up one test case for each approach. That means we will:

Benchmark with duplicated functions
Benchmark with generics
Benchmark with interface{}

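Preparing the functions to benchmark

We will reuse some code from learning generics[2], where we have a Subtract function that can subtract values of three Subtractable data types.

We want to determine which of the Subtract approaches performs best. You can try it out in the Playground[3].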

package functions

// SubtractInt will subtract the second value from the first
func SubtractInt(a, b int) int {
 return a - b
}

// SubtractInt64 will subtract the second value from the first
func SubtractInt64(a, b int64) int64 {
 return a - b
}

// SubtractFloat32 will subtract the second value from the first
func SubtractFloat32(a, b float32) float32 {
 return a - b
}
// SubtractTypeSwitch is used to subtract using interfaces
func SubtractTypeSwitch(a, b interface{}) interface{} {
 switch a.(type) {
 case int:
  return a.(int) - b.(int)
 case int64:
  return a.(int64) - b.(int64)
 case float32:
  return a.(float32) - b.(float32)
 default:
  return nil
 }
}

// Subtract is the generic version; it will subtract the second value from the first
func Subtract[V int64 | int | float32](a, b V) V {
 return a - b
}

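With that in place, we can start benchmarking the functions. They should be fairly easy to understand, and together they cover the possible solutions for subtraction: per-data-type functions, a type switch, and generics.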
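To make the difference between the last two approaches concrete, here is a minimal sketch that calls the functions above. The helper safeTypeSwitch is hypothetical and not part of the original code; it only exists to show that mismatched arguments fail at runtime in the type-switch version, while the generic version rejects them at compile time.

```go
package main

import "fmt"

// SubtractTypeSwitch mirrors the interface{}-based function from the article
// (the int64 case is left out to keep the sketch short).
func SubtractTypeSwitch(a, b interface{}) interface{} {
	switch a.(type) {
	case int:
		return a.(int) - b.(int)
	case float32:
		return a.(float32) - b.(float32)
	default:
		return nil
	}
}

// Subtract mirrors the generic function from the article.
func Subtract[V int64 | int | float32](a, b V) V {
	return a - b
}

// safeTypeSwitch is a hypothetical helper: it converts the panic raised by a
// failed type assertion into an error so the program can keep running.
func safeTypeSwitch(a, b interface{}) (result interface{}, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("runtime failure: %v", r)
		}
	}()
	return SubtractTypeSwitch(a, b), nil
}

func main() {
	// The type argument is inferred as int from the arguments.
	fmt.Println(Subtract(5, 3))
	// The type argument is given explicitly.
	fmt.Println(Subtract[float32](5.5, 2))

	// An int and a float32 are accepted by the compiler here,
	// but the assertion b.(int) panics when the code runs.
	_, err := safeTypeSwitch(5, float32(3))
	fmt.Println("type switch failed at runtime:", err != nil)

	// With the generic function the same mistake does not compile:
	// var x int = 5
	// var y float32 = 3
	// Subtract(x, y) // compile error: y does not match the inferred type int
}
```

The trade-off shown here is about when a mistake is detected, not about speed; the benchmarks below are what measure the speed.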

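Preparing the benchmark

Create a regular test file where we can store the benchmarks. If you are not familiar with benchmarks in Go, you can read the tutorial here[4].

At the top of the benchmark, I generate two slices, one of random integers and one of random float32 values. These random slices are used as input arguments for the subtraction functions.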

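Then we create b.Run calls, each of which runs one function as many times as the benchmarker is told to via the -benchtime flag. For this benchmark, I force the benchmarker to run each function 1000000000 times. If you don't specify the number of runs, the benchmarker runs the function as many times as it can within a given time. The functions would then end up running a different number of operations, and I want them all to run the same number.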
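The default behaviour described above, where the benchmarker picks the iteration count itself, can be tried in isolation. The following is a minimal sketch, not the benchmark used in this article: it uses testing.Benchmark from the standard library so it runs as an ordinary program, and the names sink and benchSubtractInt are assumptions made for the example. Storing the result in a package-level variable is a common precaution so the compiler has a harder time discarding the measured call.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable (an assumption for this sketch); writing
// results to it keeps the subtraction from being trivially optimized away.
var sink int

// SubtractInt mirrors the function from the article.
func SubtractInt(a, b int) int {
	return a - b
}

// benchSubtractInt is a hypothetical standalone benchmark function.
// The testing package decides b.N, the number of iterations.
func benchSubtractInt(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = SubtractInt(i, 1)
	}
}

func main() {
	// testing.Benchmark runs the function without the "go test" command and
	// grows b.N until the run has lasted long enough to be measured.
	result := testing.Benchmark(benchSubtractInt)

	// N is the iteration count that was chosen, T the total elapsed time.
	fmt.Println("iterations:", result.N)
	fmt.Println("ns/op:", float64(result.T.Nanoseconds())/float64(result.N))
}
```

Because the iteration count is chosen per benchmark here, two functions can end up with different values of N, which is the situation the fixed -benchtime=1000000000x setting avoids.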

This is what my final benchmark looks like.

The test file used to run the benchmark and determine the performance impact of generics.

package functions

import (
 "math/rand"
 "testing"
 "time"
)
// Benchmark_Subtract is used to determine the most performant solution to subtraction
func Benchmark_Subtract(b *testing.B) {
 // Create slices of random numbers to use as input for the functions.
 // I run with 1000000000 iterations, so the slices hold one extra element
 // because each call reads index i and i+1.
 // b.N is 1 in this outer function, so fill the slices by ranging over them
 // instead of looping up to b.N.
 numbers := make([]int, 1000000001)
 floatNumbers := make([]float32, 1000000001)

 // Create a random seed
 seed := rand.NewSource(time.Now().UnixNano())
 // Give the seed to the random package
 randomizer := rand.New(seed)
 for i := range numbers {
  // randomize numbers between 0-99
  numbers[i] = randomizer.Intn(100)
  floatNumbers[i] = float32(randomizer.Intn(100))
 }
 // run a benchmark for regular Ints
 b.Run("SubtractInt", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   SubtractInt(numbers[i], numbers[i+1])
  }
 })
 // run a benchmark for regular Floats
 b.Run("SubtractFloat", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   SubtractFloat32(floatNumbers[i], floatNumbers[i+1])
  }
 })
 // run a benchmark for TypeSwitched Ints
 b.Run("Type_Subtraction_int", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   SubtractTypeSwitch(numbers[i], numbers[i+1])
  }
 })
 // run a benchmark for TypeSwitched Floats
 b.Run("Type_Subtraction_float", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   SubtractTypeSwitch(floatNumbers[i], floatNumbers[i+1])
  }
 })

 // run a benchmark for Generic Ints
 b.Run("Generic_Subtraction_int", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   Subtract[int](numbers[i], numbers[i+1])
  }
 })
 // run a benchmark for Generic Floats
 b.Run("Generic_Subtraction_float", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   Subtract[float32](floatNumbers[i], floatNumbers[i+1])
  }
 })
 // run a benchmark where generic type is inferred
 b.Run("Generic_Inferred_int", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   Subtract(numbers[i], numbers[i+1])
  }
 })
}

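To run the benchmark, use the following command. Note that the -count 5 argument runs each benchmark 5 times, because a single run of each benchmark might give you unfair results.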

go test -v -bench=Benchmark -benchtime=1000000000x -count 5

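Analyzing the results

Each benchmark is printed together with the name of the function being run, which we can use to tell the functions apart. The second value is the number of operations run; in our case we set it to a fixed number, so every row should show the same value.

The third value is the interesting one: nanoseconds per operation (ns/op). This is the metric that shows the average speed of the function.

The benchmark results from the Go test tool.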

goos: windows
goarch: amd64
pkg: programmingpercy/benchgeneric
cpu: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
Benchmark_Subtract
Benchmark_Subtract/SubtractInt
Benchmark_Subtract/SubtractInt-4                1000000000               0.9002 ns/op
Benchmark_Subtract/SubtractInt-4                1000000000               0.8904 ns/op
Benchmark_Subtract/SubtractInt-4                1000000000               0.8277 ns/op
Benchmark_Subtract/SubtractInt-4                1000000000               0.8290 ns/op
Benchmark_Subtract/SubtractInt-4                1000000000               0.8266 ns/op
Benchmark_Subtract/SubtractFloat
Benchmark_Subtract/SubtractFloat-4              1000000000               0.8591 ns/op
Benchmark_Subtract/SubtractFloat-4              1000000000               0.8033 ns/op
Benchmark_Subtract/SubtractFloat-4              1000000000               0.8108 ns/op
Benchmark_Subtract/SubtractFloat-4              1000000000               0.8168 ns/op
Benchmark_Subtract/SubtractFloat-4              1000000000               0.8040 ns/op
Benchmark_Subtract/Type_Subtraction_int
Benchmark_Subtract/Type_Subtraction_int-4               1000000000               1.597 ns/op
Benchmark_Subtract/Type_Subtraction_int-4               1000000000               1.711 ns/op
Benchmark_Subtract/Type_Subtraction_int-4               1000000000               1.607 ns/op
Benchmark_Subtract/Type_Subtraction_int-4               1000000000               1.570 ns/op
Benchmark_Subtract/Type_Subtraction_int-4               1000000000               1.588 ns/op
Benchmark_Subtract/Type_Subtraction_float
Benchmark_Subtract/Type_Subtraction_float-4             1000000000               1.320 ns/op
Benchmark_Subtract/Type_Subtraction_float-4             1000000000               1.311 ns/op
Benchmark_Subtract/Type_Subtraction_float-4             1000000000               1.323 ns/op
Benchmark_Subtract/Type_Subtraction_float-4             1000000000               1.424 ns/op
Benchmark_Subtract/Type_Subtraction_float-4             1000000000               1.321 ns/op
Benchmark_Subtract/Generic_Subtraction_int
Benchmark_Subtract/Generic_Subtraction_int-4            1000000000               0.8251 ns/op
Benchmark_Subtract/Generic_Subtraction_int-4            1000000000               0.8288 ns/op
Benchmark_Subtract/Generic_Subtraction_int-4            1000000000               0.8420 ns/op
Benchmark_Subtract/Generic_Subtraction_int-4            1000000000               0.8377 ns/op
Benchmark_Subtract/Generic_Subtraction_int-4            1000000000               0.8357 ns/op
Benchmark_Subtract/Generic_Subtraction_float
Benchmark_Subtract/Generic_Subtraction_float-4          1000000000               0.7952 ns/op
Benchmark_Subtract/Generic_Subtraction_float-4          1000000000               0.7987 ns/op
Benchmark_Subtract/Generic_Subtraction_float-4          1000000000               0.7877 ns/op
Benchmark_Subtract/Generic_Subtraction_float-4          1000000000               0.8037 ns/op
Benchmark_Subtract/Generic_Subtraction_float-4          1000000000               0.8283 ns/op
Benchmark_Subtract/Generic_Inferred_int
Benchmark_Subtract/Generic_Inferred_int-4               1000000000               0.8297 ns/op
Benchmark_Subtract/Generic_Inferred_int-4               1000000000               0.8283 ns/op
Benchmark_Subtract/Generic_Inferred_int-4               1000000000               0.8319 ns/op
Benchmark_Subtract/Generic_Inferred_int-4               1000000000               0.8366 ns/op
Benchmark_Subtract/Generic_Inferred_int-4               1000000000               0.8623 ns/op
PASS
ok      programmingpercy/benchgeneric   37.114s

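From the results, we can conclude that the type-assertion function is much slower, by roughly 60-90%. In this test case that may seem ridiculous, since we are talking about half a nanosecond.

The generic functions perform roughly the same as the data-type-specific functions, with a very slight speed increase. That small increase may be due to other software running on my computer. The way I see it, once the compiler has done its job, a generic function call should be the same as a regular function call.

Another point we can see in the results is that int subtraction takes more time than float32 subtraction. Regular int subtraction averages 0.85478 ns/op and regular float32 subtraction averages 0.8188 ns/op. That means float32 subtraction is roughly 4% faster in my benchmark.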
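The averages and percentages discussed above can be recomputed from the ns/op values in the output. This is a small sketch of that arithmetic; the helper names mean and slowdownPercent are made up for the example, and the sample values are copied from the benchmark output.

```go
package main

import "fmt"

// mean returns the arithmetic mean of the given ns/op samples.
func mean(samples []float64) float64 {
	total := 0.0
	for _, s := range samples {
		total += s
	}
	return total / float64(len(samples))
}

// slowdownPercent reports how much more time slow takes than fast,
// expressed as a percentage of fast.
func slowdownPercent(slow, fast float64) float64 {
	return (slow - fast) / fast * 100
}

func main() {
	// ns/op values of the five runs, copied from the output above.
	intRuns := []float64{0.9002, 0.8904, 0.8277, 0.8290, 0.8266}
	floatRuns := []float64{0.8591, 0.8033, 0.8108, 0.8168, 0.8040}
	switchIntRuns := []float64{1.597, 1.711, 1.607, 1.570, 1.588}
	switchFloatRuns := []float64{1.320, 1.311, 1.323, 1.424, 1.321}

	intMean, floatMean := mean(intRuns), mean(floatRuns)
	fmt.Printf("int mean: %.5f ns/op\n", intMean)
	fmt.Printf("float32 mean: %.5f ns/op\n", floatMean)
	fmt.Printf("int vs float32: %.1f%% more time\n", slowdownPercent(intMean, floatMean))
	fmt.Printf("type switch int vs int: %.1f%% more time\n", slowdownPercent(mean(switchIntRuns), intMean))
	fmt.Printf("type switch float32 vs float32: %.1f%% more time\n", slowdownPercent(mean(switchFloatRuns), floatMean))
}
```

With only five runs per function, these figures are rough indications rather than precise measurements.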

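So, the key takeaways from this benchmark are:

As I expected, the type-assertion/type-conversion solution is the slowest
Generic functions and regular data-type functions perform the same
Float32 subtraction is faster than int subtraction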

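Benchmarking a real-world scenario

Let's compare a real-world scenario. In this use case we have two structs that can move, Person and Car. Both structs have a Move function that accepts a distance, but Person is passed the distance as a float32 while Car accepts an int.

Both structs are handled in the same workflow, so we want to process them in the same function.

The generic solution is to create generic structs whose data type we define at creation. The interface solution is to accept the struct as input, type-assert it, and convert the distance to the correct data type. We cannot give them a shared interface, because the data types differ.

The code examples contain an implementation of both the generic solution and the old type-assertion solution. The type-assertion names carry the suffix Regular so that it is easier to tell which code belongs to which solution.

The generic solution for performing Move on Cars and Persons with different data types.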

package benchmarking

// Subtractable is a type constraint that defines subtractable datatypes to be used in generic functions
type Subtractable interface {
 int | int64 | float32
}
// Moveable is the interface for moving an Entity
type Moveable[S Subtractable] interface {
 Move(S)
}

// Car is a Generic Struct with the type S to be defined
type Car[S Subtractable] struct {
 Name string
 DistanceMoved S
}

// Person is a Generic Struct with the type S to be defined
type Person[S Subtractable] struct {
 Name string
 DistanceMoved S
}

// Move adds meters to the distance the Person has moved
// The data type of meters is the type S chosen when the Person was created
func (p *Person[S]) Move(meters S) {
 p.DistanceMoved += meters
}
func (c *Car[S]) Move(meters S) {
 c.DistanceMoved += meters
}

// Move is a generic function that takes in a Generic Moveable and moves it
func Move[S Subtractable, V Moveable[S]](v V, meters S) {
 v.Move(meters)
}

Move with the type-assertion solution:

package benchmarking

// Below is the Type casting based Solution
//
type CarRegular struct {
 Name          string
 DistanceMoved int
}

type PersonRegular struct {
 Name          string
 DistanceMoved float32
}

func (p *PersonRegular) Move(meters float32) {
 p.DistanceMoved += meters
}

func (c *CarRegular) Move(meters int) {
 c.DistanceMoved += meters
}

func MoveRegular(v interface{}, distance float32) {
 switch v.(type) {
 case *PersonRegular:
  v.(*PersonRegular).Move(distance)
 case *CarRegular:
  v.(*CarRegular).Move(int(distance))
 default:
  // Handle Unsupported types, not needed by Generic solution as Compiler does this for you
 }
}
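As a side note, the type switch in MoveRegular can also bind the asserted value to a variable, which avoids repeating the assertion in every case. The sketch below shows that form under the assumption that an error is an acceptable way to report unsupported types; moveRegularBound is a hypothetical name, and this variant was not part of the benchmark.

```go
package main

import "fmt"

// CarRegular and PersonRegular mirror the structs from the article.
type CarRegular struct {
	Name          string
	DistanceMoved int
}

type PersonRegular struct {
	Name          string
	DistanceMoved float32
}

func (p *PersonRegular) Move(meters float32) {
	p.DistanceMoved += meters
}

func (c *CarRegular) Move(meters int) {
	c.DistanceMoved += meters
}

// moveRegularBound is a hypothetical variant of MoveRegular. The switch binds
// the asserted value to m, so each case can call Move on it directly.
func moveRegularBound(v interface{}, distance float32) error {
	switch m := v.(type) {
	case *PersonRegular:
		m.Move(distance)
	case *CarRegular:
		m.Move(int(distance))
	default:
		// Unsupported types are only detected here, at runtime.
		return fmt.Errorf("unsupported type %T", v)
	}
	return nil
}

func main() {
	p := &PersonRegular{Name: "John"}
	c := &CarRegular{Name: "Ferrari"}

	fmt.Println(moveRegularBound(p, 10.2), p.DistanceMoved)
	fmt.Println(moveRegularBound(c, 10), c.DistanceMoved)
	fmt.Println(moveRegularBound("bicycle", 1))
}
```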

Now that we have both solutions, it is time to benchmark. I create the Persons and Cars before the benchmark, and we measure the performance of Move and MoveRegular.

package benchmarking

import "testing"

func Benchmark_Structures(b *testing.B) {
 // Init the structs
 p := &Person[float32]{Name: "John"}
 c := &Car[int]{Name: "Ferrari"}

 pRegular := &PersonRegular{Name: "John"}
 cRegular := &CarRegular{Name: "Ferrari"}

 // Run the test
 b.Run("Person_Generic_Move", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   // generic will try to use float64 if we don't tell it that it is a float32
   Move[float32](p, 10.2)
  }
 })

 b.Run("Car_Generic_Move", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   Move(c, 10)
  }
 })

 b.Run("Person_Regular_Move", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   MoveRegular(pRegular, 10.2)
  }
 })

 b.Run("Car_Regular_Move", func(b *testing.B) {
  for i := 0; i < b.N; i++ {
   MoveRegular(cRegular, 10)
  }
 })
}
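The comment in the Person benchmark notes that the constant 10.2 is treated as a float64 unless told otherwise. As a minimal sketch, here are two ways to make the type explicit: the type argument used in the benchmark, and a typed argument (an alternative not used in the original code). Only Person is reproduced to keep the example short.

```go
package main

import "fmt"

// The declarations below mirror the generic solution from the article.
type Subtractable interface {
	int | int64 | float32
}

type Moveable[S Subtractable] interface {
	Move(S)
}

type Person[S Subtractable] struct {
	Name          string
	DistanceMoved S
}

func (p *Person[S]) Move(meters S) {
	p.DistanceMoved += meters
}

func Move[S Subtractable, V Moveable[S]](v V, meters S) {
	v.Move(meters)
}

func main() {
	p := &Person[float32]{Name: "John"}

	// Option 1: give the type argument explicitly, as the benchmark does.
	Move[float32](p, 10.2)

	// Option 2 (an alternative for illustration): pass a typed value,
	// so S can be inferred as float32 from the argument itself.
	Move(p, float32(10.2))

	// Per the comment in the benchmark (written for Go 1.18), a bare
	// Move(p, 10.2) would treat the untyped constant as float64.
	fmt.Println(p.DistanceMoved)
}
```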

I ran the test with the following command:

go test -v -bench=Benchmark_Structures -benchtime=1000000000x -count 5

The results of running the benchmark:

goos: windows
goarch: amd64
pkg: programmingpercy/benchgeneric
cpu: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
Benchmark_Structures
Benchmark_Structures/Person_Generic_Move
Benchmark_Structures/Person_Generic_Move-4              1000000000               4.690 ns/op
Benchmark_Structures/Person_Generic_Move-4              1000000000               4.668 ns/op
Benchmark_Structures/Person_Generic_Move-4              1000000000               4.727 ns/op
Benchmark_Structures/Person_Generic_Move-4              1000000000               4.664 ns/op
Benchmark_Structures/Person_Generic_Move-4              1000000000               4.699 ns/op
Benchmark_Structures/Car_Generic_Move
Benchmark_Structures/Car_Generic_Move-4                 1000000000               3.176 ns/op
Benchmark_Structures/Car_Generic_Move-4                 1000000000               3.188 ns/op
Benchmark_Structures/Car_Generic_Move-4                 1000000000               3.296 ns/op
Benchmark_Structures/Car_Generic_Move-4                 1000000000               3.144 ns/op
Benchmark_Structures/Car_Generic_Move-4                 1000000000               3.156 ns/op
Benchmark_Structures/Person_Regular_Move
Benchmark_Structures/Person_Regular_Move-4              1000000000               4.694 ns/op
Benchmark_Structures/Person_Regular_Move-4              1000000000               4.634 ns/op
Benchmark_Structures/Person_Regular_Move-4              1000000000               4.677 ns/op
Benchmark_Structures/Person_Regular_Move-4              1000000000               4.660 ns/op
Benchmark_Structures/Person_Regular_Move-4              1000000000               4.626 ns/op
Benchmark_Structures/Car_Regular_Move
Benchmark_Structures/Car_Regular_Move-4                 1000000000               2.560 ns/op
Benchmark_Structures/Car_Regular_Move-4                 1000000000               2.555 ns/op
Benchmark_Structures/Car_Regular_Move-4                 1000000000               2.553 ns/op
Benchmark_Structures/Car_Regular_Move-4                 1000000000               2.579 ns/op
Benchmark_Structures/Car_Regular_Move-4                 1000000000               2.560 ns/op
PASS
ok      programmingpercy/benchgeneric   75.830s

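I was a little surprised to see the type-assertion solution come out faster than the generic one. I ran the benchmark several times to make sure it was not a fluke.

We can see from the benchmark that both int-based Car solutions are faster than the float32-based Person solutions.

The Person Move method performs the same whether we use the generic or the regular solution. The Cars, however, differ: the type-asserted Car is the fastest, taking about 20% less time than the generic one.

So, the key takeaways from this benchmark are as follows.

The float-based types perform the same, while the type-asserted integer Car is faster, which is not what I expected.
Float32 addition is slower than int addition.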

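Conclusion

So, we have now tested a few use cases where I can see generics being useful.

Honestly, I had hoped the second benchmark would also show generics to be faster. That would have further supported my claim that generics perform better because types are resolved at compile time rather than at runtime.

In the first use case we can see a considerable performance gain from using generics or data-type-specific functions. I know a few nanoseconds may seem ridiculous, but there are use cases where this kind of extreme optimization matters. I once built a high-performance network sniffer that had to process large amounts of network data in real time. Writing software like that needs every optimization you can get.

We have seen that choosing the right data type can have a large impact on performance. Still, I think we can say that readers who worry generics will slow their software down can relax. On the bright side, generic solutions make it easier for us to swap data types, which can improve performance.

On the other hand, type assertions and type conversions in Go appear to perform extremely well.

As we have seen, many factors affect the results, such as the arithmetic operator[5] used, the data types, and so on. There may also be mistakes in my benchmarks that I am not aware of.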

Original article: https://programmingpercy.tech/blog/benchmarking-generics-in-go

References

[1] Learning generics: https://programmingpercy.tech/blog/learning-generics-in-go

[2] Learning generics: https://programmingpercy.tech/blog/learning-generics-in-go

[3] Playground: https://go.dev/play/p/BLU8pHOzmvS

[4] Tutorial: https://betterprogramming.pub/we-measure-the-power-of-cars-computers-and-cellphones-but-what-about-code-91ed5583f298

[5] Arithmetic operator: https://www.techopedia.com/definition/25582/arithmetic-operator

Editor: 武曉燕 Source: 幽鬼
