Git Internals: Git Object Storage

Author: 彭金金 2018-07-27 10:39:13

In Git Internals: Git Object Hashing, we explained how the hash of a Git object is computed. This article covers how Git objects are stored.

How It Works

Blob objects, tree objects, and commit objects are all stored under the .git/objects directory, which is laid out as follows:

.git 
|-- objects 
    |-- 01 
    |   |-- 55eb4229851634a0f03eb265b69f5a2d56f341 
    |-- 1f 
    |   |-- 7a7a472abf3dd9643fd615f6da379c4acb3e3a 
    |-- 83 
        |-- baae61804e65cc73a7201a7252750c76066a30

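As the layout above shows, the 40-character hash of a Git object is split into two parts: the first 2 characters name the directory and the remaining 38 characters name the object file. The storage path of a Git object therefore follows this rule: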

.git/objects/hash[0, 2]/hash[2, 40]
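
The rule can be sketched in a few lines of Node.js. The helper name `objectPath` and its `gitDir` parameter are assumptions made for illustration; they are not part of Git or of the implementation later in this article.

```javascript
const path = require('path')

// Hypothetical helper: maps a 40-character SHA-1 hex string to its object path.
function objectPath(hash, gitDir = '.git') {
  if (!/^[0-9a-f]{40}$/.test(hash)) {
    throw new Error(`not a 40-character hex hash: ${hash}`)
  }
  // The first 2 characters name the directory, the remaining 38 name the file
  return path.join(gitDir, 'objects', hash.slice(0, 2), hash.slice(2))
}

console.log(objectPath('0155eb4229851634a0f03eb265b69f5a2d56f341'))
```

On a POSIX system this prints the path of the first object in the directory listing above.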

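This raises a question: why does Git lay out the directory this way instead of using the full 40-character hash as the file name? There are two reasons:

1. Some file systems limit the number of files in a directory. For example, FAT32 caps a single directory at roughly 65,535 entries, so copying a Git repository to a USB drive could run into problems.
2. Some file systems look up a file by scanning the directory linearly, so the more files a directory holds, the slower access becomes.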

As described in Git Internals: Git Object Hashing, Git prepends a header to the original content:

store = header + content
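
As a small illustration, the snippet below builds `store` for a 12-byte blob. It assumes the header format `<type> <byte length>\0`, which is the format used by the Node.js implementation later in this article.

```javascript
// Build the object header: "<type> <byte length of content>\0"
const content = Buffer.from('hello, world')
const header = Buffer.from(`blob ${content.length}\0`)
// store = header + content
const store = Buffer.concat([header, content])

// JSON.stringify makes the NUL byte visible as \u0000
console.log(JSON.stringify(store.toString()))
console.log(store.length)
```

The header is 8 bytes ("blob 12" plus the NUL byte), so `store` is 20 bytes long in total.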

Before a Git object is stored, it is compressed with zlib's deflate algorithm. In short:

zlib_store = zlib.deflate(store)

The compressed zlib_store is then written under the .git/objects directory according to the object path rule.
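
The compression step is lossless, which a quick round trip with Node's built-in `zlib` module can confirm. This is only a sketch; the variable names `zlibStore` and `restored` are chosen for illustration.

```javascript
const zlib = require('zlib')

const store = Buffer.from('blob 12\0hello, world')
// Compress with zlib's deflate, as done before writing to .git/objects
const zlibStore = zlib.deflateSync(store)
// Inflating recovers the original header + content byte for byte
const restored = zlib.inflateSync(zlibStore)

console.log(restored.equals(store)) // true
```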

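To summarize, storing a Git object takes the following steps:

1. Compute the byte length of content and build the header.
2. Prepend the header to content to form the Git object.
3. Compute the 40-character hash of the Git object with SHA-1.
4. Compress the Git object with zlib's deflate algorithm.
5. Write the compressed Git object to .git/objects/hash[0, 2]/hash[2, 40].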

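Node.js Implementation

Next, we implement the functionality of git hash-object -w in Node.js, that is, computing the hash of a Git object and writing the object into Git's object store: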

const fs = require('fs')
const crypto = require('crypto')
const zlib = require('zlib')

function gitHashObject(content, type) {
  // Build the header: "<type> <byte length of content>\0"
  const header = `${type} ${Buffer.from(content).length}\0`
  // Build the Git object: header + content
  const store = Buffer.concat([Buffer.from(header), Buffer.from(content)])
  // Compute the SHA-1 hash of the Git object
  const sha1 = crypto.createHash('sha1')
  sha1.update(store)
  const hash = sha1.digest('hex')
  // Compress the Git object with zlib deflate
  const zlib_store = zlib.deflateSync(store)
  // Store the Git object; `recursive: true` keeps mkdirSync from
  // throwing when the two-character directory already exists
  const dir = `.git/objects/${hash.substring(0, 2)}`
  fs.mkdirSync(dir, { recursive: true })
  fs.writeFileSync(`${dir}/${hash.substring(2, 40)}`, zlib_store)
  console.log(hash)
}

// Entry point: node index.js <content> <type>
gitHashObject(process.argv[2], process.argv[3])

Finally, test whether the Git object is stored correctly:

$ node index.js 'hello, world' blob 
8c01d89ae06311834ee4b1fab2f0414d35f01102 
$ git cat-file -p 8c01d89ae06311834ee4b1fab2f0414d35f01102 
hello, world

As shown, we produced a valid Git blob object, which confirms that the algorithm is correct.
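
The reverse direction can be sketched as well: read the file, inflate it, and split the header from the content. The helper `readObject` below is hypothetical and only handles objects stored as individual files in the layout described above, not objects inside packfiles. The demo writes into a throwaway temporary directory so that no real repository is touched.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const zlib = require('zlib')

// Hypothetical helper: reads one object file and splits it into type, size and content.
function readObject(gitDir, hash) {
  const file = path.join(gitDir, 'objects', hash.slice(0, 2), hash.slice(2))
  const store = zlib.inflateSync(fs.readFileSync(file))
  // The header ends at the first NUL byte: "<type> <size>\0"
  const nul = store.indexOf(0)
  const [type, size] = store.subarray(0, nul).toString().split(' ')
  const content = store.subarray(nul + 1)
  if (content.length !== Number(size)) {
    throw new Error('size in header does not match content length')
  }
  return { type, size: Number(size), content }
}

// Demo: store the object from the test above in a temporary directory, then read it back
const gitDir = fs.mkdtempSync(path.join(os.tmpdir(), 'git-demo-'))
const hash = '8c01d89ae06311834ee4b1fab2f0414d35f01102'
const dir = path.join(gitDir, 'objects', hash.slice(0, 2))
fs.mkdirSync(dir, { recursive: true })
fs.writeFileSync(
  path.join(dir, hash.slice(2)),
  zlib.deflateSync(Buffer.from('blob 12\0hello, world'))
)

const obj = readObject(gitDir, hash)
console.log(obj.type, obj.size, obj.content.toString())
```

The size check is a cheap sanity test: if the length recorded in the header disagrees with the actual content length, the file is not a well-formed object.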

Editor: 武曉燕 Source: jingsam

