
Accelerate Machine Learning Model Serving with FastAPI and Redis Caching

This guide walks through speeding up model inference by caching requests and returning responses quickly.

Translator | 布加迪

Reviewer | 重樓

Redis is an open-source, in-memory data structure store and a go-to choice for caching in machine learning applications. Its speed, durability, and support for a variety of data structures make it ideal for meeting the high-throughput demands of real-time inference tasks.

In this tutorial, we will explore why Redis caching matters in machine learning workflows and demonstrate how to build a robust machine learning application with FastAPI and Redis. The tutorial covers installing Redis on Windows, running it locally, and integrating it into a machine learning project. Finally, we will test the application by sending both repeated and unique requests to verify that the Redis caching system is working correctly.

Why Use Redis Caching in Machine Learning?

In today's fast-paced digital environment, users expect machine learning applications to deliver results instantly. Consider, for example, an e-commerce platform that uses a recommendation model to suggest products to users. By caching repeated requests with Redis, the platform can significantly reduce response times.

When a user requests product recommendations, the system first checks whether that request has already been cached. If it has, the cached response is returned within microseconds, providing a seamless experience. If not, the model processes the request, generates the recommendations, and stores the result in Redis for future requests. This approach not only improves user satisfaction but also conserves server resources, allowing the model to handle more requests efficiently.
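As a quick illustration of this pattern, here is a minimal Python sketch of the cache-aside flow described above. The `recommend_products` function, the `recommendations:` key prefix, and the one-hour expiry are illustrative assumptions, not part of the project built below.

import json
import redis

# Connect to a local Redis instance (default settings assumed)
r = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)

def recommend_products(user_id):
    # Placeholder for an expensive recommendation model call
    return ["product_a", "product_b", "product_c"]

def get_recommendations(user_id):
    cache_key = f"recommendations:{user_id}"
    cached = r.get(cache_key)
    if cached is not None:
        # Cache hit: return the stored result without running the model
        return json.loads(cached)
    # Cache miss: run the model, then store the result for one hour
    result = recommend_products(user_id)
    r.setex(cache_key, 3600, json.dumps(result))
    return result

print(get_recommendations("user_42"))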

Building a Phishing Email Classification Application with Redis

In this project, we will build a phishing email classification application. The overall process involves loading and processing a dataset from Kaggle, training a machine learning model on the processed data, evaluating its performance, saving the trained model, and finally building a FastAPI application with Redis integration.

1. Setup

  • Install the Redis Python client:
pip install redis
  • If you are on Windows and do not have the Windows Subsystem for Linux (WSL) installed, follow Microsoft's guide to enable WSL and install a Linux distribution (such as Ubuntu) from the Microsoft Store.
  • Once WSL is set up, open the WSL terminal and run the following commands to install Redis:
sudo apt update
sudo apt install redis-server
  • To start the Redis server, run:
sudo service redis-server start

You should see a confirmation message indicating that redis-server has started successfully.
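To double-check from Python that the server is reachable, you can ping it with the redis-py client installed earlier; this is a minimal sketch assuming the default localhost:6379 setup.

import redis

# Connect to the local Redis server started above (default host/port assumed)
client = redis.Redis(host="localhost", port=6379, db=0)

# ping() returns True when the server replies with PONG
print("Redis reachable:", client.ping())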

2. Model Training

The training script loads the dataset, processes the data, trains the model, and saves it locally.

import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

def main():
    # Load dataset
    df = pd.read_csv("data/Phishing_Email.csv")  # adjust the path as necessary

    # The dataset has the columns "Email Text" and "Email Type"
    X = df["Email Text"].fillna("")
    y = df["Email Type"]

    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Create a pipeline with TF-IDF and Logistic Regression
    pipeline = Pipeline(
        [
            ("tfidf", TfidfVectorizer(stop_words="english")),
            ("clf", LogisticRegression(solver="liblinear")),
        ]
    )

    # Train the model
    pipeline.fit(X_train, y_train)

    # Save the trained model to a file
    joblib.dump(pipeline, "phishing_model.pkl")
    print("Model trained and saved as phishing_model.pkl")

if __name__ == "__main__":
    main()


python train.py


Model trained and saved as phishing_model.pkl

3. Model Evaluation

The evaluation script loads the dataset and the saved model file to evaluate the model.

import joblib
import pandas as pd
from sklearn.metrics import classification_report, accuracy_score
from sklearn.model_selection import train_test_split

def main():
    # Load dataset
    df = pd.read_csv("data/Phishing_Email.csv")  # adjust the path as necessary

    # The dataset has the columns "Email Text" and "Email Type"
    X = df["Email Text"].fillna("")
    y = df["Email Type"]

    # Split the dataset (same split as in training)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Load the trained model
    model = joblib.load("phishing_model.pkl")

    # Make predictions on the test set
    y_pred = model.predict(X_test)

    # Evaluate the model
    print("Accuracy: ", accuracy_score(y_test, y_pred))
    print("Classification Report:")
    print(classification_report(y_test, y_pred))

if __name__ == "__main__":
    main()

The results are nearly perfect, with excellent F1 scores.

python validate.py

Accuracy: 0.9723860589812332
Classification Report:
                precision    recall  f1-score   support

Phishing Email       0.96      0.97      0.96      1457
    Safe Email       0.98      0.97      0.98      2273

      accuracy                           0.97      3730
     macro avg       0.97      0.97      0.97      3730
  weighted avg       0.97      0.97      0.97      3730

4. Serving the Model with Redis

To serve the model, we will create a REST API with FastAPI and integrate Redis to cache predictions.

import asyncio
import json
import joblib
from fastapi import FastAPI
from pydantic import BaseModel
import redis.asyncio as redis

# Create an asynchronous Redis client (make sure Redis is running on localhost:6379)
redis_client = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)

# Load the trained model (synchronously)
model = joblib.load("phishing_model.pkl")

app = FastAPI()

# Define the request and response data models
class PredictionRequest(BaseModel):
    text: str

class PredictionResponse(BaseModel):
    prediction: str
    probability: float

@app.post("/predict", response_model=PredictionResponse)
async def predict_email(data: PredictionRequest):
    # Use the email text as a cache key
    cache_key = f"prediction:{data.text}"
    cached = await redis_client.get(cache_key)
    if cached:
        return json.loads(cached)

    # Run model inference in a thread to avoid blocking the event loop
    pred = await asyncio.to_thread(model.predict, [data.text])
    prob = await asyncio.to_thread(lambda: model.predict_proba([data.text])[0].max())

    result = {"prediction": str(pred[0]), "probability": float(prob)}

    # Cache the result for 1 hour (3600 seconds)
    await redis_client.setex(cache_key, 3600, json.dumps(result))
    return result

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)

python serve.py

INFO: Started server process [17640]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

You can view the REST API documentation in your browser; FastAPI serves its interactive docs at the /docs path by default (http://localhost:8000/docs with the settings above).

The source code, configuration files, model, and dataset for this project are available in the kingabzpro/Redis-ml-project GitHub repository. If you run into any issues with the code above, feel free to refer to the repository.

How Redis Caching Works in Machine Learning Applications

Here is a step-by-step explanation of how Redis caching operates in our machine learning application:

  • The client submits input data and requests a prediction from the machine learning model.
  • The system generates a unique identifier from the input data to check whether a prediction already exists (a key-generation sketch follows this list).
  • The system queries the Redis cache with the generated key to look for a previously stored prediction.

A. If a cached prediction is found, it is retrieved and returned as a JSON response.

B. If no cached prediction is found, the input data is passed to the machine learning model to generate a new prediction.

  • The newly generated prediction is stored in the Redis cache for future use.
  • The final result is returned to the client in JSON format.
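In the serving code above, the raw email text is used directly as the Redis key. A common variation for long inputs, sketched below as an assumption rather than part of the project code, is to hash the input so the key stays compact while remaining deterministic.

import hashlib

def make_cache_key(text):
    # Hash the input so the Redis key stays short even for very long emails
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"prediction:{digest}"

# The same text always maps to the same key, so cache lookups still work
print(make_cache_key("urgent action required: click here to reset your password"))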

Testing the Phishing Email Classification Application

Now that the phishing email classification application is built, it is time to test its functionality. In this section, we will evaluate the application by sending several emails with `cURL` commands and analyzing the responses. We will also inspect the Redis database to make sure the caching system is working correctly.

Testing the API with cURL Commands

To test the API, we will send five requests to the `/predict` endpoint. Three of them contain unique email texts, while the other two are duplicates of previously sent emails. This lets us verify both the prediction accuracy and the caching mechanism.

echo "\n===== Testing API Endpoint with 5 Requests =====\n"

# First unique email
echo "\n----- Request 1 (First unique email) -----"
curl -X 'POST' \
 'http://localhost:8000/predict' \
 -H 'accept: application/json' \
 -H 'Content-Type: application/json' \
 -d '{
 "text": "todays floor meeting you may get a few pointed questions about today article about lays potential severance of $ 80 mm"
}'

# Second unique email
echo "\n\n----- Request 2 (Second unique email) -----"
curl -X 'POST' \
 'http://localhost:8000/predict' \
 -H 'accept: application/json' \
 -H 'Content-Type: application/json' \
 -d '{
 "text": "urgent action required: your account has been compromised, click here to reset your password immediately"
}'

# First duplicate (same as first email)
echo "\n\n----- Request 3 (Duplicate of first email - should be cached) -----"
curl -X 'POST' \
 'http://localhost:8000/predict' \
 -H 'accept: application/json' \
 -H 'Content-Type: application/json' \
 -d '{
 "text": "todays floor meeting you may get a few pointed questions about today article about lays potential severance of $ 80 mm"
}'

# Third unique email
echo "\n\n----- Request 4 (Third unique email) -----"
curl -X 'POST' \
 'http://localhost:8000/predict' \
 -H 'accept: application/json' \
 -H 'Content-Type: application/json' \
 -d '{
 "text": "congratulations you have won a free iphone, click here to claim your prize now before it expires"
}'

# Second duplicate (same as second email)
echo "\n\n----- Request 5 (Duplicate of second email - should be cached) -----"
curl -X 'POST' \
 'http://localhost:8000/predict' \
 -H 'accept: application/json' \
 -H 'Content-Type: application/json' \
 -d '{
 "text": "urgent action required: your account has been compromised, click here to reset your password immediately"
}'

echo "\n\n===== Test Complete =====\n"
echo "Now run 'python check_redis.py' to verify the Redis cache entries"

When you run the script above, the API should return a prediction for each email. For the duplicate requests, the responses should be retrieved from the Redis cache, ensuring faster response times.

sh test.sh


\n===== Testing API Endpoint with 5 Requests =====\n
\n----- Request 1 (First unique email) -----
{"prediction":"Safe Email","probability":0.7791625553383463}\n\n----- Request 2 (Second unique email) -----
{"prediction":"Phishing Email","probability":0.8895319031315131}\n\n----- Request 3 (Duplicate of first email - should be cached) -----
{"prediction":"Safe Email","probability":0.7791625553383463}\n\n----- Request 4 (Third unique email) -----
{"prediction":"Phishing Email","probability":0.9169092144856761}\n\n----- Request 5 (Duplicate of second email - should be cached) -----
{"prediction":"Phishing Email","probability":0.8895319031315131}\n\n===== Test Complete =====\n
Now run 'python check_redis.py' to verify the Redis cache entries
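To see the caching effect in numbers, a small timing sketch like the following can be used. It assumes the `requests` package is installed, the server from serve.py is running locally, and the email text is just an example.

import time
import requests

URL = "http://localhost:8000/predict"
payload = {"text": "urgent action required: your account has been compromised"}

# First request: cache miss, so the model runs
start = time.perf_counter()
requests.post(URL, json=payload, timeout=10)
print(f"First request (model inference): {time.perf_counter() - start:.4f} s")

# Second request with the same text: cache hit, answered from Redis
start = time.perf_counter()
requests.post(URL, json=payload, timeout=10)
print(f"Second request (served from cache): {time.perf_counter() - start:.4f} s")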

Verifying the Redis Cache

To confirm that the caching system is working correctly, we will inspect the Redis database with the Python script `check_redis.py`. The script retrieves the cached predictions and displays them in a table.

import json
import redis
from tabulate import tabulate

def main():
    # Connect to Redis (ensure Redis is running on localhost:6379)
    redis_client = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)

    # Retrieve all keys that start with "prediction:"
    keys = redis_client.keys("prediction:*")
    total_entries = len(keys)
    print(f"Total number of cached prediction entries: {total_entries}\n")

    table_data = []
    # Process only the first 5 entries
    for key in keys[:5]:
        # Remove the 'prediction:' prefix to get the original email text
        email_text = key.replace("prediction:", "", 1)

        # Retrieve the cached value
        value = redis_client.get(key)
        try:
            data = json.loads(value)
        except json.JSONDecodeError:
            data = {}

        prediction = data.get("prediction", "N/A")

        # Display only the first 7 words of the email text
        words = email_text.split()
        truncated_text = " ".join(words[:7]) + ("..." if len(words) > 7 else "")

        table_data.append([truncated_text, prediction])

    # Print table using tabulate (only two columns)
    headers = ["Email Text (First 7 Words)", "Prediction"]
    print(tabulate(table_data, headers=headers, tablefmt="pretty"))

if __name__ == "__main__":
    main()

When you run the check_redis.py script, it displays the number of cached entries and the cached predictions in a table.

python check_redis.py


Total number of cached prediction entries: 3

+--------------------------------------------------+----------------+
|            Email Text (First 7 Words)            |   Prediction   |
+--------------------------------------------------+----------------+
|  congratulations you have won a free iphone,...  | Phishing Email |
| urgent action required: your account has been... | Phishing Email |
|       todays floor meeting you may get a...      |   Safe Email   |
+--------------------------------------------------+----------------+

Conclusion

By testing the phishing email classification application with multiple requests, we demonstrated that the API can accurately identify phishing emails while efficiently caching repeated requests with Redis. This caching mechanism delivers a significant performance boost by eliminating redundant computation for duplicate inputs, which is especially valuable in real-world scenarios where the API handles heavy traffic.

Although this is a relatively simple machine learning model, the benefits of caching become even more pronounced with larger, more complex models such as image recognition. For example, if you deploy a large-scale image classification model, caching predictions for frequently processed inputs can save substantial compute resources and dramatically reduce response times.

Original article: Accelerate Machine Learning Model Serving with FastAPI and Redis Caching, by Abid Ali Awan

Editor: 姜華  Source: 51CTO