Seldon Core: A Model Serving Management Tool

[Interactive AI Service Team, Kim Min-seok]

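As we develop and operate services of various sizes, we end up serving more and more machine learning models. Changing an existing model then means identifying exactly which services and system components depend on it and updating them accordingly. On top of that, models need comprehensive management across several dimensions: version control, performance monitoring, deployment strategy, A/B testing, and more. In this post I would like to share Seldon Core, a tool that helps manage this complexity.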

Unified Model Management

To serve a custom model with Seldon Core, you containerize the model and deploy it from the resulting image.

Let's walk through a simple sample to see how Seldon Core can be used.

train.py

import pickle
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Build the model pipeline (scaling + logistic regression)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))

# Train the model
model.fit(X_train, y_train)

# Save the model
with open('model.pickle', 'wb') as f:
    pickle.dump(model, f)

Model.py

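import pickle


class Model:
    def __init__(self):
        # Load the pipeline saved by train.py; model.pickle must be in the image.
        with open('model.pickle', 'rb') as f:
            self._model = pickle.load(f)

    def predict(self, X, features_names=None):
        # The Seldon Python wrapper passes feature names as a second argument.
        # A scikit-learn pipeline is not callable, so call its predict method.
        output = self._model.predict(X)
        return output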

$ s2i build . seldonio/seldon-core-s2i-python3:0.18 sklearn_iris:0.1 --env MODEL_NAME=Model --env SERVICE_TYPE=MODEL --env PERSISTENCE=0

$ kubectl apply -f - << END
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris-model
  namespace: seldon-system
spec:
  name: iris
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: sklearn_iris:0.1
    graph:
      name: classifier
    name: default
    replicas: 1
END

$ kubens seldon-system
$ kubectl get svc

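You can now see the model that was created under the name iris-model. This example deployed a custom model built with scikit-learn, but Seldon Core also works with a range of other frameworks and runtimes such as Triton, TensorRT, and ONNX, so it should be able to manage most machine learning models.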

https://docs.seldon.io/projects/seldon-core/en/latest/workflow/github-readme.html#seldon-core-v2-now-available
https://docs.seldon.io/projects/seldon-core/en/latest/examples/triton_gpt2_example.html
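Once the service is listed, the deployed model can be called over REST. The following is a minimal sketch rather than part of the original walkthrough. It assumes the deployment is exposed through an ingress (Istio or Ambassador) and uses the Seldon Core v1 prediction path and the "ndarray" payload format. The host localhost:8080 is a placeholder that has to be replaced with your own ingress address; the namespace and deployment name are taken from the manifest above.

```python
import json
import urllib.request

# Assumption: placeholder ingress address, replace with your own.
INGRESS_HOST = "localhost:8080"
# Taken from the SeldonDeployment manifest in this post.
NAMESPACE = "seldon-system"
DEPLOYMENT = "iris-model"


def prediction_url(host, namespace, deployment):
    """Build the Seldon Core v1 REST prediction URL behind an ingress."""
    return f"http://{host}/seldon/{namespace}/{deployment}/api/v1.0/predictions"


def build_payload(rows):
    """Wrap feature rows in the Seldon 'ndarray' request format."""
    return {"data": {"ndarray": [[float(value) for value in row] for row in rows]}}


def predict(rows, host=INGRESS_HOST, timeout=5.0):
    """POST the rows to the deployment and return the decoded JSON response."""
    request = urllib.request.Request(
        prediction_url(host, NAMESPACE, DEPLOYMENT),
        data=json.dumps(build_payload(rows)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a reachable ingress, predict([[5.1, 3.5, 1.4, 0.2]]) sends one Iris sample (sepal length, sepal width, petal length, petal width) and returns the decoded JSON response from the model.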

Performance Monitoring

The serving API can be monitored with Prometheus metrics, with ELK-based logging fed through Kafka, and so on.
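In addition to the metrics Seldon Core exposes by default, the Python wrapper lets a model class report its own values through a metrics() method, which are then published alongside the others for Prometheus to scrape. The sketch below is illustrative only: the class name MonitoredModel and the metric keys are hypothetical, and it assumes the wrapped object offers a predict(X) method like the scikit-learn pipeline from Model.py.

```python
import time


class MonitoredModel:
    """Hypothetical wrapper that reports custom metrics around a predictor."""

    def __init__(self, predictor):
        # `predictor` is any object with a predict(X) method, for example
        # the unpickled scikit-learn pipeline from Model.py.
        self._predictor = predictor
        self._last_batch_size = 0
        self._last_latency_ms = 0.0

    def predict(self, X, features_names=None):
        # Time the underlying prediction and remember the batch size.
        start = time.perf_counter()
        output = self._predictor.predict(X)
        self._last_latency_ms = (time.perf_counter() - start) * 1000.0
        self._last_batch_size = len(X)
        return output

    def metrics(self):
        # Each entry has a type (COUNTER, GAUGE, or TIMER), a key, and a value.
        # The keys below are made up for this example.
        return [
            {"type": "COUNTER", "key": "iris_rows_total", "value": self._last_batch_size},
            {"type": "GAUGE", "key": "iris_last_batch_size", "value": self._last_batch_size},
            {"type": "TIMER", "key": "iris_predict_latency_ms", "value": self._last_latency_ms},
        ]
```

Keeping the measurements on the instance means metrics() only reports what the most recent predict() call observed, which is enough for a counter that accumulates on the Prometheus side.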

A/B Testing

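1. Seldon/Iter8 Experiment over Single Seldon Deployment

Several models or configurations are tested within a single Seldon Deployment.

2. Seldon/Iter8 Experiment over Separate Seldon Deployments

Each test version is managed as its own Seldon Deployment, and traffic can be shifted between the individual deployments so their results can be compared.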

https://docs.seldon.io/projects/seldon-core/en/latest/rollouts/abtests.html
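To make the single-deployment variant more concrete, here is a rough sketch that builds a SeldonDeployment with two predictors and a traffic weight on each, which is the kind of split an experiment can then adjust. It does not configure Iter8 itself. The helper ab_test_manifest, the predictor names, and the second image tag sklearn_iris:0.2 are assumptions made for illustration; the other fields mirror the manifest used earlier in this post.

```python
import json


def ab_test_manifest(name, namespace, variants):
    """Build a SeldonDeployment dict with one predictor per variant.

    `variants` maps a predictor name to a tuple of (image, traffic_percent).
    """
    # This sketch insists that the traffic percentages add up to 100.
    total = sum(traffic for _, traffic in variants.values())
    if total != 100:
        raise ValueError(f"traffic weights must sum to 100, got {total}")

    predictors = []
    for predictor_name, (image, traffic) in variants.items():
        predictors.append({
            "name": predictor_name,
            "replicas": 1,
            "traffic": traffic,
            "componentSpecs": [{
                "spec": {"containers": [{"name": "classifier", "image": image}]}
            }],
            "graph": {"name": "classifier", "type": "MODEL"},
        })

    return {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"name": "iris", "predictors": predictors},
    }


manifest = ab_test_manifest(
    "iris-model",
    "seldon-system",
    {
        "baseline": ("sklearn_iris:0.1", 75),
        "candidate": ("sklearn_iris:0.2", 25),  # hypothetical second image
    },
)

# kubectl also accepts JSON, so this output can be saved and applied as a file.
print(json.dumps(manifest, indent=2))
```

Generating the manifest from code keeps the two predictors structurally identical, so the only differences between the variants are the image and the share of traffic each one receives.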

Beyond these, Seldon Core offers many other features tailored to ML serving, so I think it is well worth considering if you are building ML services on Kubernetes.
