LangChain: If you know ChatGPT, without question

[Virtual Life Research Team, 황준선]

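Conversational large language models (LLMs) such as ChatGPT and Bard are being released one after another these days. On its own, however, an LLM can only generate plausible sentences from the data it was trained on. Bard's strength is that it adds the Google search engine, so it can generate answers that draw on recent data. Could we do the same ourselves? If you have tried both ChatGPT and Bard and want to bring this kind of capability into your own application, the open-source project LangChain is worth a close look. This post introduces LangChain's basic building blocks and then shows the execution results of a short code example.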

1. Modules

1.1. Models

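The module that configures the LLM used in LangChain
LLMs and Chat Models are provided separately
- Chat Models use LLMs internally; only the interface differs
- They expose a message-based API rather than a text-in, text-out API
Streaming responses are supported
HuggingFace Hub and local pipelines (your own models) can be used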
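To make the difference between the two interfaces concrete, here is a minimal sketch in plain Python. It does not use LangChain itself: `Message`, `fake_llm`, and `ChatModelSketch` are hypothetical names invented for this illustration, and `fake_llm` is only a stand-in for a real text-completion model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str      # e.g. "system", "human", or "ai"
    content: str


def fake_llm(prompt: str) -> str:
    """Stand-in for a text-completion model: text in, text out."""
    return f"(completion for {len(prompt)} chars of prompt)"


class ChatModelSketch:
    """Message-based wrapper around a text-in/text-out model (illustrative only)."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm

    def __call__(self, messages: List[Message]) -> Message:
        # Flatten the message list into one prompt string for the wrapped model.
        prompt = "\n".join(f"{m.role}: {m.content}" for m in messages)
        prompt += "\nai:"
        # Wrap the raw text completion back into a message.
        return Message(role="ai", content=self.llm(prompt))


chat = ChatModelSketch(fake_llm)
reply = chat([
    Message("system", "You are a helpful assistant."),
    Message("human", "What is LangChain?"),
])
print(reply.role, "->", reply.content)
```

The caller of `ChatModelSketch` only ever sees messages, while the wrapped model only ever sees a string, which is the sense in which the interface differs but the underlying model does not.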

1.2. Prompts

Configures the input that is passed to the model
Provides a feature called PromptTemplate

Chat Prompt Templates are provided
Output Parsers are provided
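The idea behind a prompt template and an output parser can be sketched without LangChain. In the example below, `PromptTemplateSketch` and `parse_comma_separated` are hypothetical names that only mimic the concept, and `raw_output` is a hard-coded stand-in for a model reply.

```python
from typing import List


class PromptTemplateSketch:
    """Fills named variables into a fixed prompt string (illustrative only)."""

    def __init__(self, template: str, input_variables: List[str]) -> None:
        self.template = template
        self.input_variables = input_variables

    def format(self, **kwargs: str) -> str:
        # Fail early if a declared variable was not supplied.
        missing = [v for v in self.input_variables if v not in kwargs]
        if missing:
            raise KeyError(f"missing prompt variables: {missing}")
        return self.template.format(**kwargs)


def parse_comma_separated(text: str) -> List[str]:
    """Turn free-form model text into a clean Python list."""
    return [item.strip() for item in text.split(",") if item.strip()]


template = PromptTemplateSketch(
    template="List three {kind} related to {topic}, separated by commas.",
    input_variables=["kind", "topic"],
)
prompt = template.format(kind="colors", topic="traffic lights")
print(prompt)

raw_output = "red, yellow , green"  # stand-in for what a model might return
print(parse_comma_separated(raw_output))
```

The template controls what goes into the model, and the parser controls what comes out, so the surrounding application code can work with structured values instead of raw strings.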

1.3. Indexes

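Indexes are ways of structuring data so that an LLM can interact with it well
Utility functions for working with text data are provided (Document Loaders, Text Splitters, Vectorstores, Retrievers)
Can be used for data management, database integration, searching an internal database, and similar tasks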
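As a rough illustration of the split-then-retrieve flow, the sketch below splits a text into overlapping word chunks and ranks them against a query. It is plain Python, not LangChain: `split_text` and `retrieve` are hypothetical helpers, and simple word overlap is swapped in for the embedding similarity that a real vector store would use.

```python
from typing import List


def split_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into chunks of `chunk_size` words sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Rank chunks by how many distinct query words they contain."""
    query_words = set(query.lower().split())

    def score(chunk: str) -> int:
        return len(query_words & set(chunk.lower().split()))

    return sorted(chunks, key=score, reverse=True)[:top_k]


document = (
    "document loaders read files into text "
    "text splitters cut long text into chunks "
    "vector stores index chunks for similarity search"
)
chunks = split_text(document, chunk_size=8, overlap=2)
best = retrieve("similarity search index", chunks)
print(len(chunks), "chunks; best match:", best[0])
```

The overlap keeps a sentence that straddles a chunk boundary from being lost to both neighbors, which is the usual reason text splitters expose such a parameter.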

1.4. Memory

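By default, every LangChain module is independent of the others; each module processes its input and does not store it
Memory is the module built to store previous interactions and provide short-term and long-term memory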
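A windowed short-term memory can be sketched with a bounded queue. The class below is a hypothetical stand-in written in plain Python, not LangChain's implementation; it only assumes that "window" means keeping the last `k` exchanges and dropping older ones.

```python
from collections import deque
from typing import Deque, Tuple


class WindowMemorySketch:
    """Keeps only the most recent k (human, ai) exchanges (illustrative only)."""

    def __init__(self, k: int) -> None:
        # deque(maxlen=k) silently discards the oldest entry when full.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, user_input: str, ai_output: str) -> None:
        self.turns.append((user_input, ai_output))

    def load(self) -> str:
        # Render the remembered turns as text that can be prepended to a prompt.
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


memory = WindowMemorySketch(k=2)
memory.save("Hi", "Hello!")
memory.save("My name is Kim", "Nice to meet you, Kim.")
memory.save("What is 2 + 2?", "4")
print(memory.load())
```

With `k=2`, the first exchange has already been dropped by the time the third one is saved, so the prompt stays bounded in size however long the conversation runs.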

1.5. Chains

A module for linking the other modules together
A chain is executed with the run() function
A chain can be used by calling run() repeatedly
For example, a Chain can connect an LLM with Memory
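The original listing for this section is not reproduced here, so the following is only a minimal sketch of what connecting a model and a memory through a chain with a `run()` method could look like. It is plain Python rather than LangChain code: `ConversationChainSketch` and `counting_llm` are hypothetical names, and `counting_llm` is a stand-in model that simply reports how much history it was shown.

```python
from typing import Callable, List


class ConversationChainSketch:
    """Links a model and a history buffer behind one run() call (illustrative only)."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.history: List[str] = []

    def run(self, user_input: str) -> str:
        # 1) Build the prompt from the stored history plus the new input.
        prompt = "\n".join(self.history + [f"Human: {user_input}", "AI:"])
        # 2) Call the model.
        answer = self.llm(prompt)
        # 3) Store the exchange so the next run() call can see it.
        self.history += [f"Human: {user_input}", f"AI: {answer}"]
        return answer


def counting_llm(prompt: str) -> str:
    """Stand-in model: reports how many human turns appear in the prompt."""
    return f"I can see {prompt.count('Human:')} human turn(s)."


chain = ConversationChainSketch(counting_llm)
first = chain.run("Hello")
second = chain.run("Do you remember me?")
print(first)
print(second)
```

The second call sees two human turns because the chain, not the caller, is responsible for carrying the history from one `run()` call to the next.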

1.6. Agents

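A module that, instead of calling a predetermined sequence of Chains, lets the required Chains be accessed depending on the user's input
There are two kinds: Action Agents and Plan-and-Execute Agents
Agents decide which Tools to call based on the user's input
- Tools: the actions an Agent can take
- Toolkits: predefined bundles of Tools grouped by use case
- Agent Executor: wraps the Agent and its Tools, and runs the loop that keeps executing the Agent until it stops
An Agent is basically composed of a PromptTemplate, a Language Model, and an Output Parser.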
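The relationship between tools, the agent's decision step, and the executor loop can be sketched as follows. Everything here is hypothetical and written in plain Python: the two tools are toy functions, and `rule_based_agent` uses fixed rules where a real agent would ask a language model to choose the next action.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An action is (tool_name, tool_input); None means "finished".
Action = Optional[Tuple[str, str]]


def calculator(expression: str) -> str:
    """Toy tool: adds whitespace-separated integers."""
    return str(sum(int(token) for token in expression.split()))


def word_count(text: str) -> str:
    """Toy tool: counts whitespace-separated words."""
    return str(len(text.split()))


TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "word_count": word_count,
}


def rule_based_agent(user_input: str, observations: List[str]) -> Action:
    """Stand-in for the model's decision step: pick a tool or finish."""
    if observations:
        return None  # in this sketch one tool call is always enough
    if user_input.startswith("add "):
        return ("calculator", user_input[len("add "):])
    return ("word_count", user_input)


def agent_executor(user_input: str, max_steps: int = 3) -> str:
    """Runs the agent in a loop until it finishes or max_steps is reached."""
    observations: List[str] = []
    for _ in range(max_steps):
        action = rule_based_agent(user_input, observations)
        if action is None:
            break
        tool_name, tool_input = action
        observations.append(TOOLS[tool_name](tool_input))
    return observations[-1] if observations else ""


print(agent_executor("add 1 2 3"))
print(agent_executor("how many words are here"))
```

The `max_steps` cap is a design choice of this sketch: it keeps the loop from running forever if the decision step never returns "finished".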

2. Example Code

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os

from langchain.agents import AgentType, initialize_agent
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferWindowMemory
from langchain.tools import Tool
from langchain.utilities import GoogleSearchAPIWrapper

# Fill in your own API keys before running.
os.environ["OPENAI_API_KEY"] = ""
os.environ["GOOGLE_API_KEY"] = ""
os.environ["GOOGLE_CSE_ID"] = ""

llm = OpenAI(temperature=0)

search = GoogleSearchAPIWrapper()
tools = [
    Tool(name="Google Search",
         func=search.run,
         description="Search Google for recent results.")
]

memory = ConversationBufferWindowMemory(memory_key="chat_history",
                                        k=5,
                                        return_messages=True)

agent_chain = initialize_agent(
    tools,
    llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
    memory=memory)


# Simple prompt loop: each instruction is sent to the agent (Ctrl+C to exit).
while True:
    agent_chain.run(input('Instruction: '))

3. Execution Results

4. Todo

Study the details and uses of every module in more depth
Give the conversational Agent a specific personality
Look for ways to make inference faster
