{"id":64257,"date":"2023-11-07T17:42:26","date_gmt":"2023-11-07T08:42:26","guid":{"rendered":"https:\/\/smilegate.ai\/?p=64257"},"modified":"2023-11-07T17:42:28","modified_gmt":"2023-11-07T08:42:28","slug":"vector-database-%eb%b2%a1%ed%84%b0-%ec%9e%84%eb%b2%a0%eb%94%a9%ec%9d%84-%ec%a0%80%ec%9e%a5%ed%95%98%ea%b3%a0-%ea%b2%80%ec%83%89%ed%95%98%eb%8a%94-%ea%b0%80%ec%9e%a5-%ed%9a%a8%ec%9c%a8%ec%a0%81","status":"publish","type":"post","link":"https:\/\/smilegate.ai\/cn\/2023\/11\/07\/vector-database-%eb%b2%a1%ed%84%b0-%ec%9e%84%eb%b2%a0%eb%94%a9%ec%9d%84-%ec%a0%80%ec%9e%a5%ed%95%98%ea%b3%a0-%ea%b2%80%ec%83%89%ed%95%98%eb%8a%94-%ea%b0%80%ec%9e%a5-%ed%9a%a8%ec%9c%a8%ec%a0%81\/","title":{"rendered":"Vector Database: The Most Efficient Way to Store and Search Vector Embeddings"},"content":{"rendered":"

[Advanced AI Technology Team, Kim Yun-hye]<\/p>\n\n\n\n

The hottest topic to sweep the IT field in 2023 was, without question, ChatGPT. ChatGPT is a conversational chatbot built on a large language model that anyone can use easily, and it set off a worldwide wave of interest in generative AI. <\/p>\n\n\n\n

However, ChatGPT has a critical problem: “hallucination,” in which it presents incorrect information as though it were the right answer. To compensate for this, the “Vector Database,” which can serve as a brain for ChatGPT, has risen to prominence alongside it.<\/p>\n\n\n\n

So what is a vector database, and how can it compensate for ChatGPT's limitations?<\/p>\n\n\n\n

<\/p>\n\n\n\n


\n\n\n\n

Vector Database<\/h1>\n\n\n\n

A vector database is a database that efficiently stores vector embeddings and supports semantic search over them.<\/p>\n\n\n\n

A conventional database can also store embeddings and run semantic search, but it has to compute the similarity between the query embedding and the embedding of every stored record. That means a large amount of computation per query, so results come back slowly.<\/p>\n\n\n\n
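The full scan described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements it: the function names, the record keys, and the 3-dimensional embeddings are all hypothetical. The point is that every query touches every stored vector, so the cost grows linearly with the number of records.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between a and b; values near 1.0 mean the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, stored, top_k=2):
    # Scores EVERY stored embedding against the query: O(N * d) work per query.
    scored = [(cosine_similarity(query, vec), key) for key, vec in stored.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:top_k]]

# Hypothetical 3-dimensional embeddings, made up for illustration.
stored = {
    'doc_cat': [0.9, 0.1, 0.0],
    'doc_dog': [0.8, 0.3, 0.1],
    'doc_car': [0.0, 0.2, 0.9],
}
print(brute_force_search([1.0, 0.2, 0.0], stored))  # ['doc_cat', 'doc_dog']
```

With three records the scan is instant; with hundreds of millions of high-dimensional embeddings, the same loop becomes the bottleneck that an index is meant to avoid.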

A vector database, by contrast, indexes the vectors with its own algorithms and combines various ANN (approximate nearest neighbor) search algorithms, so it can search much faster and more effectively.<\/p>\n\n\n\n

<\/p>\n\n\n\n

Pipeline<\/h2>\n\n\n
\n
\"\"
vector database pipeline<\/figcaption><\/figure><\/div>\n\n\n
    \n
  • Indexing: Vectors are indexed with an algorithm such as HNSW, LSH, or IVF, which maps each vector into a data structure that supports fast search.<\/li>\n\n\n\n
  • Querying: The similarity between the query vector and the indexed vectors is computed using the similarity metric of that index.<\/li>\n\n\n\n
  • Post Processing: The search results are post-processed, for example by re-ranking them with a different similarity metric.<\/li>\n<\/ul>\n\n\n\n
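The three stages above can be sketched end to end with a toy LSH index. This is an illustrative sketch under stated assumptions, not the implementation of any specific product: it uses random-hyperplane hashing as the LSH scheme, cosine similarity as the query metric, and Euclidean distance as the re-ranking metric, and every function name and parameter here is hypothetical.

```python
import math
import random

def make_hyperplanes(dim, n_planes, seed=0):
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

def lsh_key(vec, planes):
    # One bit per hyperplane: which side of the plane the vector falls on.
    return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in planes)

def build_index(vectors, planes):
    # Stage 1 (indexing): map each vector id into a hash bucket.
    buckets = {}
    for vid, vec in vectors.items():
        buckets.setdefault(lsh_key(vec, planes), []).append(vid)
    return buckets

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def query_index(query, vectors, buckets, planes, top_k=3):
    # Stage 2 (querying): score only the vectors that share the query's bucket.
    candidates = buckets.get(lsh_key(query, planes), [])
    scored = sorted(candidates, key=lambda vid: cosine(query, vectors[vid]), reverse=True)
    return scored[:top_k]

def rerank(query, vectors, candidate_ids):
    # Stage 3 (post-processing): re-rank the candidates with a different metric.
    return sorted(candidate_ids, key=lambda vid: math.dist(query, vectors[vid]))

# Hypothetical data: 200 random 8-dimensional vectors.
rng = random.Random(42)
vectors = {i: [rng.uniform(-1, 1) for _ in range(8)] for i in range(200)}
planes = make_hyperplanes(dim=8, n_planes=4)
buckets = build_index(vectors, planes)

query = vectors[17]  # query with a stored vector so the best match is known
hits = rerank(query, vectors, query_index(query, vectors, buckets, planes))
print(hits[0])  # 17
```

Only the vectors in one bucket are scored, which is where the speedup comes from. The trade-off is that a true neighbor that hashes into a different bucket is missed, which is why the search is called approximate.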

    <\/p>\n\n\n\n

    Indexing Algorithm: HNSW<\/h2>\n\n\n\n

    Let's look at HNSW, one of the representative ANN algorithms that vector databases use to build an index. HNSW (Hierarchical Navigable Small World) builds a hierarchical, multi-layer graph. Each node is a vector, and edges connect nodes that are highly similar to each other; the bottom layer contains every vector, while each higher layer keeps only a sparser subset that acts as long-range shortcuts. (Grouping vectors into clusters with an algorithm such as K-means is the approach taken by IVF-style indexes rather than by HNSW.) A search starts at an entry point in the top layer, repeatedly moves to the neighbor closest to the query vector, and descends layer by layer until it reaches the nodes nearest to the query in the bottom layer.<\/p>\n\n\n

    \n
    \"\"
    HNSW<\/figcaption><\/figure><\/div>\n\n\n
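The layered search idea behind HNSW can be sketched as follows. This is a deliberately simplified illustration, not the published algorithm: neighbors are chosen by brute force instead of incremental insertion, the search keeps a single current node instead of a candidate list, and all names and parameters (`build_layers`, `greedy_search`, `m`, the 0.7 level multiplier) are assumptions made for this sketch.

```python
import math
import random

def build_layers(points, n_layers=3, m=4, seed=0):
    # Give each point a top layer; higher layers receive exponentially fewer points.
    rng = random.Random(seed)
    levels = [min(n_layers - 1, int(-math.log(1.0 - rng.random()) * 0.7)) for _ in points]
    levels[0] = n_layers - 1  # make point 0 the entry point on the top layer
    layers = []
    for layer in range(n_layers):
        members = [i for i, lvl in enumerate(levels) if lvl >= layer]
        graph = {}
        for i in members:
            # Brute-force neighbor selection keeps the sketch short;
            # real HNSW inserts points one by one using its own search.
            others = sorted((j for j in members if j != i),
                            key=lambda j: math.dist(points[i], points[j]))
            graph[i] = others[:m]
        layers.append(graph)
    return layers

def greedy_search(points, layers, query, entry=0):
    current = entry
    for graph in reversed(layers):  # top (sparse) layer first, bottom layer last
        improved = True
        while improved:
            improved = False
            for neighbor in graph[current]:
                # Move whenever a neighbor is strictly closer to the query.
                if math.dist(query, points[neighbor]) < math.dist(query, points[current]):
                    current = neighbor
                    improved = True
    return current

# Hypothetical data: 300 random points in the unit square.
rng = random.Random(1)
points = [[rng.random(), rng.random()] for _ in range(300)]
layers = build_layers(points)
query = [0.5, 0.5]
found = greedy_search(points, layers, query)
print(found, math.dist(query, points[found]))
```

The sparse upper layers let the search cross the space in a few long hops before the dense bottom layer refines the result. Because the walk is greedy, it can stop at a local optimum, which is the approximate part of approximate nearest neighbor search.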

    Similarity Metrics<\/h2>\n\n\n\n

    The following are the metrics most commonly used in a vector database to compute the similarity between vectors.<\/p>\n\n\n\n

      <\/ul>\n\n\n\n
        \n
      • Euclidean distance<\/li>\n\n\n\n
      • Dot Product<\/li>\n\n\n\n
      • Cosine similarity<\/li>\n<\/ul>\n\n\n\n
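For reference, the three metrics listed above can be written in plain Python as below. This is a minimal sketch with hypothetical function names and example vectors; production systems typically use vectorized implementations instead.

```python
import math

def euclidean_distance(a, b):
    # Straight-line distance; smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    # Grows with both alignment and vector length; larger means more similar.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product of the length-normalized vectors; depends only on direction.
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))

a, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(euclidean_distance(a, b))  # about 3.742 (the square root of 14)
print(dot_product(a, b))         # 28.0
print(cosine_similarity(a, b))   # about 1.0, because b points the same way as a
```

In this example b is a scaled copy of a, so cosine similarity treats the two as identical while Euclidean distance does not. Which behavior is wanted depends on whether vector length carries meaning for the embeddings in use.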

        <\/p>\n\n\n\n


        \n\n\n\n

        Vector Database + LLM<\/h1>\n\n\n\n

        Problems with LLMs<\/h2>\n\n\n\n

        1. Hallucination<\/h3>\n\n\n\n

        LLMs suffer from hallucination: they state false information as if it were fact. The following is an example of a ChatGPT hallucination. The turtle ship (Geobukseon) had no weapon called a “Lightning Bolt,” yet ChatGPT goes on to describe one.<\/p>\n\n\n\n

        \"\"<\/figure>\n\n\n\n

        <\/p>\n\n\n\n

        2. Long-Term Memory<\/h3>\n\n\n\n

        LLMs also have a long-term memory problem: they cannot remember what was said much earlier in a conversation. The following is an example with ChatGPT. The user said earlier that they were planning to go to Gyeongju, but when asked about it later, ChatGPT does not remember.<\/p>\n\n\n

        \n
        \"\"<\/figure><\/div>\n\n\n
        \"\"<\/figure>\n\n\n\n

        <\/p>\n\n\n\n

        RAG(Retrieval Augmented Generation)<\/h2>\n\n\n\n

        So how can a vector database compensate for these LLM problems? RAG (Retrieval Augmented Generation) refers to a method that retrieves information relevant to the query from a vector database and then generates an answer based on it<\/strong>. Originally, RAG meant training a retrieval model and a generation model together, but since the arrival of LLMs the term has come to be used more broadly for this retrieve-then-generate approach.<\/p>\n\n\n

        \n
        \"\"

        RAG architecture<\/figcaption><\/figure><\/div>\n\n\n

        As the structure above shows, the vector database is designated as the knowledge scope of the LLM, so the model generates its answer from the retrieved information. This helps prevent LLM hallucination<\/strong> and lets the answer reflect even old conversation history<\/strong>.<\/p>\n\n\n\n
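The retrieve-then-generate flow can be sketched as below. This is a minimal sketch under loud assumptions: `embed` is a toy bag-of-words stand-in for a learned embedding model, the `store` list stands in for a vector database, the stored sentences are invented for illustration, and the final LLM call is left out, so the code stops at building the prompt that would be sent to the model.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words 'embedding' (an assumption); real systems use a learned model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, store, top_k=1):
    # Retrieval step: rank stored texts by similarity to the question.
    q = embed(question)
    ranked = sorted(store, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, store):
    # Augmentation step: put the retrieved text in front of the question.
    context = ' '.join(retrieve(question, store))
    return 'Answer using only this context: ' + context + ' Question: ' + question

# Hypothetical stored memories, echoing the two examples in this post.
store = [
    'the user plans to travel to gyeongju next month',
    'the turtle ship was armed with cannons',
]
prompt = build_prompt('where does the user plan to travel', store)
print(prompt)
```

Because the earlier statement about Gyeongju is retrieved and placed in the prompt, the model can answer from it even though that statement is no longer in the live conversation window.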

        <\/p>\n\n\n\n

        Closing Remarks<\/h3>\n\n\n\n

        Vector databases, which are well suited to context-aware semantic search, have until now been used mainly in recommendation systems, QA, and similar applications. In 2023, however, as LLMs became a hot topic, vector databases began drawing attention as a way to compensate for their weaknesses. Today they play a key role in overcoming the limitations of large generative language models and in providing strong semantic search. While developing a chatbot, I came to realize that a technology trend only becomes complete when several technologies are combined. I look forward to seeing which technical advances we will encounter and respond to next.<\/p>\n\n\n\n

        Thank you.<\/p>\n

        <\/span><\/div>","protected":false},"excerpt":{"rendered":"

        [Advanced AI Technology Team, Kim Yun-hye] The hottest topic to sweep the IT field in 2023 was, without question, ChatGPT. ChatGPT is a conversational chatbot built on a large language model that anyone can use easily, and it set off a worldwide wave of interest in generative AI. However, ChatGPT has a critical problem: “hallucination,” in which it presents incorrect information as though it were the right answer. To compensate for this, a technology that can serve as a brain for ChatGPT, the “Vector…<\/p>\n

        <\/span><\/div>","protected":false},"author":1,"featured_media":64275,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_lock_modified_date":false,"footnotes":""},"categories":[532,19,197],"tags":[],"class_list":["post-64257","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-nlp","category-tech04","category-data","category-532","category-19","category-197","description-off"],"_links":{"self":[{"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/posts\/64257","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/comments?post=64257"}],"version-history":[{"count":11,"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/posts\/64257\/revisions"}],"predecessor-version":[{"id":64288,"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/posts\/64257\/revisions\/64288"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/media\/64275"}],"wp:attachment":[{"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/media?parent=64257"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/categories?post=64257"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/smilegate.ai\/cn\/wp-json\/wp\/v2\/tags?post=64257"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}