Data

Dec 19, 2023

Ceph: An Open-Source Distributed Storage System

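[Analytics AI Service Team, Hyunjung Lee] As the AI models and data we need to manage grow in number and size, I became interested in Ceph, and this post gives a brief introduction to it. What is Ceph? Ceph implements object storage on a single distributed computer cluster and is an open…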

Nov 7, 2023

Vector Database: The Most Efficient Way to Store and Search Vector Embeddings

NLP, Trend, Data

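[Advanced AI Technology Team, Yoonhye Kim] The hottest issue to sweep the IT field in 2023 was, without question, ChatGPT. ChatGPT is a conversational large-language AI chatbot that anyone can use easily, with a major impact on global society and a wave of interest in generative AI…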

May 25, 2023

The Need for Domain-Specific Language Models

NLP, Trend, Data

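[Virtual Life Research Team, Associate Seungmu Yang] The era of ChatGPT is arriving. Its excellence and practicality have been recognized across many industries and fields, including the AI industry, and many companies are moving to adopt it. This trend, not only among major companies such as OpenAI…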

Sep 19, 2022

Feature Store: A Fully Managed Service for ML Features

Trend, Data

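[Analytics Intelligence Development Team, Changdae Lim] What is a Feature? Machine learning (ML) makes predictions on new data using a model trained on past example data. When two-dimensional tabular data is used to train an ML model, each row is an example, while the columns, for that example…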

Jun 24, 2022

SmileStyle: A Korean Conversational Style Transfer Dataset

NLP, Trend, Data

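[Generative Intelligence Development Team, Seonghyeon Kim] Our center's AI research motto is ‘Human-like AI’ & ‘Fun AI’. So how can we build an AI that goes beyond a chatbot that merely reports the weather or the news and is instead friendly and ‘human-like’? Such elements, we…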

Jan 28, 2022

AI Fairness: Toward Unbiased AI

Trend, Data

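[Service Development Team, Yongtaek Lim] In June 2015, a Black programmer in Brooklyn, USA, got a shock while trying to look at photos taken with a girlfriend: Google Photos had auto-tagged the pictures of the two of them as “gorillas”. Google…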

Jul 24, 2021

Blender 2.0 overcomes the limitations of Open Domain chatbots

Interaction, Trend, Data

[Prior Research Team, Jihyun Song] It has been over two years since I became interested in open-domain chatbots and came across the papers on Blender 1.0 and Meena. At that time, they named consistent long-term conversation as a limitation they would overcome in the future, and about knowledge…

Jun 25, 2021

Handling Imbalanced Datasets

Trend, Code, Data

[Service Development Team, Hwang Jun-sun] In supervised machine learning, when the training dataset has an unbalanced number of samples across labels, you will find that samples belonging to the minority labels are not learned well. Simply…

Jun 23, 2021

Learning Loss for Active Learning

Visual, Data

[Service Development Team, Kyunghwan Lee] When training a model, we usually start from a pool of unlabeled data and often run into data annotation problems. Labeling all of the unlabeled data is too time-consuming and expensive...