Picture source: https://www.datagrom.com/data-science-machine-learning-ai-blog/snowflake-vs-databricks
[Service Development Team Jeon Jeon Jeon]
The digital transformation of companies, accelerated by COVID-19, continues to increase the value of data. The need for change, felt across many industries as well as at specialized IT companies, is driving up the market value of data specialist companies. Although the domestic data industry has been growing in recent years, few companies are actively operating in it yet.
According to the Korea Data Industry Promotion Agency (Summary of the main results of the 2020 Data Industry Status Survey https://www.kdata.or.kr/info/info_01_download.html?dbnum=462), the total size of the domestic data industry in 2020 is estimated at around 19 trillion won. In contrast, the market capitalization of Snowflake, a data platform company listed in the US last year, currently exceeds 60 trillion won. Even allowing for the large difference in overall market size, it is fair to say that data specialist companies are not yet highly valued in Korea.
In Korea, the data industry is taking shape mainly through government projects, but the digital transformation of companies will strongly influence the domestic data market as well. To gauge the outlook for the data market, I would like to look at Snowflake and Databricks, two US data specialist companies whose value has risen rapidly in recent years.
Big 3 Cloud
According to a report published by the market research firm Canalys, the market shares of the Big 3 clouds (AWS, MS Azure, GCP) are 32%, 19%, and 7%, respectively. Together these three companies account for roughly 60% of the total market. Snowflake and Databricks are often described as rivals to the Big 3 cloud companies, but their business model differs from that of companies specializing in cloud infrastructure. They focus on managing, analyzing, and visualizing data stored in the cloud.
Snowflake
Snowflake's platform is delivered as a cloud service in partnership with the cloud providers. It runs on the Big 3 clouds rather than on its own cloud infrastructure, which is seen as both an advantage and a disadvantage. Each of the Big 3 clouds has its own analysis platform, and moving between them is difficult; Snowflake's advantage is that it sits in the middle and is compatible with all of them.
As shown in the platform structure above, Snowflake provides six basic services: Data Engineering, Data Lake, Data Warehouse, Data Science, Data Applications, and Data Exchange. It covers essentially every data service other than raw data storage, and because it started as a data warehouse it is particularly strong in that area. Offering a data marketplace is a further advantage.
Databricks
Databricks was founded by key developers of Apache Spark, the open-source big data processing engine, including Matei Zaharia. The company provides open-source Apache Spark as a service.
Databricks provides data platform services similar to Snowflake's. Snowflake is generally described as a data warehouse service. A data warehouse stores data that has already been processed into a structured data model for reporting. Databricks, on the other hand, is described as a "lakehouse" service, a concept that extends the data lake, which stores both structured and unstructured data in a form ready to be used for analysis. Databricks provides services that process and analyze data on top of such a lakehouse.
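The difference between the two storage models can be illustrated with a small sketch. This is a conceptual illustration only: the `Warehouse` and `Lake` classes, the `SALES_SCHEMA` dictionary, and the method names are hypothetical and do not correspond to any Snowflake or Databricks API. The warehouse rejects anything that does not fit its schema at load time, while the lake stores everything and applies structure only when reading.

```python
# Conceptual sketch only: hypothetical classes, not any vendor's API.
SALES_SCHEMA = {"order_id": int, "amount": float}


class Warehouse:
    """Accepts only rows that already match a fixed schema (schema-on-write)."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def load(self, row):
        # Reject rows whose columns differ from the schema.
        if set(row) != set(self.schema):
            raise ValueError("row does not match the warehouse schema")
        # Reject rows whose values have the wrong type.
        for column, expected_type in self.schema.items():
            if not isinstance(row[column], expected_type):
                raise ValueError(f"{column} must be {expected_type.__name__}")
        self.rows.append(row)


class Lake:
    """Stores any record as-is; structure is applied when reading (schema-on-read)."""

    def __init__(self):
        self.records = []

    def store(self, record):
        self.records.append(record)

    def read_as(self, schema):
        # Keep only dict records that can be viewed through the schema,
        # and project them onto the schema's columns.
        return [
            {column: record[column] for column in schema}
            for record in self.records
            if isinstance(record, dict)
            and all(isinstance(record.get(c), t) for c, t in schema.items())
        ]


warehouse = Warehouse(SALES_SCHEMA)
warehouse.load({"order_id": 1, "amount": 9.5})

lake = Lake()
lake.store({"order_id": 2, "amount": 3.0, "note": "extra field"})
lake.store("free-form log line")  # unstructured data is accepted too
structured_view = lake.read_as(SALES_SCHEMA)
```

In this sketch, `read_as` hints at the lakehouse idea: a structured, warehouse-like view layered over storage that still accepts unstructured records.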
Snowflake vs Databricks
Item | Snowflake | Databricks |
Founded | 2012 | 2013 |
Founders | Former Oracle developers | Spark developers |
Enterprise value / sales | About 61 trillion won (14 trillion won at the time of listing) / about 660 billion won | About 31 trillion won / about 480 billion won (unlisted) |
Customers | About 4,000 | About 5,000 |
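To put the table's valuation figures in perspective, the enterprise value can be divided by sales. The following is a minimal sketch that uses only the approximate figures from the table above and assumes both are in Korean won; the variable and function names are illustrative.

```python
# Approximate figures copied from the comparison table (assumed to be in won).
TRILLION = 10**12
BILLION = 10**9

companies = {
    "Snowflake": {"enterprise_value": 61 * TRILLION, "sales": 660 * BILLION},
    "Databricks": {"enterprise_value": 31 * TRILLION, "sales": 480 * BILLION},
}


def value_to_sales(enterprise_value, sales):
    """Return how many times its sales a company is valued at."""
    return enterprise_value / sales


multiples = {
    name: value_to_sales(**figures) for name, figures in companies.items()
}

for name, multiple in multiples.items():
    print(f"{name}: about {multiple:.0f}x sales")
```

With these inputs, Snowflake comes out at roughly 92 times sales and Databricks at roughly 65 times sales, which shows how much of each valuation rests on expected growth rather than current revenue.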
The two companies are partners rather than direct competitors. They are similar in that both run on the Big 3 cloud platforms and serve customers with big data workloads, but Databricks focuses on data storage and processing, while Snowflake targets services that help process and analyze somewhat more refined data. At the 2019 Spark AI Summit, there was even a session on how to use Databricks and Snowflake together.
Startups and other small companies with outstanding technology built on new AI techniques still have to consider a large-scale service architecture when doing business. Large-scale data storage, stable pipelines, scaling, and the like lie outside their own technical field and are services that only large companies can provide. That is why the value of cloud data platform companies is expected to rise further.
However, many of the services offered by startups such as Snowflake and Databricks are also provided by the Big 3 cloud companies, so it remains to be seen how they will differentiate themselves through usability, specialized functions, and similar points. Since many companies in Korea have similar business models, I think these two are companies worth watching.