Snowflake, the data cloud company, has launched Document AI, a new interface based on large language models (LLMs) that lets enterprises extract insights from their documents. Originally focused on structured data, Snowflake is now expanding its capabilities to mobilize unstructured information that is typically scattered across different platforms and locations. With Document AI, customers can apply artificial intelligence (AI) to that information and put more of their data to use.
How Does Document AI Work?
Snowflake acquired Poland-based Applica, an AI platform for document understanding, in September 2022. The technology from this acquisition now powers Document AI. With this interface, enterprise users can simply express their requirements in natural language. The system automatically processes the query and extracts the necessary content and analytical insights from the requested document, whether it is an invoice, contract, or any other type of document. Christian Kleinerman, Snowflake’s SVP of product, explained that customers will experience a seamless end-to-end solution where they can store their documents in Snowflake and ask structured questions about the content. The system then converts the unstructured files into structured data, enabling traditional analytics, business intelligence, and downstream machine learning (ML) processes.
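To make the idea of turning questions about a document into structured data more concrete, here is a minimal Python sketch. It is not Snowflake's API or Applica's model: the field names, the questions, the sample invoice text, and the `toy_answer` function (a regex stand-in for a document-understanding model) are all assumptions made up for illustration.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical field names mapped to natural-language questions a user might ask.
QUESTIONS: Dict[str, str] = {
    "invoice_number": "What is the invoice number?",
    "total_amount": "What is the total amount due?",
}


def toy_answer(document_text: str, question: str) -> Optional[str]:
    """Stand-in for a document-understanding model.

    A real system would use an LLM; this toy version only does regex lookups
    for the two questions defined above.
    """
    patterns = {
        "What is the invoice number?": r"Invoice\s+No\.?\s*:\s*(\S+)",
        "What is the total amount due?": r"Total\s+Due\s*:\s*\$?([\d,]+\.\d{2})",
    }
    pattern = patterns.get(question)
    if pattern is None:
        return None
    match = re.search(pattern, document_text)
    return match.group(1) if match else None


def extract_row(
    document_text: str,
    questions: Dict[str, str],
    answer_fn: Callable[[str, str], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Ask every question about one document and collect the answers as one row."""
    return {field: answer_fn(document_text, q) for field, q in questions.items()}


# Made-up sample document.
sample = "Invoice No: INV-1042\nCustomer: Acme\nTotal Due: $1,250.00\n"
row = extract_row(sample, QUESTIONS, toy_answer)
print(row)
```

Each document yields one row with one column per question, which is the shape that traditional analytics and BI tools expect. Swapping `toy_answer` for a call to a real model would leave `extract_row` unchanged.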
The underlying technology driving Document AI is Applica’s multimodal LLM, designed specifically for processing language queries. Snowflake is actively working to expand the capabilities of this system to cover various types of unstructured data, such as images, text files, and videos. This move aligns with IDC’s projection that over the next five years, more than 90% of the world’s data will be unstructured.
In addition to Document AI, Snowflake introduced updates for Iceberg tables and ML-powered SQL functions. The Iceberg tables update allows enterprises to merge native and external tables into a unified format, extending the value of the data cloud to Iceberg data. The ML-powered SQL functions let non-technical data users apply machine learning to use cases such as forecasting and anomaly detection.
Snowflake also unveiled the Snowflake Performance Index, a comprehensive metric that quantifies query performance for enterprises. This index provides valuable insights into query optimization and efficiency. Additionally, two new cost optimization tools, Budgets and Warehouse Utilization, were introduced. Budgets enable enterprises to set spending thresholds on compute resources and receive alerts when limits are about to be exceeded. Warehouse Utilization offers visibility into the utilization of compute clusters, allowing enterprises to downscale when possible and optimize costs.
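The alerting behavior described for Budgets can be sketched in a few lines of Python. This is an illustrative assumption, not Snowflake's implementation: the `Budget` class, the `check_budget` function, the 90% alert ratio, and the linear spend projection are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    name: str
    monthly_limit: float      # spending limit for the month (hypothetical unit: credits)
    alert_ratio: float = 0.9  # assumed: warn once 90% of the limit is spent


def check_budget(budget: Budget, spent: float, day: int, days_in_month: int) -> str:
    """Return 'exceeded', 'warning', or 'ok' for the current spend.

    A warning fires when spend crosses the alert ratio, or when a simple
    linear projection of spend to month end would exceed the limit.
    """
    if spent >= budget.monthly_limit:
        return "exceeded"
    projected = spent / day * days_in_month
    if spent >= budget.alert_ratio * budget.monthly_limit or projected > budget.monthly_limit:
        return "warning"
    return "ok"


analytics = Budget(name="analytics", monthly_limit=1000.0)
print(check_budget(analytics, spent=500.0, day=10, days_in_month=30))
```

Projecting spend forward is one way to alert before a limit is reached rather than after, at the cost of false alarms when usage is front-loaded in the month.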
Snowflake Summit, where these announcements were made, is taking place from June 26 to 29 in Las Vegas.