How to download data from Snowflake
Snowflake is a cloud data platform that offers solutions to a variety of data management problems faced by modern organizations. Snowflake is most commonly used by organizations as a data warehouse to perform online analytical processing (OLAP).
In this respect, Snowflake can be used to solve the same class of problems that other cloud data warehouses such as Google Big Query and Amazon Redshift can solve. However, Snowflake provides its customers with several distinguishing features and capabilities such as multi-cloud flexibility, separation of storage and compute which allows for each to scale independently of the other, and many data privacy, security, and sharing features.
While Snowflake may be a great solution fit (or not) for your organization, many users will likely encounter challenges extracting data from the platform when necessary. This post highlights a few ways of extracting data from Snowflake should the need arise.
- Web interfaces
- Unloading data with stages
- Using third-party clients
Many Snowflake users primarily interact with the platform through its web interfaces. Snowflake’s Classic Web Interface was the only option users had for a long time, but Snowflake’s more modern web interface called Snowsight is now concurrently available to users. Snowsight addresses limitations of the Classic Web Interface, and provides additional features such as allowing eligible users to access their Data Exchange.
One of the Classic Web Interface limitations that Snowsight addresses is the limit on how much of a query result can be downloaded. In the Classic Web Interface, users can only download the first 100 MB of their query result as a CSV file. In contrast, Snowsight allows users to download the full result of their queries as a CSV file.
If you have encountered this limitation on the amount of data you can download in the Classic Web Interface, making the switch to Snowsight is the fastest and easiest way to overcome it. Furthermore, while the Classic Web Interface is currently fully supported, Snowflake may deprecate it in the future. Snowsight is the present and (foreseeable) future of Snowflake’s web interface. I highly recommend making the switch.
SnowSQL is the command line client for connecting to Snowflake to execute SQL queries and perform all DDL and DML operations, including loading data into and unloading data out of database tables. (SnowSQL — Snowflake Documentation)
You can download, install, and configure SnowSQL, Snowflake’s CLI tool for interacting with their data platform to download data from Snowflake. Once SnowSQL is correctly configured, you can query and download results interactively or non-interactively.
The following SnowSQL shell command executes the query in the
query_all_songs.sql file and outputs the result to a CSV file called
output_file.csv in the current local directory.$ snowsql -f query_all_songs.sql -o output_format=csv -o header=true -o timing=false -o friendly=false > output_file.csv
You can learn more about SnowSQL options here, but here is a summary of the options used in the above command.
-f query_all_songs.sql # specifies file with SQL query to execute -o output_format=csv # output file should be CSV -o timing=false # remove query timing information -o friendly=false # remove startup and exit messages
To interactively work with SnowSQL, execute the SnowSQL shell command with correct Snowflake credentials, for example (assumes a default connection is configured in the SnowSQL config file):
$ snowsql > !set output_format=csv > !set header=true > !set timing=false> !set friendly=false > !set output_file=output_file.csv # if SQL query is in a file > !source query_all_songs.sql # if directly writing query > SELECT * FROM kaggle.dl_course_data_v5.songs;
Unloading data with stages
Snowflake’s versatile COPY INTO command has a dizzying amount of options that allow users to load and unload data in a variety of ways. In addition to Snowflake’s internal and external STAGEs, and SnowSQL, users can download their query results to an authorized cloud location (using an external stage) or to a local directory (using an internal stage and SnowSQL’s GET command).
Rich Murnane provides a template for performing the latter. The former varies quite a bit because of the different cloud service providers Snowflake supports, but the Snowflake documentation provides excellent details for this functionality.
Using third-party clients
Many database uses connect to databases using clients not provided by their database vendor. These third-party clients, which are too many to enumerate or discuss in detail, often provide functionality for downloading query results to a local file.
Downloading or extracting data from Snowflake, and especially downloading query results, can seem daunting for the uninitiated. The easiest way to download your data is by using Snowsight, Snowflake’s modern web interface. Users who work more at the command line can use SnowSQL, Snowflake’s CLI tool, while users steeped in a cloud environment may opt to use Snowflake’s COPY INTO command with external stages. Some Snowflake users primarily interact with the platform through their favorite third-party database clients, and they can use whatever functionality provided by the client to download data.