· Alvaro Barber · Tutorials  · 5 min read

Snowflake Interview Questions

Common Questions and RBAC

Common Questions and RBAC
  1. Cloning Tell me some important project where you used Cloning, what you can say about this, why is beneficial?

It’s actually very beneficial for saving so much storage when trying to test your development. Also it ensures that by cloning, you can test your development with real Production data, not the same if you use temporary tables that make the function but for limited time and for limited tables

  1. General Snowflake Basics What are the key features of Snowflake that differentiate it from other data warehouses? Explain the architecture of Snowflake and its layers (Cloud Services, Query Processing, and Storage). How does Snowflake achieve scalability and performance optimization? What are Snowflake virtual warehouses? How do they affect performance and cost?

Cloud agnostic, decoupled storage and compute, you can upgrade horizontally or vertically, support semi strctured data handling json. No need of manual performance tuning with indexes etc… and fully managed for backup and scaling automatically, isolating sessions from the user. Queries can be reused reducing the cost. Support snowpark, stream and tasks It divides the data in micropartitions organized in columnar format(columnar is faster for analytics as only relevant columns are read) Frequenly used query results can be saved in materialized views.

  1. Data Loading and Unloading How do you load data into Snowflake? Provide an example using the COPY command.

2.1 CREATE THE TABLE 2.2 CREATE THE INTERNAL STAGE OR EXTERNAL STAGE FROM S3 OR BLOB WITH THE CORRECT FILE FORMAT. USE PUT in internal or create external stage with URL and CREDENTIALS 2.3 USE THE COPY INTO COMMAND TO COPY FROM STAGE TO TABLE.

What are the common file formats supported by Snowflake for data loading? JSON PARQUET AVRO ORC and XML - VARIANT and XML data type FLATTEN -> SELECT f.VALUE as item from my_table LATERAL FLATTEN(input => json_data:items) f; SELECT f.KEY , f.VALUE

Explain how external stages work in Snowflake. How do you configure them? Basically to connect to the cloud providers.

Describe the steps to unload data from Snowflake to an external location. The same but from external stage to the cloud provider

  1. SQL and Query Optimization How would you optimize a query in Snowflake to reduce runtime or costs?

CHECK SQL STACK,AVOID SELECTS WHOLE COLUMN, FILTER EARLY, TAKE CARE WITH THE JOINS AND SUBQUERIES AND JUST WHEN ITS REALLY NEEDED. GO TO QUERY OPTIMIZER

What are micro-partitions in Snowflake? How do they help in query performance? it performs filters and data pruning automatically with values that doesnt match the query executed. automatic compression so better for data storage.

Explain the concept of “clustering keys.” When would you use them? So it can check faster for those columns, but should be avoided for frequently updated tables otherwise it will be reclustering every time.(could increase costs)

How do you monitor and troubleshoot slow queries in Snowflake? Go to QUERY PROFILE or with the EXPLAIN clause and check for inneficiencies, scan time and data movement. you can order the queries by time and check.

  1. Security and Access Control What are the different roles in Snowflake? How do they relate to access control? ACCOUNTADMIN, SECURITYADMIN SYSADMIN and PUBLIC How do you implement row-level security in Snowflake? Explain Snowflake’s data encryption capabilities. How would you restrict access to a specific table or schema for a user? With GRANT or REVOKE. with USAGE of schema and SELECT ON , INSERT on tables etc…

  2. Data Sharing and Integration What is Snowflake Secure Data Sharing? How does it work? How can you integrate Snowflake with other tools, such as ETL platforms or BI tools? Explain the use of Snowflake’s external functions.

  3. Time Travel and Fail-safe What is the Time Travel feature in Snowflake? How can it be used? How does Snowflake handle data recovery with Fail-safe? What are the limitations of Time Travel in Snowflake?

  4. Scripting and Automation How do you automate tasks in Snowflake using Snowflake Tasks and Streams? Write a script to create a scheduled query execution in Snowflake. How would you implement Change Data Capture (CDC) in Snowflake?

  5. Data Engineering Describe how to handle schema evolution in Snowflake. How do Snowflake stages work, and how do they differ between internal and external stages?

Explain the concept of Snowflake pipes and their role in data streaming. The use of pipes is like continuous loading files, you just set same as for external stage , but with the pipe now it continuosly checks files in the stage. They work hand-in-hand with streams that tracks the changes or updates(important for incremental loads) The stream is created on the table. and you can guide it to check on METADATA$ACTION for inserts or updates. Unfortunately it has no way to check specific columns.

  1. Cost Management How do you calculate and monitor costs in Snowflake? What strategies can you implement to reduce Snowflake usage costs? Explain how Snowflake’s auto-suspend and auto-resume features work.

  2. Advanced Topics How does Snowflake handle semi-structured data like JSON or Parquet? What is a materialized view in Snowflake, and how does it differ from a regular view? How do you implement a multi-region Snowflake setup? What are the use cases for Snowflake’s Snowpark?

Back to Blog