{"id":84403,"date":"2023-07-27T09:03:13","date_gmt":"2023-07-27T09:03:13","guid":{"rendered":"https:\/\/www.globallogic.com\/uk\/?post_type=insightsection&p=84403"},"modified":"2025-01-20T05:21:02","modified_gmt":"2025-01-20T05:21:02","slug":"exploring-snowpark-and-streamlit-for-data-science","status":"publish","type":"insightsection","link":"https:\/\/www.globallogic.com\/uki\/insights\/blogs\/exploring-snowpark-and-streamlit-for-data-science\/","title":{"rendered":"Exploring Snowpark and Streamlit for Data Science"},"content":{"rendered":"
I'm Janki Makwana, a Data Scientist at GlobalLogic. I have been working with a major retail and commercial bank for the past year as a subject matter expert.

Throughout this blog, I aim to shed light on the remarkable potential that emerges when we use Snowpark and Streamlit together for data science solutions. I will explore the seamless integration of these two technologies and demonstrate how they can be effectively utilised.

By sharing practical examples of using these technologies, I will show how Snowpark and Streamlit enable the creation of proofs of concept and provide valuable solutions without relying on traditional cloud providers like AWS, GCP, or Azure.

First things first:


What is Snowpark?

Snowpark is a new feature in Snowflake, a cloud-based data warehousing platform. It allows developers to write complex data transformations, analytics, and machine learning models using familiar programming languages like Java, Scala, and Python. This means complex data processing and analysis can be performed directly within Snowflake, without moving data between different systems or languages.
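To give a flavour of what that looks like, here is a minimal Snowpark Python sketch; the connection values and the TWEETS table are placeholders of my own, not part of this demo.

```python
# A minimal sketch, not the demo's actual setup: the connection values
# and the TWEETS table below are placeholders.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}

# Open a session against Snowflake
session = Session.builder.configs(connection_parameters).create()

# Build a query lazily; the filter and select are pushed down and
# executed inside Snowflake, so the data never leaves the warehouse.
tweets = session.table("TWEETS")
popular = tweets.filter(col("RETWEET_COUNT") > 10).select("ID", "CONTENT")
popular.show()
```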


What is Streamlit?

Streamlit is an open-source framework that simplifies the process of creating web applications. As it is built on top of Python, users can easily leverage the many pre-built components Streamlit offers, including text inputs, buttons, and sliders. Users can also bring in data visualisation libraries such as Matplotlib and Plotly to create interactive and dynamic graphs.
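As a quick illustration of those components, here is a minimal Streamlit sketch; the labels and default values are placeholders of my own.

```python
# A minimal sketch of a Streamlit app; run with `streamlit run app.py`.
import streamlit as st

st.title("Tweet explorer")

# Pre-built components: a text input, a slider, and a button
username = st.text_input("Twitter username", value="example_user")
limit = st.slider("Number of tweets", min_value=10, max_value=500, value=100)

if st.button("Fetch tweets"):
    st.write(f"Would fetch up to {limit} tweets for @{username}")
```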

As this was my first time using Snowpark, I wanted to explore the technology and figure out where and how it would be used within the data science realm. The best way to find out was to use Snowpark and Streamlit first-hand and see how easy or hard they are to work with.


The Demo:

I created a demo to gain first-hand insight into how easy or hard Snowpark and Streamlit are to use. The demo involved scraping tweets from a particular Twitter user and running sentiment analysis on them to generate valuable insights into the user's profile.
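The post has not yet named the sentiment model, so purely as an illustration, here is a sketch using NLTK's VADER analyser, a common off-the-shelf choice for tweet sentiment; it is an assumption, not necessarily what this demo used.

```python
# Illustrative only: VADER is a common choice for tweet sentiment and is
# an assumption here, not necessarily the model used in this demo.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-off download of the scoring lexicon
analyser = SentimentIntensityAnalyzer()

tweets = [
    "Loving the new release!",
    "This outage is really frustrating.",
]
for tweet in tweets:
    scores = analyser.polarity_scores(tweet)
    # 'compound' is a normalised score between -1 (negative) and 1 (positive)
    print(f"{scores['compound']:+.2f}  {tweet}")
```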

Using Streamlit, I presented the results in an interactive manner, displaying information such as the user's sentiment trends and the topics they most frequently tweet about.
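A hedged sketch of how such a results page might be wired up is below; the file name, column names, and aggregation are assumptions of mine, not the demo's actual schema.

```python
# A sketch of the results page: the file name, column names, and
# aggregation below are assumptions, not the demo's actual schema.
import pandas as pd
import streamlit as st

# Assume one row per tweet, with a timestamp, a sentiment score in
# [-1, 1], and a topic label produced earlier in the pipeline.
df = pd.read_csv("scored_tweets.csv", parse_dates=["date"])

st.subheader("Sentiment trend")
daily_sentiment = df.groupby(df["date"].dt.date)["sentiment"].mean()
st.line_chart(daily_sentiment)

st.subheader("Most frequent topics")
st.bar_chart(df["topic"].value_counts().head(10))
```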

\"\"<\/p>\n


The process:

The first step was to find a tweet-scraping tool that would allow me to retrieve a large number of tweets in a single query.

After some research, I decided to use SNScrape, a scraping tool for social networking services that can scrape user profiles, hashtags, and searches, and return discovered items such as relevant posts.

Retrieving the tweets is as simple as a single command, which scrapes them and writes the results to a JSON file:


    snscrape --jsonl twitter-user {username} > twitter-@{username}.json
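Because the scraper emits one JSON object per line (JSON Lines), the output can be loaded straight into pandas. The snippet below is a sketch; the date and content field names reflect snscrape's tweet schema at the time and are assumptions here.

```python
# lines=True tells pandas to read JSON Lines; the filename mirrors the
# command above, and the 'date'/'content' fields are assumed from
# snscrape's tweet schema.
import pandas as pd

tweets = pd.read_json("twitter-@username.json", lines=True)
print(tweets[["date", "content"]].head())
```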


The next step was to clean and prepare the tweets. Once cleaned, the tweets were ready for analysis, and the cleaning itself helped improve the accuracy of the sentiment prediction model.

Cleaning the tweets included: