 
{"id":84403,"date":"2023-07-27T09:03:13","date_gmt":"2023-07-27T09:03:13","guid":{"rendered":"https:\/\/www.globallogic.com\/uk\/?post_type=insightsection&#038;p=84403"},"modified":"2025-01-20T05:21:02","modified_gmt":"2025-01-20T05:21:02","slug":"exploring-snowpark-and-streamlit-for-data-science","status":"publish","type":"insightsection","link":"https:\/\/www.globallogic.com\/uki\/insights\/blogs\/exploring-snowpark-and-streamlit-for-data-science\/","title":{"rendered":"Exploring Snowpark and Streamlit for Data Science"},"content":{"rendered":"<div class=\"classic_editor_content\"><span data-contrast=\"auto\">I\u2019m Janki Makwana, a Data Scientist at GlobalLogic. I have been working with a major retail and commercial bank for the past year as a subject matter expert.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Throughout this blog, I aim to shed light on the remarkable potential that emerges when we use Snowpark and Streamlit together for solutions in Data Science. 
I will explore the seamless integration of these two technologies and demonstrate how they can be effectively utilised.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">By sharing practical examples of using these technologies, I will show how Snowpark and Streamlit enable the creation of proof-of-concepts and provide valuable solutions without relying on traditional cloud providers like AWS, GCP, or Azure.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">First things first:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">What is Snowpark?\u00a0<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"auto\">Snowpark is a new feature in Snowflake, a cloud-based data warehousing platform. It allows developers to write complex data transformations, analytics, and machine learning models using familiar programming languages like Java, Scala, and Python. 
This allows for complex data processing and analysis to be performed directly within Snowflake, without the need to move data between different systems or languages.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">What is Streamlit?\u00a0<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"auto\">Streamlit is an open-source framework that simplifies the process of creating web applications. As it is built on top of Python, users can easily leverage the many pre-built components Streamlit offers \u2013 including text inputs, buttons, and sliders. Additionally, users can use data visualisation libraries \u2013 like Matplotlib and Plotly \u2013 to create interactive and dynamic graphs.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p>As this was my first time using Snowpark, I wanted to explore this new technology and figure out where and how it could be used within the data science realm. 
The best way to find out was to use Snowpark and Streamlit first-hand and see how easy or hard they would be to use.<\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">The Demo:<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"auto\">I created a demo that scrapes tweets from a particular Twitter user and runs sentiment analysis on them to generate insights into the user&#8217;s profile.\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Using Streamlit, I presented the results interactively, displaying information such as the user&#8217;s sentiment trends and the topics they most frequently tweet about.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-84514\" src=\"https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Slide1-1.jpeg\" alt=\"\" width=\"720\" height=\"405\" srcset=\"https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Slide1-1.jpeg 720w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Slide1-1-300x169.jpeg 300w\" sizes=\"auto, (max-width: 720px) 100vw, 720px\" \/><\/p>\n<p><span data-contrast=\"none\">\u00a0<\/span><span data-ccp-props=\"{&quot;335551550&quot;:2,&quot;335551620&quot;:2}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">The process:<\/span><span 
data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:40}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"auto\">The first step was to find a tweet-scraping tool that would allow me to retrieve a large number of tweets in a single query.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">After some research, I decided to use <\/span><a rel=\"external nofollow\" target=\"_blank\" href=\"https:\/\/github.com\/JustAnotherArchivist\/snscrape\"><span data-contrast=\"none\">SNScrape<\/span><\/a><span data-contrast=\"auto\"> \u2013 a scraping tool for social networking services that can scrape user profiles, hashtags, and searches, and return discovered items like relevant posts.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Retrieving the tweets takes a single command, which scrapes the user\u2019s tweets and writes them to a JSON file:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><em>snscrape --jsonl twitter-user {username} &gt; twitter-@{username}.json<\/em><\/p>\n<p><span data-ccp-props=\"{&quot;469777462&quot;:[916,1832,2748,3664,4580,5496,6412,7328,8244,9160,10076,10992,11908,12824,13740,14656],&quot;469777927&quot;:[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],&quot;469777928&quot;:[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]}\">\u00a0<\/span><\/p>\n<p><span 
data-contrast=\"auto\">The next step was to clean and prepare the tweets for analysis, as cleaner text helps improve the accuracy of the sentiment prediction model.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Cleaning the tweets included:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"3\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><span data-contrast=\"auto\">Removing any irrelevant information (URLs, usernames and hashtags).<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"3\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">Converting the text to lowercase, which reduces the number of unique words.<\/span><\/li>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"3\" 
data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">Removing special characters and punctuation marks.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"3\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">Removing stop words (for example: the, and, a).<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"3\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">Tokenising the text \u2013 a common pre-processing technique that breaks text down into smaller units called tokens, typically words or phrases.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/li>\n<\/ul>\n<p><span data-contrast=\"auto\">This can all be done using a User Defined Function (UDF) \u2013 a custom function created by the user in Snowpark to perform complex computations or data 
transformations that are not natively available in Snowflake. UDFs are written in a supported programming language, registered through a Snowpark session, and executed inside Snowflake. They take input parameters and return values, and permanent UDFs can be reused across multiple sessions.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Using UDFs allowed me to shift the processing to Snowflake&#8217;s warehouses instead of running it on my laptop. Snowflake&#8217;s cloud-native architecture can execute UDFs and handle query execution and data processing efficiently.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Using UDFs is super easy \u2013 for example, you can create a simple UDF like so:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><em># IntegerType describes the UDF's input and return types<\/em><\/p>\n<p><em>from snowflake.snowpark.types import IntegerType<\/em><\/p>\n<p><em>\u00a0<\/em><\/p>\n<p><em># Define the function for the UDF<\/em><\/p>\n<p><em>def multiply_by_three(input_int_py: int):<\/em><\/p>\n<p><em>\u00a0\u00a0\u00a0\u00a0return input_int_py * 3<\/em><\/p>\n<p><em>\u00a0<\/em><\/p>\n<p><em># Register the UDF in Snowflake (assumes an existing Snowpark session)<\/em><\/p>\n<p><em>session.udf.register(<\/em><\/p>\n<p><em>func = multiply_by_three<\/em><\/p>\n<p><em>, return_type = IntegerType()<\/em><\/p>\n<p><em>, input_types = [IntegerType()]<\/em><\/p>\n<p><em>, is_permanent = True<\/em><\/p>\n<p><em>, name = 'SNOWPARK_MULTIPLY_INTEGER_BY_THREE'<\/em><\/p>\n<p><em>, replace = True<\/em><\/p>\n<p><em>, stage_location = 
'@TWITTER_DEMO'<\/em><\/p>\n<p><em>)<\/em><\/p>\n<p><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">In this way, it is easy to create functions that you can reuse as many times as you like. In this example, I created user-defined functions that apply the techniques mentioned above to clean the tweets. I then used TextBlob to speed up predicting the sentiment of the text.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">After cleaning the tweets and running TextBlob on them, the plan was to insert the data into Snowflake tables. It was as easy as:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><em>from snowflake.snowpark import Session<\/em><\/p>\n<p><em>\u00a0<\/em><\/p>\n<p><em>session = Session.builder.configs(&lt;connection_parameters&gt;).create()<\/em><\/p>\n<p><em>temp_df = session.create_dataframe(&lt;data&gt;)<\/em><\/p>\n<p><em>temp_df.write.mode(&quot;overwrite&quot;).save_as_table(&lt;table_name&gt;)<\/em><\/p>\n<p><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">I stored the data in Snowflake tables to optimise the app&#8217;s performance. By doing so, if someone queries the same Twitter user multiple times, that user&#8217;s data will already be stored, and only new tweets will be added to the table. 
This approach significantly improves the speed of the app as it eliminates the need to clean and predict the sentiment of all tweets repeatedly whenever the same user is queried.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Once the data had been inserted into Snowflake, I could query it to feed the frontend, which I built with Streamlit.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Using Python, I was able to create webpage components, generate plots and graphs, and develop a dashboard directly from Snowflake.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Streamlit made the process straightforward, and since I had used Python before, the coding seemed very intuitive. It simplifies app development by allowing you to write apps in the same way you write plain Python scripts. For instance, if you wanted to display plain text, using <\/span><em>st.write('hello world')<\/em><span data-contrast=\"auto\"> would immediately show the text in the app. 
This approach to app development makes it easy to build interactive applications without the need for extensive coding knowledge.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">Cloud vs local:<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:40}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"auto\">I found Streamlit&#8217;s dynamic nature to be very useful. It automatically detected changes in my source code, enabling the app to instantly rerun and reflect any updates in real time. This made the development process efficient, as I could quickly iterate on changes and instantly see the results without any lag time.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">There are two ways to run an app. The first is to run Streamlit locally, which takes your source code directly from a local directory. The second is Streamlit Cloud, which connects to your GitHub repository and updates the application shortly after every push.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Here is a quick view of the application that I was able to make with the data I inserted into Snowflake. 
You can see in the app that there are customisable input fields and the ability to generate dynamic visualisations such as pie charts, line plots, and even word clouds.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:true,&quot;134233118&quot;:true}\">\u00a0<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-84523\" src=\"https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113403.png\" alt=\"\" width=\"711\" height=\"489\" srcset=\"https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113403.png 1560w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113403-300x206.png 300w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113403-1024x704.png 1024w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113403-768x528.png 768w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113403-1536x1056.png 1536w\" sizes=\"auto, (max-width: 711px) 100vw, 711px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-84524\" src=\"https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113611.png\" alt=\"\" width=\"697\" height=\"467\" srcset=\"https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113611.png 1600w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113611-300x201.png 300w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113611-1024x686.png 1024w, https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113611-768x515.png 768w, 
https:\/\/www.globallogic.com\/uki\/wp-content\/uploads\/sites\/4\/2023\/07\/Screenshot-2023-07-27-at-113611-1536x1029.png 1536w\" sizes=\"auto, (max-width: 697px) 100vw, 697px\" \/><\/p>\n<p><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">Lessons learnt:<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"auto\">Snowpark is still in its early stages of development, so there may be a limited ecosystem of libraries and resources available for developers to utilise.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">When it came to predicting the sentiment inside a UDF, I first tried to use TextBlob, but it couldn&#8217;t be used within a UDF. That said, UDFs <\/span><i><span data-contrast=\"auto\">do <\/span><\/i><span data-contrast=\"auto\">support third-party packages from Anaconda, including Scikit-learn, Keras, TensorFlow, and other packages that can be useful for a Data Scientist.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Whilst working with Streamlit, I found that even though it is a powerful tool for building data-driven applications, it may not be suitable for all use cases. 
Some users may require more advanced features or customisation options than Streamlit provides, which could limit its usefulness for these users.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">For instance, it currently doesn\u2019t integrate with Python Notebooks, which may be a drawback for some users. However, for creating simple proof-of-concept applications, Streamlit can be very effective and easy to use. It is important to consider your specific needs and requirements when deciding whether Streamlit is the right tool for your project.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">My thoughts on these technologies:<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:0}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"auto\">My experience working with Snowpark and Streamlit was nothing short of impressive, particularly for a Data Scientist looking to create rapid demos to showcase their models.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">I found the integration of the two platforms to be useful, as it allowed me to easily build and showcase data visualisations and models without requiring extensive development time.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><span 
data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">While there may be some costs associated with using Snowpark or Snowflake, the performance gains and scalability benefits often outweigh these expenses. Similarly, while Streamlit may have some limitations in terms of functionality and flexibility, its ability to rapidly prototype data-driven applications makes it an invaluable tool for data scientists and developers alike.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Ultimately, the decision to use these technologies will depend on the specific needs and constraints of each project, but they are certainly worth considering as part of any modern data stack.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Our team here at GlobalLogic are highly experienced and love working on all things Data Science and MLOps. 
If you are interested in working with us or just want to chat more about these topics in general, please feel free to contact Dr Sami Alsindi, Lead Data Scientist, at <\/span><a rel=\"external nofollow\" target=\"_blank\" href=\"mailto:sami.alsindi@globallogic.com\"><span data-contrast=\"none\">sami.alsindi@globallogic.com<\/span><\/a><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h4><span data-contrast=\"none\">More about the author:<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:40}\">\u00a0<\/span><\/h4>\n<p>Janki Makwana, a Data Scientist at GlobalLogic, has been a member of the GlobalLogic team in the UK&amp;I region for two years now. She started as an associate consultant before moving to the Data Science Team. For the past year, Janki has been working with a major retail and commercial bank as a subject matter expert in MLOps.<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>I\u2019m Janki Makwana, a Data Scientist at GlobalLogic. I have been working with a major retail and commercial bank for the past year as a subject matter expert.\u00a0 Throughout this blog, I aim to shed light on the remarkable potential that emerges when we use Snowpark and Streamlit together for solutions in Data Science. I will explore the seamless integration of these two technologies and demonstrate how they can be effectively utilised.\u00a0\u00a0 By sharing practical examples of using these technologies, I will show how Snowpark and Streamlit enable the creation of proof-of-concepts and provide valuable solutions without relying on traditional cloud providers like AWS, GCP, or Azure.\u00a0 First things first:\u00a0 \u00a0 What is Snowpark?\u00a0\u00a0 Snowpark is a new feature in Snowflake, a cloud-based data warehousing platform. 
It allows developers to write complex data transformations, analytics, and machine learning models using familiar programming languages like Java, Scala, and Python. This allows for complex data processing and analysis to be performed directly within Snowflake, without the need to move data between different systems or languages.\u00a0 \u00a0 What is Streamlit?\u00a0\u00a0 Streamlit is an open-source framework that simplifies the process of creating web applications. As it is built on top of Python, users can&#8230;<\/p>\n","protected":false},"author":38,"featured_media":84522,"parent":0,"menu_order":169,"template":"","insight":[41],"insight-subcats":[794,795],"insight-industry":[750],"insight-services":[],"insight-partners":[],"class_list":["post-84403","insightsection","type-insightsection","status-publish","has-post-thumbnail","hentry","insight-blogs","insight-subcats-analytics","insight-subcats-data-engineering","insight-industry-technology"],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insightsection\/84403","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insightsection"}],"about":[{"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/types\/insightsection"}],"author":[{"embeddable":true,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/users\/38"}],"version-history":[{"count":1,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insightsection\/84403\/revisions"}],"predecessor-version":[{"id":103406,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insightsection\/84403\/revisions\/103406"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/media\/84522"}],"wp:attachment":[{"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/media?parent=84403"}],"wp:term":[{"taxonomy":"insight","embeddable":true,"href":"https:\/\/www.globallogic.com\/uki\/
wp-json\/wp\/v2\/insight?post=84403"},{"taxonomy":"insight-subcats","embeddable":true,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insight-subcats?post=84403"},{"taxonomy":"insight-industry","embeddable":true,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insight-industry?post=84403"},{"taxonomy":"insight-services","embeddable":true,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insight-services?post=84403"},{"taxonomy":"insight-partners","embeddable":true,"href":"https:\/\/www.globallogic.com\/uki\/wp-json\/wp\/v2\/insight-partners?post=84403"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}