Exporting your fitted model after the training phase is the last crucial step in every data science project. However, as important as it is, the methods we use to store our models weren’t specifically designed with data science in mind.
In fact, Python’s pickle and the well-established joblib package, which we often use with Scikit-learn, are general-purpose serialization methods that work on any Python object. Therefore, they're not as optimized as we'd like them to be.
By the end of this article, you’ll see that we can do much better in terms of memory and time efficiency. …
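For reference, the general-purpose approach mentioned above can be sketched with the standard library alone. This is only a minimal illustration: `fitted_state` is a hypothetical stand-in for a trained model's parameters, not a real Scikit-learn estimator, and the values are made up.

```python
import pickle

# Hypothetical stand-in for a fitted model: any Python object can be pickled,
# which is exactly why pickle is general-purpose rather than model-specific.
fitted_state = {"coef": [0.5, -1.2], "intercept": 0.3}

# Serialize to bytes (pickle.dump(obj, file) would write to an open file instead).
payload = pickle.dumps(fitted_state)

# Restore an equal object from the serialized bytes.
restored = pickle.loads(payload)
```

Only unpickle data you trust, since loading a pickle can execute arbitrary code.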
A new version of JupyterLab was released at the end of December 2020. This release has some pretty interesting new features in store for us.
In this article, we go through some of the most important changes.
· A much simpler extension installing process
· Visual debugger
· Support for multiple languages
· Table of contents
· Improved Single Document Mode
· A Floating Command Palette
· A Visual Filter for file browsing
There is no longer any need to rebuild JupyterLab or to have Node.js installed in order to install an extension. JupyterLab extensions can now be distributed as prebuilt extensions and installed simply using pip (or other package managers like mamba). Nevertheless, the old option of fetching extensions as npm packages (which requires rebuilding JupyterLab) remains available for older extensions or if you wish to use it instead. …
We recently needed to write an extension of Python’s Markdown package. For this purpose, we needed to detect the programming language of each code block to apply specific modifications. Luckily, in addition to being programming enthusiasts, we also happen to be data scientists. So we decided to use Natural Language Processing techniques to build ourselves a classification model and we will explain exactly how we did that!
Before diving into the details of how we built our model, you can try it out on your own code snippets via this demo. …
Data science projects tend to involve a lot of back and forth between preprocessing, feature engineering, feature selection, training, testing, and so on. Juggling all of these steps, while trying multiple options or even in production environments, can get messy very fast. Fortunately, Scikit-Learn provides options that allow us to chain multiple estimators into one. In other words, a particular action like fit or predict needs only to be applied once on the whole sequence of estimators. …
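To make the chaining idea concrete, here is a plain-Python toy. It is not Scikit-Learn's implementation (the library's own tool for this is `sklearn.pipeline.Pipeline`). The classes `MeanCenterer`, `SignClassifier`, and `Chain` are hypothetical names invented for this sketch.

```python
class MeanCenterer:
    """Toy transformer: subtracts the mean learned during fit."""

    def fit(self, xs):
        self.mean_ = sum(xs) / len(xs)
        return self

    def transform(self, xs):
        return [x - self.mean_ for x in xs]


class SignClassifier:
    """Toy final estimator: predicts 1 for positive inputs, else 0."""

    def fit(self, xs):
        return self  # nothing to learn in this toy

    def predict(self, xs):
        return [1 if x > 0 else 0 for x in xs]


class Chain:
    """Hypothetical chain: one fit/predict call drives every step in order."""

    def __init__(self, transformers, final_estimator):
        self.transformers = transformers
        self.final_estimator = final_estimator

    def fit(self, xs):
        # Fit each transformer on the output of the previous one.
        for step in self.transformers:
            xs = step.fit(xs).transform(xs)
        self.final_estimator.fit(xs)
        return self

    def predict(self, xs):
        # Reuse the already-fitted transformers, then predict once.
        for step in self.transformers:
            xs = step.transform(xs)
        return self.final_estimator.predict(xs)


chain = Chain([MeanCenterer()], SignClassifier()).fit([1.0, 2.0, 3.0])
labels = chain.predict([0.0, 2.0, 5.0])
```

The caller only ever touches `chain.fit` and `chain.predict`. The preprocessing learned at fit time is reapplied automatically at prediction time.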
Everyone has at some point experienced that moment when they looked back at some old code they wrote and had a hard time reading through it. And maybe you work in a big team with loads of code written every day and sometimes have to go through messy syntax, smashed-together huge blocks of code, and confusing function definitions. Fortunately, there are simple steps to follow to get clean readable code each time with a bit of practice and good habits.
In this article, I share with you 8 tips that helped me level up my coding skills in Python and that make a huge difference in code readability. Even though my personal experience is Python related, these tips still apply perfectly to other programming languages. …
I recently created a scraper to help me spot the best flight deals on Kayak. And to make my work complete, I needed to send myself notifications via email using Python. (Not missing out on good deals was the goal after all, wasn’t it?) If you landed here, you probably need to send or receive email notifications like me, create an automated newsletter, or even send scheduled reports to clients automatically. …
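As a rough sketch of what such a notification could look like, the snippet below builds a message with the standard library's `email.message.EmailMessage`. The addresses, subject, server name, and the `build_notification` helper are all placeholders invented for this example. The sending step is left commented out because it needs a real SMTP server and credentials.

```python
from email.message import EmailMessage


def build_notification(sender, recipient, subject, body):
    """Hypothetical helper: assemble a plain-text notification email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


message = build_notification(
    "alerts@example.com",  # placeholder sender
    "me@example.com",  # placeholder recipient
    "Flight deal found",
    "A fare matching your criteria was found.",
)

# Sending requires a real server and credentials, so it is only outlined here:
# import smtplib
# with smtplib.SMTP_SSL("smtp.example.com", 465) as server:
#     server.login("alerts@example.com", "app-password")
#     server.send_message(message)
```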
Every time I have used Selenium for scraping or automation, no matter how at ease I thought I had become with it, it has always managed to surprise me with new issues and challenges. If that’s the case for you too, you’ll find this guide to common challenges useful and a real time-saver.
Also, if you haven’t seen the first part of this guide, you should definitely check it out. The solution for the challenge you might be facing may be waiting for you in the first part if it’s not listed in this one.
That being said, let’s start with the first common challenge. …
I recently had the idea to use Selenium with Python to automate some repetitive tasks on SAP for a client. And as is always the case when getting your hands dirty with code, I started to come across some challenges I never saw coming. Having spent a lot of time going through the internet trying to find the most suitable solution for each issue, I thought to myself:
How nice would it have been if I had found everything I needed gathered in one place, ready to use?
So, to make your lives easier, I have gathered in this article the answers to the most frequent issues a user could encounter when using Selenium, along with ready-to-use code snippets written in Python. …
With everything going on in the world and with a post-coronavirus world feeling too far away to imagine, my brain’s defense mechanism chose to do what it does best: hope for better days to come and, more importantly, PLAN for them! With an image of me lying peacefully on a sandy beach, I started thinking about a way to get notified of the best flights and get the best deals delivered right to my inbox. Now you’re probably wondering why I would build my own notifier when there is a custom alert system on most booking websites. Well, the first reason is: because it’s FUN! And the second reason is that I can customize it to my needs (and so can you) in a way that a premade alert system can’t. …
I suppose that since you landed here, you already know Selenium and what it is used for. But just in case, Selenium is a framework created for testing web applications, but it can also be useful for web scraping or for automating repetitive and time-consuming tasks on any web application.
In a perfect world, when scraping, the website is nice and clean, popup-free, and you find the information you need on the main page. However, in practice, this is rarely the case. Selenium offers the possibility to interact with the site you’re scraping in order to get to the page you want. …