Getting Started

Grid Search, Successive Halving & Bayesian Grid Search …

Photo by Markus Winkler on Unsplash

In every Data Science project, it is possible and recommended to search the hyperparameter space to get the best performance metric. Finding the best hyperparameter combination is a step you wouldn’t want to miss as it might give your well-conceived model the final boost it needs.

Many of us default to using the well-established GridSearchCV implemented in Scikit-learn. However, alternative optimization methods might be more suitable depending on the situation. In this article, we go through five options with in-depth explanations for each and a guide for how to use them in practice.
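For readers who have not used it, here is a minimal sketch of what a default GridSearchCV run looks like. The dataset, the estimator, and the parameter grid are assumptions picked purely for illustration and are not taken from the article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy dataset and a deliberately tiny grid, chosen only for illustration.
X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# GridSearchCV evaluates every combination in the grid with cross-validation.
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Because every combination is evaluated on every fold, the cost grows quickly with the size of the grid, which is why alternatives can be worth considering.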

Table Of…


VIDEO TUTORIAL

With around 97% accuracy

Written by: Amal Hasni & Dhia Hmila

Photo by Sharon McCutcheon on Unsplash

We recently needed to write an extension of Python’s Markdown package. For this purpose, we needed to detect the programming language of each code block to apply specific modifications. Luckily, in addition to being programming enthusiasts, we also happen to be data scientists. So we decided to use Natural Language Processing techniques to build ourselves a classification model and we will explain exactly how we did that!

Before diving into the details of how we built our model, you can try it out on your own code snippets via this demo. …


Composite Estimators and Transformers

Written By: Amal Hasni & Dhia Hmila

Photo by Sandy Millar on Unsplash

Data Science projects tend to involve a lot of back and forth between preprocessing, feature engineering, feature selection, training, and testing. Juggling all of these steps, while trying multiple options or even in production environments, can get messy very fast. Fortunately, Scikit-Learn provides options that allow us to chain multiple estimators into one. In other words, a particular action like fit or predict needs to be applied only once on the whole sequence of estimators. …
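As a rough illustration of that idea, the sketch below chains a scaler and a classifier with Scikit-Learn's Pipeline. The dataset, the step names, and the choice of estimators are assumptions made for this example, not the article's own setup.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy dataset, used only to have something to fit on.
X, y = load_iris(return_X_y=True)

# Two chained steps: the step names "scale" and "clf" are arbitrary labels.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# A single fit call fits the scaler, transforms X, then fits the classifier.
pipe.fit(X, y)

# A single predict call applies the scaling and then the classifier.
predictions = pipe.predict(X[:5])
print(predictions)
```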


Tips and Tricks

Hacking Your Way Through Jupyter

Written by: Amal Hasni & Dhia Hmila

Photo by Paul Hanaoka on Unsplash

Jupyter Notebooks are essential tools for Data Scientists. They offer multiple practical options for interactive computing as they combine code, text, and visualizations in a single document.
It is common to split a single project across several Notebooks for organizational purposes. The problem comes when a manager or a client asks for a quick demo and you need to merge those Notebooks quickly: reorganizing cells can turn into a long, tedious sequence of copy-pasting.

Since Jupyter’s interface doesn’t make it easy, we thought it was time to create our own…


LastPass Dethroned: Switching to a better alternative in under 3 minutes

Photo by Micah Williams on Unsplash

With the ever-increasing number of applications and accounts we use constantly, it is only natural that the number of passwords we need is getting out of control. Unless you have an eidetic memory, it can be very overwhelming to safely store and memorize all of your passwords (at least, I know it is for me 😆). To make your life easier and to help you in your quest for the best free password manager out there, I have gathered the results of my research in this article. …


Reflections of an actual woman in Tech about ‘positive’ discrimination and quotas

Photo by Rochelle Brown on Unsplash

I recently had a conversation with a friend of mine about her desire to leave her current job. She pointed out that she was confident she would find a more suitable position very quickly, as a lot of companies are trying to increase their female quotas, especially in technical positions. Even though I knew very well that companies are trying to improve their gender equality numbers, hearing this put me face to face with the idea as if it were completely new to me.

I have always had a firm position on the idea of quotas created for…


Up to 2/3 reduction in file size

Written By: Amal Hasni & Dhia Hmila

Photo by Jonathan Pielmayer on Unsplash

Exporting your fitted model after the training phase is the last crucial step in every Data Science project. However, as important as it is, the methods we use to store our models weren’t designed specifically with Data Science in mind.

In fact, Python’s pickle and the well-established joblib package, which we often use with Scikit-learn, are general-purpose serialization methods that work on any Python object. Therefore, they're not as optimized as we'd like them to be.
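To make "general-purpose" concrete, here is a small sketch of a pickle round trip. The `model` dictionary is a hypothetical stand-in for a fitted model, used so that the example needs only the standard library.

```python
import pickle

# Hypothetical stand-in for a fitted model: pickle accepts any Python object.
model = {"weights": [0.25, -1.5, 3.0], "bias": 0.1, "labels": ("cat", "dog")}

# Serialize to bytes, then restore an equivalent object from those bytes.
payload = pickle.dumps(model)
restored = pickle.loads(payload)

print(len(payload), "bytes")
print(restored == model)
```

Nothing in this round trip knows that the object is a model, which is the sense in which the format is not tailored to Data Science.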

After this article, you’ll see that we can do much better in terms of memory…


Main new and upgraded features

Photo by Daniel Olah on Unsplash

A new version of JupyterLab was released at the end of December 2020. This release has some pretty interesting new features in store for us.
In this article, we go through some of the most important changes.

Table Of Contents:
· A much simpler extension installing process
· Visual debugger
· Support for multiple languages
· Table of contents
· Improved Single Document Mode
· A Floating Command Palette
· A Visual Filter for files browsing

A much simpler extension installing process

There is no longer any need to rebuild JupyterLab or to have NodeJS installed in order to install an extension. JupyterLab extensions can now be distributed as prebuilt extensions…


Get clean, readable, and elegant code at all times

Photo by Annie Spratt on Unsplash

Everyone has at some point looked back at some old code they wrote and had a hard time reading through it. And maybe you work in a big team with loads of code written every day and sometimes have to go through messy syntax, huge smashed-together blocks of code, and confusing function definitions. Fortunately, there are simple steps you can follow to get clean, readable code every time, with a bit of practice and good habits.

In this article, I share with you 8 tips that helped me level up my coding skills in Python and…


Sending emails with Markdown/HTML templates

Written by: Amal Hasni & Dhia Hmila

Photo by Bundo Kim on Unsplash

I recently created a scraper to help me spot the best flight deals on Kayak. To make my work complete, I needed to send myself notifications via email using Python. (Not missing out on good deals was the goal after all, wasn’t it?) If you landed here, you probably need to send or receive email notifications like me, create an automated newsletter, or even send scheduled reports to clients automatically. …

Amal Hasni

A Data Science consultant and a technology enthusiast eager to learn and spread knowledge! /in/amal-hasni/
