Making Sense of Big Data

A simple explanation of how to run parallel workloads to process big data

I decided to write a series on distributed computing: a historical review of where we succeeded in handling big data properly, and how the field evolved. The challenge with processing big files was that computation running on distributed machines had to finish within a reasonable time, which raised the questions of how to run embarrassingly parallel computations, how to distribute the data, and how to handle failures. …


A comprehensive guide to conda, from choosing the installer to setting up environments and channels and installing packages

Photo by veeterzy on Unsplash

Motivation:

Hello! Conda is one of the most popular tools in the data science community, and yet it can be confusing to understand each step and what that step costs, as there is hardly a single place that explains it all, so I decided to write one up.

I will focus on three topics. The first is the conda installer options (Anaconda, Miniconda, and Miniforge) and what you will be missing by not using one. The second is setting up an environment that you can reliably use for multiple projects, and how to modify it when you need more configuration. …


A summary of Dr. Stylianos Kampakis’ book on data science, which covers the field’s history and present state, with great insights on how to apply it in your business, from data strategy to data management, and from hiring to building a data culture.

Photo by Robin Spielmann on Unsplash

We will be hosting Dr. Stylianos Kampakis at our Agile Data London meetup this month, and I wanted to give a brief summary of his book, as I found it very useful for many practical reasons. Join us if you would like to ask your questions live.

Building a data science team can be tricky, from coming up with the idea to setting up the team and going live, and Kampakis’ book can help you start your journey with good questions. Rather than a chapter-by-chapter summary, I will go through the top three topics on his Data…


Let’s explore how an evidence-based decision-making system can provide the best outcome, and what the difference is between Predictive and Prescriptive Analytics

Decision making is complicated, but AI/ML is here to enhance Prescriptive Analytics. Credit: Canva

We want high-quality business analytics, a better “process of discovering meaningful and actionable insight in data”, as Prof Dursun Delen describes it in his book Prescriptive Analytics, because we want to make better decisions and get better at problem-solving by using mathematical/statistical models or machine learning/deep learning algorithms, technologies, tools and practices.

There have been many attempts to suggest a rational human decision-making process, and the best one was by Herbert Alexander Simon (1916–2001), an American economist and political scientist who received the Turing Award in 1975 and the Nobel Prize in Economics in 1978.

For a successful outcome, the decision-making…


As a leader, “you” are in charge of your teams’ morale

Photo by Matthew Fournier on Unsplash

When you are in charge of a team or teams, you need to pay more attention to what you do and say, as the impact you have on them has a ripple effect. Your team and direct reports will easily, involuntarily, subconsciously copy your approach, your insight and your mental blocks, even if they know it is not the case. So I decided to write a series on what makes a great leader. Morale has such a great impact on motivation and performance that, personally, I find it the most important factor in predicting whether a team will make it or not, thus the team…


I wanted to share the sessions and what I learnt from AWS Data Lake Day (27/11/2019) in London. Even though re:Invent is next week, and we are expecting tons of updates and new products, AWS was not shy about announcing new features today. As they have expressed multiple times, they want to push the boundaries and improve their products through the constant feedback they receive from their customers, and the half-day event definitely proved they are on the right track.

Data is the new oil, and the amount of data we need to handle is increasing tremendously. …

Ebru Cucen

Dataist | All about Data & AI with software sprinkles on top
