# Becoming a Functional Data Scientist

Hello World,

So today, I was asked to put some thought into which tech skills our entry-level data scientists should focus on. After putting a bunch of thought into it, I came up with this. I decided that the most important criteria were the following:

1. Don’t overload them.
2. They can deliver to production, where the target can be anything, including IoT.
3. They will not be concerned with building front ends.

I have to say, the result greatly surprised me.

# Microsoft Tech and Robotics

Hello World,

This article is a high-level discussion of where you might use various Microsoft technologies in the field of robotics. I will begin with a pet project I’m kicking off on the side to get more familiar with some cool tools and tech I’ve recently discovered, so that I can hopefully get assigned to some really cool projects at work, including drones.

# Setting up Python and Virtual Environments in Visual Studio Code on Ubuntu

Hello World,

I’m writing this article because, believe it or not, this process is a pain in the neck and not completely documented in any one place. Let’s start with why in the world you would want to do this. For me, I want to use TensorFlow and the NVIDIA embedded robotics SDKs. Unfortunately, the only supported dev environment for these is Ubuntu. I have nothing against Ubuntu; it just appears to be fairly unstable in comparison to Mac and Windows. But that is neither here nor there: if you want to build intelligent robots, you need these tools.

# K-Means under the hood with Python

Hello World!

This article is meant to explain how the K-Means Clustering algorithm works while teaching a little Python along the way.

What is K-Means?

K-Means Clustering is an unsupervised learning algorithm that groups similar observations into “clusters”. The center of each cluster is called a centroid. K-Means is often used as a discovery step on new data to find out what the various categories might be. Once you understand and label the centroids, you can apply something such as k-nearest-neighbor as a classifier on top of them.
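To make the grouping idea concrete, here is a minimal sketch of the usual K-Means loop in plain Python: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The function names (`k_means`, `squared_distance`, `mean_point`), the random starting centroids, and the sample points are all assumptions made for illustration, not code from the article.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100, seed=0):
    """Return (centroids, labels) for the given points.

    Assumption: starting centroids are k distinct points picked at random.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # An empty cluster keeps its previous centroid.
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical sample data: two visually obvious groups.
    sample = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
    found_centroids, found_labels = k_means(sample, k=2)
    print(found_centroids)
    print(found_labels)
```

Which cluster ends up as number 0 or 1 depends on the random start, so the labels only tell you which points belong together, not what the groups mean.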

# SQL Saturday 524 Slides – Analytical Computing – Device to Cloud and Back

Hello World!

Here you can find the slide deck for my talk for SQL Saturday #524 in South Florida.  I hope you find this useful, but it should be far more useful if you attend in person.  This talk lays the foundation for building and understanding analytical computing for cloud and devices as well as how they work together.

https://drcdata.blob.core.windows.net/slidedecks/Analytics_Device_To_Cloud_And_Back.pptx

# Miami Data Science – R Fundamentals Talk

Hello World,

Here is the slide deck link for the talk from last night. https://drcdata.blob.core.windows.net/slidedecks/DataScience_MSRO_DataManipulation_Visualization.pptx

Also you can find the free video series here: https://aka.ms/rjumpstart

Finally, don’t forget to tweet me @Data4Bots if you want an invite to the Slack channel.

# My Production Data Science Workflow

Hello World,

So I’ve spent a while now looking at 3 competing languages, and I did my best to give each one a fair shake. Those 3 languages were F#, Python and R. I have to say it was really close for a while, because each language has its strengths and weaknesses. That said, I am moving forward with 2 languages and a very specific way of using each one. I wanted to outline this because it took me a very long time to learn all of these languages well enough to reach this conclusion, and I would hate for others to go through the same exercise.

# Machine Learning Study Group Recap – Week 4

Hello World,

So here we go with another recap. This week we did a deep dive into binary classification using Logistic Regression. Logistic regression and binary classification are the underpinnings of modern neural networks, so a deep and complete understanding of them is necessary to be proficient in machine learning.

# Sigmoid for Classifiers Decoded

Hello World,

Sigmoid really isn’t that complicated (once you understand it, of course). Some background, in case you are coming at this totally fresh: the Sigmoid function is used in machine learning primarily as a hypothesis function for classifiers. What is interesting is that this same function is used for binary classifiers and multi-class classifiers, and is the backbone of modern neural networks.

Here is the sigmoid function:    $\frac{ 1 }{ 1 + e^{-z}}$
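As a quick illustration, the formula above can be sketched in a few lines of Python. The function name `sigmoid` and the two-branch form are choices made here: both branches compute the same value as $\frac{1}{1 + e^{-z}}$, and the split only avoids overflow when `z` is a large negative number.

```python
import math


def sigmoid(z):
    """Map any real number z into the range 0 to 1."""
    if z >= 0:
        # Direct form of 1 / (1 + e^(-z)); e^(-z) is at most 1 here.
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form e^z / (1 + e^z), which avoids computing a huge e^(-z).
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


if __name__ == "__main__":
    for z in (-5.0, 0.0, 5.0):
        print(z, sigmoid(z))
```

An input of 0 lands exactly on 0.5, large positive inputs approach 1, and large negative inputs approach 0, which is what lets a classifier read the output as a probability.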

# Categories of Analytics

Hello World,

This is a high-level article geared toward a general audience! I’ve been thinking about the types of customer engagements I have been doing lately and decided to break them down into 4 categories: Descriptive, Predictive, Prescriptive and Actuated Analytics.