Machine Learning Under the Hood: Separating Signal from the Noise.

All data is a combination of signal and noise. Signal represents the valuable, consistent relationships that we want to learn. Noise is the random correlations and quirks that will not occur again in the future. Together, signal and noise take on familiar patterns or shapes that we can use to build a model.

Models can consider varying degrees of signal and noise. On one end of the spectrum is the non-model, which disregards both signal and noise. Consider this common upsell question.

“Would you like fries with that?” This approach requires no model. It disregards signal and noise, rigidly canvassing everyone regardless of who they are or what they ordered:

“I’ll have a salad, hold the dressing. And a bottled water.”

“Would you like fries with that?”

“Carbs? Uh, no thank you.”

On the other end of the spectrum are flexible solutions that consider both signal and noise. Their predictions are influenced by every piece of data, including outliers, which can (and do) skew outcomes. These solutions can be more damaging than using no model at all. Such models lose predictive ability on new data because they are too tightly bound to their original training data. This is called overfitting.
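To make overfitting concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the signal (y = 2x), the alternating noise, and the two toy models (a straight-line fit and a "memorizer" that repeats the answer of the nearest training point). It is not how any particular library works; it only shows the pattern.

```python
# Toy illustration of overfitting. All numbers are invented for this sketch.

# Training data: the signal is y = 2x, plus noise that alternates +3 / -3.
train_x = [0, 1, 2, 3, 4, 5, 6, 7]
noise = [3, -3, 3, -3, 3, -3, 3, -3]
train_y = [2 * x + n for x, n in zip(train_x, noise)]

# New data the models have never seen (kept noise-free to keep the sketch simple).
new_x = [x + 0.4 for x in train_x]
new_y = [2 * x for x in new_x]


def fit_line(xs, ys):
    """Simple model: a least squares straight line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_memorizer(xs, ys):
    """Overly flexible model: repeat the answer of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mean_abs_error(model, xs, ys):
    """Average distance between the model's predictions and the actual values."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

memorizer_train_error = mean_abs_error(memorizer, train_x, train_y)
memorizer_new_error = mean_abs_error(memorizer, new_x, new_y)
line_train_error = mean_abs_error(line, train_x, train_y)
line_new_error = mean_abs_error(line, new_x, new_y)

print("memorizer:", memorizer_train_error, "on training,", memorizer_new_error, "on new data")
print("line:     ", line_train_error, "on training,", line_new_error, "on new data")
```

The memorizer scores a perfect zero error on the data it was trained on, because it learned the noise along with the signal. On the new data it does worse than the plain straight line.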

This post is a continuation of my last post, where I went over how machine learning fits into the scope of data science. This post goes a step further to talk about how we use machine learning to separate the signal from the noise. There are many machine learning algorithms. Think of them as tools in a toolbox. Data scientists use these pre-built algorithms to tease models from their data sets.

A handful of these tools are based on classical statistical methods, which makes them easy to interpret. If the model is being used to help a human make decisions, it’s a good idea to develop the model with these classical methods:

  • Linear regression
  • Logistic regression
  • K-means clustering
  • K-nearest neighbors
  • Hierarchical clustering
  • Naive Bayes

There are more options if there is no human involvement in the decision process. Think of Netflix making a recommendation: no one reviews the recommendation before you get it. If there is no human element, or if there is a high tolerance for opaque black box methods, there is an additional group of modern machine learning algorithms:

  • Random Forest
  • LASSO
  • Hidden Markov Models
  • Support Vector Regression
  • Artificial Neural Networks
  • Apriori Algorithms

Discussing each of these is outside the scope of this post. But if you are interested in learning more, make sure to subscribe to my email list for data savvy professionals and get a copy of “Bull Doze Thru Bull.”

Takeaway. If you are evaluating a machine learning project, a great question to ask is which algorithms were used. LASSO & Random Forest are as close to out-of-the-box, all-purpose tools as you will find, so they are quite common. The classical methods are a conservative choice. The modern machine learning methods are really black box solutions, which means the team probably tried all of them and went with the tool that performed best in testing.

What’s the difference between business intelligence and business analytics?

So . . . are you a BI guy?

I get that often and the answer is yes, yes I am. In actuality the answer is, “well . . . sort of. Not really. It’s more machine learning.” If I get into it, there are a variety of questions that start with: “What’s the difference between …”

What’s the difference between business intelligence and machine learning? Even if you Google these terms it’s hard to find a good definition. You will certainly find definitions, but not meaning and context for a never-ending list of terms in a jargon-rich field.

Quoting a Google employee, “Everything at the company is driven by machine learning.” What does that mean exactly? Is that big data? What about data mining? How does that fit in? Is this all just fancy jargon for old school econometrics and statistics?

In the next three minutes I am going to take on the job of getting you up to speed on what all these terms mean in relation to each other. It isn’t enough to have a list of definitions; you need to understand context. That is what I will give you here. Context.

What Business intelligence is . . . and isn’t.

When you think about Business Intelligence you might confuse it for Business Analytics. Business Intelligence runs the business. Business Analytics changes the business. Intelligence directs process. Analytics directs strategy. Intelligence focuses understanding for today. Analytics focuses planning for tomorrow.

BI is real time access to data. Reporting. BI identifies current problems, solutions, and enables informed decision making. Business Analytics explores data: statistical analysis, quantitative analysis, data mining, predictive modelling among other technologies and techniques to identify trends and make predictions. But the two areas are merging as evidenced by these headlines:

  • 5 Ways Machine Learning Can Make Your BI Better
  • Machine Learning: The Real Business Intelligence
  • Machine Learning: The Future of Business Intelligence.
  • Big Data & BI Trends 2017: Machine Learning, data lakes, and Hadoop Vs Spark

How Does Machine Learning Relate to Business Analytics? What is it?

I’ve heard machine learning described as the brains behind AI. Machine learning is the subfield of computer science that gives "computers the ability to learn without being explicitly programmed." I think of Machine Learning as a collection of pre-built algorithms for building models to predict future outcomes. Business analytics is about using those models on the execution side, putting insight into context and making things happen. In my last post I talked about the difference between data science and data savvy. Business analytics requires data savvy while machine learning is a component of data science.

What is Data Science?

Data Science deals with structured and unstructured data. In principle, everything that relates to data cleaning, preparation and analysis lies within the scope of Data Science. Data science is interdisciplinary, requiring training in statistics, computer science, and industry. Solo practitioners with specialization in all three areas are rare, so it is common to have data science teams: a data savvy manager, an econometrician, & a developer trained in machine learning.

Traditional Research. If you know anything about analytics (or statistics) you are probably familiar with regression: “ordinary least squares”. If not I highly recommend reading the book, Freakonomics. Regression is a mathematical way of drawing a straight line that most closely fits a scatterplot of data.
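If "drawing a straight line that most closely fits" sounds abstract, here is a minimal Python sketch of ordinary least squares for a single input variable. The function name and the five data points are my own placeholders, made up for illustration.

```python
# Ordinary least squares for one input variable, using the closed-form solution.
# The five (x, y) points are made up for illustration.

def ordinary_least_squares(xs, ys):
    """Return (slope, intercept) of the line that minimizes squared vertical distances."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope: how x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = ordinary_least_squares(xs, ys)
print(f"y = {intercept:.1f} + {slope:.1f} * x")  # prints "y = 2.2 + 0.6 * x"
```

No other straight line gets a smaller total of squared vertical distances to these five points. That is all "best fit" means here.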

Regression is the basis for econometrics, which is squarely found in the arena of traditional research. As you can see on the Venn diagram, traditional research blends classical statistics with industry knowledge. The emphasis of traditional econometrics is to use statistical tools to determine causal relationships in data. An econometrician wants to be able to tell why something is happening in the data. They want to tell a story about why you see correlations. And they do that using different variations of the regression technique.

Software development. We all have apps that make our lives easier & more entertaining. A relative few are lucky enough to earn a living developing and/or supporting software, including SaaS (Software as a Service). Traditional software development makes processes more efficient. Most development work revolves around this. This field requires both computer science (coders) and industry knowledge. This space is marked by partnerships between clients and their SaaS providers.

Developers will spend a lot of time and resources understanding their client's existing process to build solutions around industry best practices.

Machine Learning. Which brings us back to machine learning, which is probably not as familiar as software development or traditional research. When a developer uses machine learning, what does that look like? It starts with a dataset. As is the case with traditional research, the first step is to prepare the data for analysis. Data prep (also called munging or data wrangling) is the most time-intensive step. The second step is to separate the data into two parts. Two thirds to 70% of the data is used for training the model.

One third to 30% is saved for testing the model. A machine learning modeler has a variety of tools at their disposal to build a model of relationships based on the training data. The modeler will then make predictions about the test data based on this model. The more accurate the predictions, the better the model.
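The split-train-test routine described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the dataset is made up (y = 3x + 1 with no noise), the helper names train_test_split and fit_line are placeholders I wrote for this post, and a straight-line fit stands in for whatever algorithm a real project would use.

```python
import random

# Made-up dataset: 100 rows of (x, y) where y = 3x + 1 exactly.
rows = [(x, 3 * x + 1) for x in range(100)]


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows, then hold out test_fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Stand-in for 'the model': a least squares line fit on the training rows."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


train, test = train_test_split(rows)
slope, intercept = fit_line(train)

# Score the model only on rows it never saw during training.
test_error = sum(abs(intercept + slope * x - y) for x, y in test) / len(test)
print(len(train), "training rows,", len(test), "test rows, test error:", test_error)
```

Because the made-up data has no noise, the test error comes out at essentially zero. With real data it won't, and that gap between training and test performance is exactly what the held-out rows are there to measure.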

Summary

At this point you should have a clear idea of what data science is: a blending of machine learning, traditional research, and software development to create predictive models. In contrast, BI focuses on dashboards and reporting for the here and now. BI focuses on process. DS focuses on strategy. Data science requires a variety of advanced skill sets, which makes data science teams quite common (and full stack data scientists quite rare). Business analytics, on the other hand, requires data savvy: a survey level understanding of data science topics with the purpose of putting these models to good use by executing on business goals.

Among the topics a data savvy professional should be familiar with is machine learning, along with the strengths and limitations of its more common algorithms. If you are interested in learning more, make sure to subscribe to my email list for data savvy professionals and get a copy of “Bull Doze Thru Bull.” In my next post we are going to explore these topics and get under the hood with machine learning.

Are You Indispensable?

Seth Godin's Linchpin is for you. Your boss. Your team.

Linchpin is about leading, and change, and fear, and success.

You couldn't write this book ten years ago, because ten years ago, the economy wanted you to fit in. It took care of you . . . if you fit in. Now, the world wants something different. This book exposes a multi-generational conspiracy to sap your creativity.

What if you learned a different way of seeing?
A different way of giving?
A different way of making a living?
What if you could do it, without leaving your job?
(Or joining a network marketing scheme.)

A way . . . to contribute your true self and your best work.

Are you up for that?

Data Savvy Managers Have 6 Skills Tech StartUps Look For.

McKinsey says there will be a shortage of data skills in 2018. McKinsey predicts a shortfall in meeting the demand for 1.5 million data savvy managers. Savvy managers can make use of data on the execution side, putting insight into context and making things happen.


A major hurdle to iterating and improving strategic data driven decision making is people. Data analytics is pretty straightforward; i.e. math is just that, math. It's people (humans) that are the problem. Which means people (could that be you?) are the solution. Data science relies heavily on statistical computing. Scripts and math. Algorithms. If (1) you start with good data and (2) you have a competent data scientist conduct and interpret the analysis, you still need (3) to put those results into context and make something happen. Someone has to do it! Teams (doers) need to execute on insights.


Here are six skills tech startups are looking for in a data savvy manager:

Listen. Understand the problems your team and your senior & mid-level managers are facing.


Ask great questions. Frame the problem into a set of questions that, if answered, direct action. Understand (& communicate) that decisions must be made once these questions are answered.


Understand data science. Take a survey level course on data science. LinkedIn Learning offers a course that you can get through in an afternoon. When you understand the process you can ask actionable questions that lend themselves to be answered with a data model.


Evaluate alternatives. Data often suggests multiple approaches; assemble the right team that can prioritize them.


Acknowledge and mitigate bias. Team members have (and use) inherent bias. Teams that manage GroupThink will naturally make better evaluations.


Catalyze change. Communicate and empower decisions throughout the organization. Build the architecture needed for changes to take place.

These six skills are crucial to developing processes that:

(1) generate meaningful questions

(2) pose those questions effectively

(3) build understanding around data driven decisions

(4) create a culture that can implement those decisions.

Data Science requires rare (specialist) qualities:

(1) an ability to take unstructured data and find order, meaning, and value.

(2) Deep analytical talent.

Data savvy doesn't. To be a generalist, a data savvy manager, you don't need to be a math expert.

Learn more @ www.assume-wisely.com/data-savvy-manager

Biannual Bibliothon 2017 Blogger Book Tag

By Rho Lall​

1. What are you planning to read for the Summer Biannual Bibliothon?

2. What is your favorite genre to read in Summer?

Most of what I read could be pejoratively labeled "the gospel of success". I like business non-fiction.

3. Where is your favorite place to read in the summer?

I prefer reading paperbacks by the pool or at the beach.

4. What is your favorite challenge done in the Summer Biannual Bibliothon?

Exercising my first amendment right to read a banned book.

5. What fictional character would you hang out with in the summer if you could?

Come back to me on this one.

6. What are your plans for summer?

Going to N.Y.C. & D.C.

7. Do you have a summer reading playlist? If not, what would be on it?

I am re-reading my list of the best business books ever for working professionals.

8. What is your favorite summer movie?

Live Free or Die Hard. The helicopter scene is the best!

9. What book do you read every summer? If not, what thing do you do every summer?

B.B.Q. I smoke meat.

10. What other book tags are you planning to do this summer?

None.

5. What fictional character(s) would you hang out with in the summer if you could?

That is a hard one. I don’t read a lot of fiction, so the list of potentials is literally Harry Potter & The Hunger Games. I am going to go with the small group of people who know what “Covfefe” means.

Why Your Salary Will Always Be Below Average

By Rho Lall

Have you been on Glassdoor lately? Maybe you’ve tried the Payscale salary calculator? Is your take home pay below average? Chances are it is. Use the calculator below and find out.

It’s ok . . . I’ll wait.

Did you check it out? Is that surprising? Before you start planning how to bring this up with your boss you might want to take a second look at that number. I'll explain.

Income follows a power law distribution.

There are two issues with this number. First you will run into trouble if you look for averages where there aren't any. Income follows a power law distribution.

What’s that?

If you have heard of Pareto’s 80/20 rule, that is a power law distribution. For income, the rule says roughly 80% of the income is earned by 20% of people. Don’t take those numbers literally.

If we plot out income (see image above) you would see a small number of people (in green) earn a disproportionately larger amount of money relative to everyone else (in yellow). 

If you try to take the average of a set of incomes (or any power law distribution), your average will wildly misrepresent the truth. It's going to underestimate a small number of people, and overestimate the majority. The average (in blue) makes it seem like higher incomes are more common than they are in actuality. Case in point:

Bill Gates walks into a bar and everyone inside becomes a millionaire . . .
 . . . on average.
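You can check the bar joke with a few lines of Python. The patrons' net worths are made up, and the billionaire's figure is a round hypothetical number, not anyone's actual net worth.

```python
import statistics

# Made-up net worths for nine bar patrons, in dollars.
patrons = [20_000, 35_000, 40_000, 50_000, 50_000, 60_000, 75_000, 90_000, 120_000]

# Hypothetical round figure standing in for a billionaire's net worth.
billionaire = 100_000_000_000

before_mean = statistics.mean(patrons)
before_median = statistics.median(patrons)

# The billionaire walks in.
after = patrons + [billionaire]
after_mean = statistics.mean(after)
after_median = statistics.median(after)

print("before:", before_mean, "mean,", before_median, "median")
print("after: ", after_mean, "mean,", after_median, "median")
```

The mean jumps from $60,000 to roughly $10 billion, so everyone in the bar is a millionaire "on average." The median only moves from $50,000 to $55,000, which is a far better description of who is actually in the room.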

So when you or someone else pulls up a report on Glassdoor and circles the average salary, it is likely not telling the whole story.

But, you might ask, what if Bill Gates doesn’t walk into the bar? What if in this bar we only have locals who all work the same job? I like where your head's at. You might be onto something. But no, you’re not.

Income follows a power law distribution even on a localized scale; it's just less noticeable. Let's look at SaaS Implementation Consultants in Provo, UT (see right). The average is $50,800. But look at the range. The low is $39K and the high is $78K. There are a few highly paid individuals driving the average up, but most consultants probably earn less than $50K. In full disclosure, I don’t know. But the point is, neither do you.


The average is not representative of this sample. Let alone the salaries that were not reported.

Implementation consultants in Provo, UT earn $50,800 on average.

Average is not the same as usual and customary.

Here is the second issue. What do you think of when I say average? When we talk averages, most people assume it's a mean. Most people would agree that average and mean are synonymous. That is not the case. An average doesn't have to be a mean. You can google the definition: a number expressing the central or typical value in a set of data, in particular:

the mode, median, or mean

When you read about an average, you could be reading about one of three different measurements. It's easy to be misled. The government reports median income. Median is the middle number: 50% earn above median, 50% below. But what if I want to know what salary is usual and customary? What do most people make? This is the mode. If you want to get a sense of where the bulk of the power law distribution falls, rather than its long tail, the mode would work best. It will tell you what the most common salary is. That could be useful.
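Here is a small Python sketch of the three averages side by side. The salaries are made up (in thousands of dollars) and rounded so that values can repeat; with real salary data you would usually need to round or bucket the numbers before a mode means anything.

```python
import statistics

# Made-up salaries in thousands of dollars, rounded so that values can repeat.
salaries = [39, 42, 42, 42, 45, 47, 48, 52, 60, 78]

salary_mode = statistics.mode(salaries)      # the most common value
salary_median = statistics.median(salaries)  # the middle value
salary_mean = statistics.mean(salaries)      # the sum divided by the count

print("mode:", salary_mode, "median:", salary_median, "mean:", salary_mean)
```

Same data, three different "averages": a mode of 42, a median of 46, and a mean of 49.5. The couple of high earners at the end pull the mean up the most and the mode not at all.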


The lesson:

Don’t hang your hat on average salary. First, averages don’t fit the data very well. You can take the average, but that doesn’t mean you should. Second, when you see an average, take steps to learn what kind of average it is. Personally, I find the bookends, the high and low values of a range, to be more useful.

Do you want to learn more? If you are a SaaS professional who struggles with aligning your team & getting to the truth, then you have come to the right place. Find out how to use averages, bookends, and other KPIs to make better use of your data so you can . . .

Confront The Deluge of Information.

Perfect for people that want to become leaders! You don’t have to be an expert math person to be data literate - Download the FREE report.

Why would you want to learn to “Bull Doze Through Bull Sh*t”?​

  • Would you benefit from a deeper knowledge from your data? Probably.
  • Do statistics and data analysis intimidate you? It intimidates most people.
  • Do you want to be able to make use of all the data you have access to, so that you can make better business decisions? Of course you do!

Stop letting your fear of “number crunching” keep you from learning what is actually true. Sign up for my newsletter, and download my FREE Report on making sense of data without becoming a math expert!

What Makes The Difference

by Rho Lall​

Hey:

I want to share a story with you from my experience at BYU. It was a memorable lesson for me, and I hope you’ll get something out of it as well.

At BYU, it’s common for alumni to return as guest lecturers. In this case, one of the men in the story had returned to share his experience with us.

The story starts on a late spring afternoon, twenty-five years ago, the day two young men graduated from BYU. These men shared similar qualities. Both were better than average students, both were personable, both were returned missionaries, and both were filled with dreams and ambition for the future.

Recently, these two men returned to college for their 25th reunion.

They are still very much alike. Both happily married. Both have four children. And both, it turns out, work for the same bank.

But there is one big difference between them.

One of the men is a mid-level manager of a small department of that company. The other is president of his division.

What Made The Difference?

Why was it that one of the men was a division president, and not just a mid-level department manager like the other? What makes this kind of difference in people’s lives? Do you wonder sometimes?

It usually isn’t just native intelligence, talent, or dedication. It definitely isn’t that one person wants success, and the other doesn’t.

The difference lies in what each person knows, and how he or she makes use of that knowledge.

The man who became the division president simply made better decisions. Over time, that led to him getting more responsibility, and the status and pay that come with it.

Our world is filled with information, and almost everyone has access to it. Your ability to make sense of that data, and to use it to make good decisions, is the best way for you to set yourself apart.

Why You Shouldn’t Grade Employees’ Performance on a Curve

by Rho Lall​

If you haven't already, I highly recommend reading "Managing Your Processes Using Averages May Be Hazardous to Your Company’s Health" from my ebook, Bull Doze Thru Bull Sh*t. And if you have questions feel free to ask. Really.

Here are a couple additional power tips:

If you remove the top ten percent of a power curve you are left with . . . a power curve.

That means you can split power distributions into leagues. In middle school, for example, I was captain of the Jr. Varsity Soccer team. I could have played varsity (meaning I could have sat on the bench for the season). My coach knew I would rather play. I felt successful as captain because relative to my JV peers I outperformed. I was happier. I contributed more in the JV league than I would have in the varsity league. You can create similar results for your team.

Another point to consider: performance is dynamic. Take the time to find the areas where you outperform. Take the time to find the areas where your team members outperform. I'd rather have a team of out-performers that excel across a variety of areas than a team of individuals competing against each other in one narrow area.

If you would like to better understand power curves, then check out Bull Doze Thru Bull Sh*t.

You can get it for FREE, just click here.

#1 Best Tip to Improve Your KPI Dashboard

By Rho Lall

I Hate Averages. And You Should Too!

 

Hate is a strong word. But I do hate seeing averages used as KPIs. The problem is they are so prevalent. The only practice more prevalent is reporting on raw totals: We did this much in sales, we worked this many hours, etc. etc. (See my Key Performance Indicators PDF for a set of great examples.) Averages are terrible:

One. There are better KPIs that communicate more meaningful information.

Two. You can be taken advantage of when you rely on averages.

Did I Tell You About The Time I Almost Dated A Model?

I asked a girl for her number. She was clearly out of my league and she let me know it. I responded that she was acting like a ten when she was clearly a seven. She agreed! Then she started in on herself about how she needed a nose job. Her error? Only comparing herself to other models (not all women). She blew her nose out of proportion (double pun intended). I got her number (And didn't use it). The lesson. Don't be taken advantage of.

There are better options.

Why Averages Perform Below Average In Your KPIs.

 

Out of a group of two hundred KPIs, I have researched the seven top KPIs for Professional Service firms. None of the seven come from taking averages. Six of them are ratios (and the seventh can be). Isn't that interesting? So what is so great about ratios?

Ratios reveal trends and make large numbers easier to digest.

Ratios provide indicators of organizational performance.

Ratios allow me to compare apples to oranges.

 

Three Keys To Understand And Use Ratios.

 

First, ratios can be confusing because we were never taught to use ratios in a professional setting. We learned basic fractions. A half or a quarter is an intuitive number. I know what that looks like. I can imagine a pie which gives a fraction meaning. But if a ratio comes out to be 1.09, that is not intuitive. Is it 109%? 92%? Or something else altogether?

Comment below on which you think is right.

 

Second, not every ratio is great. But the great ones compare two opposing metrics. Let's look at one of my top seven Professional Services KPIs. Revenue Per Employee. If you are in business then revenue is a positive. More revenue is better. More people isn't necessarily better. This ratio reflects the sensitivity between these two metrics. More revenue will drive the ratio up. More employees will drive it down. More employees will only drive the ratio up if synergies increase revenue at a greater rate. This ratio simplifies the relationship between revenue and employees down to a number. It also lets me compare two companies that are drastically different in terms of size and revenue.
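To see how the two opposing metrics push against each other, here is a minimal Python sketch. Both companies and all of their numbers are hypothetical, made up only to show the arithmetic.

```python
def revenue_per_employee(revenue, employees):
    """Annual revenue divided by headcount."""
    return revenue / employees


# Two hypothetical firms that are drastically different in size.
small_firm = revenue_per_employee(2_400_000, 12)        # 200,000 per employee
large_firm = revenue_per_employee(150_000_000, 1_000)   # 150,000 per employee

# The small firm hires three people (12 -> 15 employees, a 25% increase).
# Revenue grows 12.5%, slower than headcount, so the ratio falls.
after_modest_growth = revenue_per_employee(2_700_000, 15)   # 180,000

# Revenue grows 37.5%, faster than headcount, so the ratio rises.
after_strong_growth = revenue_per_employee(3_300_000, 15)   # 220,000

print(small_firm, large_firm, after_modest_growth, after_strong_growth)
```

In this made-up comparison the small firm brings in more revenue per employee than the large one, even though its total revenue is a tiny fraction of the size. That is the apples-to-oranges comparison a ratio makes possible.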

 

Third. When I first started learning KPIs I spent a lot of time memorizing definitions. I tried to wrap my head around them. It was hard. I re-learned grade school fractions on Khan Academy because I thought it would help. It didn't. The memorization didn't either. For every new KPI I had to memorize a new definition. Don't waste time memorizing definitions!

There is a better way.

 

Next time . . . 

In my next blog I am going to teach you a very simple visual aid that has helped me break down ratios so I don't have to memorize definitions. Subscribe to my blog so you do not miss it! You might as well pick up my Key Performance Indicators PDF as well. It's free!

I'm shooting to have it out in about a week.

Insider Access Before Everyone Else!

Assume Wisely provides uncommon sense on how to make better decisions.

Get a free weekly update with exclusive content. No Spam, ever!

I Got Fired Because I Made A Huge Mistake. What Now?

By Rho Lall

There's an urban legend about a young executive sitting in Tom Watson Jr.'s office just waiting to be fired.

The year is 1968. The exec works at IBM. His boss is Tom Watson Jr., a leader of the information revolution. The issue at hand: a series of mistakes costing several million dollars. These mistakes led to him sitting across from Watson waiting for summary dismissal.

 

"I suppose after that set of mistakes you will be wanting to fire me."

 

Watson's response is now part of MBA canon/lore:

 

"Not at all young man, we have just spent a couple of million dollars educating you."

 

Remember that the next time you make a mistake at work. But even if they fired you, the moral is to learn the lessons.

Source: Balanced Scorecard Diagnostics: Maintaining Maximum Performance.

 

8 Lessons Learned When I Lost My Job Because I Made A Huge Mistake.

 

1. A lot of people get fired. Don't feel bad.

2. Verify what legally happened. Employers will "lay off" employees to limit the risk of wrongful termination suits. Firing for cause is also a pain in the ass that requires a lot of documentation. If it comes up during the interview, keep your answer brief and to the point: "I worked at (that employer) for X years. I was responsible for . . ."

3. Ask for a letter of recommendation. It can hurt your pride, but not much else. A letter of recommendation makes calling for references unnecessary. If not from your supervisor, you can get one from other people in the company.

4. Never lie to a prospective employer.

5. With respect to Unemployment Insurance Benefits, do your homework. The rules, and how they are carried out, vary from state to state.

6. Don't be a victim. It takes two to tango. If your side of the story is a version of "It's all their fault", you are only fooling yourself. Advocate for yourself by avoiding blame and being objective.

7. Learn from the experience. Take time for yourself to write down what you have learned from the experience. Share (even if it is only with yourself) the wisdom about yourself and your abilities that you have gained from this experience.

8. Remember, you are not Justine Sacco.

 

Justine Sacco used to work as the PR Director for InterActiveCorp (IAC). In 2013 she tweeted to a following of under 200 people,

 

 

Then she boarded her flight and turned off her phone for the eleven-hour flight. When she turned it back on there was a message from her friend, "You need to call me right now! You are the number one worldwide trending topic on Twitter."

Jon Ronson does a fantastic job of telling her story:

 

 

Ronson doesn't finish the story. The story ends where it began, with the Gawker editor who first published Sacco's tweet, Sam Biddle:

Justine Sacco has a PR job she enjoys now, but she deserves the best and biggest PR job, whatever that may be. Give it all to her. Justine Sacco is the most qualified person in her entire field. She has the expertise of ten lifetimes when it comes to dealing with bad press. She survived a genuine personal crisis. She's unkillable, and smart, and she will tell you to shut up, idiot, it can't get any worse.

Learn from Justine. Learn from the experience.
