Archive for the ‘Book’ Category

I just recently re-read “Programming Collective Intelligence: Building Smart Web 2.0 Applications,” by Toby Segaran, as part of my ongoing research into developing a more systematic view of crowdsourcing enterprise architectures. Part of a larger O’Reilly book series (e.g., Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites; Think Complexity: Complexity Science and Computational Modeling; Machine Learning for Hackers; etc.), this text “… takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general — all from information that you and others collect every day.” This book explains:

  • Collaborative filtering techniques that enable online retailers to recommend products or media
  • Methods of clustering to detect groups of similar items in a large dataset
  • Search engine features — crawlers, indexers, query engines, and the PageRank algorithm
  • Optimization algorithms that search millions of possible solutions to a problem and choose the best one
  • Bayesian filtering, used in spam filters for classifying documents based on word types and other features
  • Using decision trees not only to make predictions, but to model the way decisions are made
  • Predicting numerical values rather than classifications to build price models
  • Support vector machines to match people in online dating sites
  • Non-negative matrix factorization to find the independent features in a dataset
  • Evolving intelligence for problem solving — how a computer develops its skill by improving its own code the more it plays a game
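To give a feel for the style of the book's collaborative filtering material, here is a minimal sketch of similarity-based recommendation using Pearson correlation. The ratings data, user names, and item names below are invented for illustration; they are not from the book.

```python
from math import sqrt

# Illustrative user -> item ratings (made-up data, not from the book)
ratings = {
    "alice": {"Movie A": 4.0, "Movie B": 3.5, "Movie C": 1.0},
    "bob":   {"Movie A": 4.5, "Movie B": 3.0, "Movie C": 1.5, "Movie D": 5.0},
    "carol": {"Movie A": 1.0, "Movie B": 4.0, "Movie C": 4.5},
}

def pearson(prefs, a, b):
    """Pearson correlation over the items two users have both rated."""
    shared = [item for item in prefs[a] if item in prefs[b]]
    n = len(shared)
    if n == 0:
        return 0.0
    sum_a = sum(prefs[a][i] for i in shared)
    sum_b = sum(prefs[b][i] for i in shared)
    sum_a2 = sum(prefs[a][i] ** 2 for i in shared)
    sum_b2 = sum(prefs[b][i] ** 2 for i in shared)
    sum_ab = sum(prefs[a][i] * prefs[b][i] for i in shared)
    num = sum_ab - (sum_a * sum_b / n)
    den = sqrt((sum_a2 - sum_a ** 2 / n) * (sum_b2 - sum_b ** 2 / n))
    return num / den if den else 0.0

def recommend(prefs, user):
    """Score unseen items by similarity-weighted ratings of other users."""
    totals, sim_sums = {}, {}
    for other in prefs:
        if other == user:
            continue
        sim = pearson(prefs, user, other)
        if sim <= 0:  # ignore dissimilar users
            continue
        for item, rating in prefs[other].items():
            if item not in prefs[user]:
                totals[item] = totals.get(item, 0.0) + rating * sim
                sim_sums[item] = sim_sums.get(item, 0.0) + sim
    return sorted(((totals[i] / sim_sums[i], i) for i in totals), reverse=True)

print(recommend(ratings, "alice"))  # alice's top pick comes from similar user bob
```

The same pattern (compute a similarity, weight other users' ratings by it) underlies most of the recommendation examples Segaran works through.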

I had forgotten just how good this book is at making the complex subject matter of machine-learning algorithms understandable to crowdsourcing practitioners. Segaran takes complex algorithms and makes them easy to understand, mostly through examples that are directly applicable to web-based social interactions. Even in 2012, this 2008 text is still a must-read for those building commercial-grade crowdsourcing solutions.

Read Full Post »

How did Robert Ballard find the Titanic? Most people think it was by looking for it. Well, most people would be wrong. Ballard believed he could rediscover the Titanic by looking for the debris field created when the ship sank. With the Titanic being only around 900 feet long, he hypothesized that ship parts would be spread out much wider the farther one was from the ship, narrowing like a funnel the closer one got. In essence, this much larger historical debris field would point the way to the much smaller artifact of interest, the Titanic.

Every physical object leaves some trace of its interaction with the real world over time. Everything. Whether it is the Titanic plunging to her death in the depths of the Atlantic Ocean or a lonely rock sitting in the middle of a dry desert lake bed, everything leaves a trace; everything has a Historical Debris Field (HDF). Formally,


Definition: A Historical Debris Field (HDF) is any time-dependent perturbation of an object and its environment.

One of the key points is that it is an observation over time, not just a point in time. HDFs are about capturing the absolute historical changes in the environment in order to make relative projections about some object in the future.

As it turns out, just as physical real-world objects leave historical debris fields, so does data through its virtual interactions in cyberspace. Data, by definition, is merely a representative abstraction of a concept or real-world object, and is a direct artifact of some computational process. At some level, every known relevant piece of electronic information (these words, your digital photos, a YouTube video) boils down to a series of zeros (0) and ones (1), strung together by a complex series of implicit and tacit interacting algorithms. These algorithms are in essence the natural, often unseen forces that govern the historical debris seen in real-world objects. So, the HDF for cyberspace might be defined as,

Definition: A Cybernetic Historical Debris Field (CHDF) is any time-dependent perturbation of data and its information environment (information being relevant data).

Why is this lengthy definitional exposé important? Because big data represents the Atlantic Ocean in which a company is looking for opportunities. And like Robert Ballard’s search for the Titanic, one cannot merely set out looking for a piece of insight or knowledge itself in the vastness of all that internal/external structured/unstructured data; one needs to look for the Cybernetic Historical Debris Fields that point to the electronic treasure. But what kind of new “virtual sonar” systems can we employ to help us?

While I will explore this concept more over time, let me suggest that the “new” new in the field of data mining will be in coupling data scientists (DS) with behavioral analysts (BA). Data changes because, at the core, some human initiated a change (a causal antecedent). It is through a better understanding of human behavior (patterns) that we will have the best chance of monetizing the vastness of big data. Charles Duhigg, author of “The Power Of Habit: Why We Do What We Do in Life and Business,” shows that by understanding human nature (aka our historical debris field) we can accurately predict a behavioral-based future.


For example, Duhigg shows how Target tries to hook future parents at the crucial moment before they turn into loyal buyers of pre/post-natal products (e.g., supplements, diapers, strollers, etc.). Target, specifically Andrew Pole, determined that while lots of people buy lotion, women on baby registries were buying larger quantities of unscented lotion. Also, at about twenty weeks into their pregnancies, women would start loading up on supplements like calcium, magnesium, and zinc. This CHDF led the Target team to one of the first behavioral patterns (a virtual Titanic sonar pattern) that could discriminate (point to) pregnant from non-pregnant women. Not impressed? Well, this type of thinking contributed to Target’s $23 billion revenue growth from 2002 to 2010.
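The Target anecdote is, at heart, a purchase-pattern flag. A deliberately naive sketch of the idea (the signal products, thresholds, and shopper basket below are all invented for illustration; Target's actual model was far more sophisticated):

```python
# Hypothetical signal products, in the spirit of the Duhigg/Target anecdote.
PREGNANCY_SIGNALS = {"unscented lotion", "calcium", "magnesium", "zinc"}

def pregnancy_score(basket):
    """Fraction of known signal products present in a shopper's basket."""
    if not PREGNANCY_SIGNALS:
        return 0.0
    return len(PREGNANCY_SIGNALS & set(basket)) / len(PREGNANCY_SIGNALS)

# Made-up shopper basket: 3 of the 4 signal products are present.
shopper = ["unscented lotion", "calcium", "zinc", "bread"]
print(pregnancy_score(shopper))  # prints 0.75
```

The point is not the arithmetic but the shape of the analysis: a behavioral analyst proposes the signal set, and a data scientist scores the debris field against it.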

The net of all this is that data can be monetized by systematically searching for relevant patterns (cybernetic historical debris fields) in big data based on human patterns of behavior. There are patterns in everything, and just because we don’t see them doesn’t mean they don’t exist. Through data science and behavioral analysis (AKA Big Data), one can reveal the behavioral past in order to monetize the future.

Read Full Post »

Business Model You (BMY), written by Tim Clark in collaboration with Alexander Osterwalder and Yves Pigneur, has just been released and is a must read/apply for anybody looking for a personal logic through which to sustain themselves financially. It takes the proven business modeling capabilities of Business Model Generation and applies them on a personal level, AKA to you and me. Side Bar: Tim… Please make this book available electronically to those of us who use Kindle, Nook, etc.


When I originally read the BMG book last year, I wrote about its value as well as the potential value of using the methodology on a personal level. Based on some of the exercises that BMY recommends, I have since updated my own model, Dr. Jerry A. Smith V2.0:

Dr. Jerry A. Smith BMG canvas

This new BMY is for my work as an IT strategist, designed to help companies grow their business (revenue, margin, customers) through technology- and social-media-driven capabilities. While not perfect (nothing ever is), it has helped me focus on those activities, resources, and partnerships needed to better help my clients.

Read Full Post »

Thinking, Fast and Slow

Jim Holt (NY Times) wrote a great review of “Thinking, Fast and Slow,” by Daniel Kahneman. If you are interested in the psychology of decision making and executive functioning, this article and book are for you.

For those who still have some doubt, here are a few excerpts from Holt’s review that might pique your interest:


Background: Consider for a moment Linda, who is single, outspoken and very bright, and who, as a student, was deeply concerned with issues of discrimination and social justice.

Question: Which was more probable:

  1. Linda is a bank teller
  2. Linda is a bank teller and is active in the feminist movement

If you are like most people in Daniel Kahneman’s experiment, your response was (2); in other words, that given the background information furnished, “feminist bank teller” was more likely than “bank teller.” This is, of course, a blatant violation of the laws of probability.
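Why it is a violation: a conjunction can never be more probable than either of its conjuncts, since every "feminist bank teller" is also a "bank teller." This can be checked by brute force over a toy event space (the uniform weighting below is purely illustrative):

```python
from itertools import product

# Enumerate a toy world: every person either is or isn't a bank teller,
# and either is or isn't an active feminist (uniform weights, illustrative).
people = list(product([True, False], repeat=2))  # (is_teller, is_feminist)

def prob(event):
    """Probability of an event under the uniform distribution over people."""
    return sum(1 for person in people if event(person)) / len(people)

p_teller = prob(lambda p: p[0])
p_teller_and_feminist = prob(lambda p: p[0] and p[1])

# The conjunction rule: P(A and B) <= P(A), always.
print(p_teller, p_teller_and_feminist)  # prints 0.5 0.25
```

No matter how the weights are chosen, the conjunction's probability can only lose mass relative to the single conjunct; intuition about Linda's narrative says otherwise, which is exactly Kahneman's point.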

What has gone wrong here? An easy question (how coherent is the narrative?) is substituted for a more difficult one (how probable is it?). To explain this behavior, Kahneman uses two systems:

System 1 –  our fast, automatic, intuitive and largely unconscious mode. It is System 1 that detects hostility in a voice and effortlessly completes the phrase “bread and. . . . ” System 1 jumps to an intuitive conclusion based on a “heuristic” — an easy but imperfect way of answering hard questions. In general, System 1 uses association and metaphor to produce a quick and dirty draft of reality.

System 2 – our slow, deliberate, analytical and consciously effortful mode of reasoning about the world. It is System 2 that swings into action when we have to fill out a tax form or park a car in a narrow space. In the Linda problem, System 2 lazily endorses the heuristic answer of System 1 without bothering to scrutinize whether it is logical.

In general, System 1 proposes, System 2 disposes.

Just putting on a frown, experiments show, works to reduce overconfidence; it causes us to be more analytical, more vigilant in our thinking; to question stories that we would otherwise unreflectively accept as true because they are facile and coherent.

Kahneman never grapples philosophically with the nature of rationality. He does, however, supply a fascinating account of what might be taken to be its goal: happiness. What does it mean to be happy?

Read Full Post »

Aliza Sherman has an excellent new book on the dynamics of crowdsourcing, The Complete Idiot’s Guide to Crowdsourcing. While I have never been crazy about “Idiot” guides to anything, this book is far from a guide only an idiot would use. As Sherman points out, crowdsourcing leverages social networking ecosystems and tools, such as Facebook and Twitter, in order to tap into the power of many people. In this guide, Sherman explains not only the theory but also the practice of crowdsourcing, and actually shows readers how to use it.
• A practical, prescriptive guide for those who want to put the ideas in such books as The Wisdom of Crowds and Here Comes Everybody into action.
• Step-by-step instructions. 
• Insightful anecdotes from the world of crowdsourcing.

Read Full Post »