William McKnight

Seven Big Trends Driving Big Data

By William McKnight on April 4, 2011

Most companies today are already doing basic analytics.  Like a task on the project plan that no longer commands the team's full attention, basic analytics on core, structured corporate data is being marked 100% complete at many companies.

Some have moved ahead of their enterprise vendors and built the analytics in house.  Others have been on the receiving end of a vendor “push” model, since many enterprise vendors have added analytics to their packages over the last few years.

Regardless of origin, although some would say it’s more like basic reporting, scarcely a Global 2000 company would be found that does not claim some level of next-best-offer analysis, churn management, customer ranking, supply chain and product placement analysis that is based on core corporate data.  “Average order size” and the like are solved but analytics are more important than ever.  The focus is just shifting.

Most are still directing energy at improving the use of core data in these analyses, but that is not deterring them from moving forward into another level of data and, consequently, another level of analytics.  Once upper management has bought in to analytics as truly valuable, and not a black art, a second level of analytics is undertaken, and “big data” is the key.

As well as a colloquial term meaning large amounts of data, big data can be thought of as an expanding range of content now being modeled into environments to expand the company’s opportunities.

A few trends are bringing us into the era of big data:

1. Reduced Cost to Carry Data

Companies increasingly are feeling liberated by less expensive means to extend the competitive advantages of data exploitation into clicks, weblogs and sensor reads of small movements.  Factors that are triggering the liberation include past-decade exponential decreases in the cost of all computing resources and an explosion in network bandwidth.

Cloud computing can further reduce the cost to carry by making the cost an operational expense.  We are starting to see every large corporation have a private cloud or a strong partnership enabling quick public cloud deployment.  It is an active part of the enterprise.

2.  New Ideas

Analyst and system capabilities have increased to take on big data.  Even though the value of each data item is far less than what is derived from core company alphanumeric data, as long as its storage passes the ROI test, it should and eventually will be captured.  Ideas and corresponding tools are also emerging to extract useful information to help organizations learn from their data.

New analytics with potential are in the areas of predicting a non-intervened future and determining and executing the interventions.  New analytics include making the right recommendations to the customer to create a more desirable future for the company.  So far, even leading companies are just scratching the surface.  You see the beginnings of these analytics everywhere, especially if you are active on the internet.

3.  Sensor Networks

Most people don’t think of their cellphone as part of a “sensor network.”  However, the increased ubiquity and web connectivity of cellphones make them impossible to ignore as data collection mechanisms.  One day, we may be able to opt in to contributing our air quality, air pressure, noise level or other “around me” data to the companies we do business with.  Today, we contribute our location, web clicks and application usage.  Using cellphones as sensors reduces the effort and gatekeeping involved in generating big data.

4.  Machine Learning

The progress is not being metered by human analysis.  Progress is going into building systems - either commercial closed source or internal - that provide machine-learned automated analysis.

Consider Facebook, LinkedIn, Netflix and Amazon “you might likes” as well as improved spam filters.  Incorporating learning and improving algorithms are processes best handled as a combined effort of people and machines, with people filtering and summarizing the machine decisions so that decision makers can act on them intelligently.

Human ingenuity is needed now more than ever to scale decision making.  A sense of priority is also something that machine learning sometimes misses.

5.  Accumulation of History Data

The ability to use the data is outpacing the earlier goals of many organizations to “age off” the older and “less interesting” data that they have accumulated.  Analysts now routinely do longitudinal studies and will seldom allow the removal, or even the archival, of the information they use.  This data has been accumulating over the past decade or more.

6.  Hadoop

For its approach to the problem of big data, Hadoop, along with its community, stands alone.  Hadoop is an approach to processing, in parallel, large datasets that are distributed over a cluster of machines.  I could have called this trend Even More Reduced Cost to Carry Data.
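To make the idea of parallel processing over distributed data concrete, here is a minimal single-machine sketch of the map, shuffle and reduce pattern that Hadoop's MapReduce model is built around.  It does not use Hadoop or any Hadoop API.  The partitions, page names and function names are all hypothetical, and a thread pool stands in for the separate machines of a real cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical weblog partitions: each inner list stands in for the slice
# of click data held on one machine in a cluster.
PARTITIONS = [
    ["/home", "/cart", "/home"],
    ["/cart", "/checkout"],
    ["/home", "/checkout", "/checkout"],
]


def map_partition(clicks):
    """Map step: emit a (page, 1) pair for every click in one partition."""
    return [(page, 1) for page in clicks]


def shuffle(mapped):
    """Shuffle step: group the emitted values by page across all partitions."""
    groups = defaultdict(list)
    for pairs in mapped:
        for page, value in pairs:
            groups[page].append(value)
    return groups


def reduce_group(values):
    """Reduce step: combine all values emitted for one page."""
    return sum(values)


def count_clicks(partitions):
    # Each partition is mapped independently of the others, which is what
    # lets the work be spread across workers (threads here, machines in a cluster).
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_partition, partitions))
    return {page: reduce_group(values) for page, values in shuffle(mapped).items()}


if __name__ == "__main__":
    print(count_clicks(PARTITIONS))
```

The point of the pattern is that the map step never needs to see more than one partition at a time, so adding machines adds capacity without moving the raw data to a central server first.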

7.  Rise of a Data Marketplace

Third party data continues to gain relevance.  Occasionally it exceeds the size even of a company’s internally generated information when it gets into the area of transactions.  This data is available and can be used within reason because there has been a refocusing of data privacy onto controlling the use of data as opposed to the collection of data. 

Arguably, one day the ROI of big data may exceed that of core data as the competitive battlefield shifts.  No matter what business a company is in, it is now competing with its core information.  Increasingly, core information exploitation methods will proliferate, erode competitive advantages and shift the competitive focus to exploitation of big data.


About the Author

William functions as Strategist, Lead Enterprise Information Architect, and Program Manager for complex, high-volume full life-cycle implementations worldwide utilizing the disciplines of data warehousing, master data management, business intelligence, data quality and operational business intelligence.

thananjayan chinnaswamy
July 26, 2011

I believe that columnar DBs also have a major role to play in the analytics space.
