Understanding Big Data Technology (Guide)

The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s. The data is collected from a number of sources including emails, mobile devices, applications, databases, servers and other means. Data sets grow rapidly. 

This is because they are increasingly gathered by cheap and numerous information-sensing Internet of Things devices such as mobile devices, aerial (remote sensing) platforms, software logs, cameras, microphones, radio-frequency identification (RFID) readers and wireless sensor networks. This data, when captured, formatted, manipulated, stored and then analyzed, can help a company gain useful insight to increase revenues, win or retain customers and improve operations.

While the term “big data” is relatively new, the act of gathering and storing large amounts of information for eventual analysis is ages old. The concept gained momentum in the early 2000s when industry analyst Doug Laney articulated the now-mainstream definition of big data as the three Vs:

Volume: Organizations collect data from a variety of sources, including business transactions, social media and information from sensor or machine-to-machine data. The amount of data is immense. Each day 2.3 trillion gigabytes of new data is being created. While storing this amount of data would have been a problem in the past, new technologies (such as Hadoop) have eased the burden.

Velocity: Data streams in at an unprecedented speed. Both the speed of the data itself (always in flux) and the speed of processing (analysis of streaming data to produce near-real-time or real-time results) must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time.

Variety: Data comes in all types of formats – from structured, numeric data in traditional databases to unstructured text documents, email, video, audio, stock ticker data and financial transactions. 

Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on." Scientists, business executives, practitioners of medicine, advertising and governments alike regularly meet difficulties with large data-sets in areas including Internet search, fintech, urban informatics, and business informatics.


Is Big Data A Volume Or A Technology?

While the term may seem to reference the volume of data, that isn't always the case. The term big data, especially when used by vendors, may refer to the technology (including the tools and processes) that an organization requires to handle large amounts of data and storage facilities.

An example of big data might be petabytes (1,024 terabytes) or exabytes (1,024 petabytes) of data consisting of billions to trillions of records of millions of people—all from different sources (e.g. Web, sales, customer contact center, social media, mobile data and so on). The data is typically loosely structured data that is often incomplete and inaccessible.

As of 2012, 2.5 exabytes (2.5×10^18 bytes) of data are generated every day. Based on an IDC report prediction, the global data volume will grow exponentially from 4.4 zettabytes to 44 zettabytes between 2013 and 2020. By 2025, IDC predicts there will be 163 zettabytes of data. One question for large enterprises is determining who should own big-data initiatives that affect the entire organization.
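To put these figures on a common scale, the short sketch below converts the daily volume into gigabytes and works out the yearly growth rate implied by the IDC projection. It assumes decimal (SI) units, where each step is a factor of 1,000 rather than the 1,024 used earlier, and the helper name annual_growth_rate is purely illustrative.

```python
# Decimal (SI) storage units, expressed in bytes (an assumption of this sketch).
GIGABYTE = 10**9
EXABYTE = 10**18

# Figure quoted above: 2.5 exabytes generated per day (as of 2012).
daily_bytes = 2.5 * EXABYTE
daily_gigabytes = daily_bytes / GIGABYTE


def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1


# IDC projection quoted above: 4.4 ZB in 2013 growing to 44 ZB in 2020.
rate = annual_growth_rate(4.4, 44, 2020 - 2013)

print(f"{daily_gigabytes:,.0f} GB per day")
print(f"Implied growth: {rate:.1%} per year")
```

A tenfold increase over seven years works out to a little under 40 percent growth per year, which is why the projection is described as exponential.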

In recent times, additional dimensions have been introduced when it comes to big data:

Variability. In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage, even more so with unstructured data.

Complexity. Modern data sets come from multiple sources, which makes it difficult to link, match, cleanse and transform data across systems. It's necessary to connect and correlate relationships, hierarchies and multiple data linkages to prevent the numerous streams of data from quickly spiraling out of control.

This means that analyses usually have to be done on random segments of data, which allows models to be built to compare against other parts of the data. Big Data platforms and solutions provide the tools, methods and technologies used to capture, curate, store and search & analyze the data to find new correlations, relationships and trends that were previously unavailable.
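As a rough illustration of working on random segments of data, the sketch below draws a random sample from a large collection and sets part of it aside so that a model built on one part can be compared against another. The function name split_sample, the 20 percent holdout and the list of integers standing in for real records are all assumptions made for the example.

```python
import random


def split_sample(records, sample_size, holdout_fraction=0.2, seed=42):
    """Draw a random sample, then split it into a modelling set and a holdout set."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    sample = rng.sample(records, sample_size)
    holdout_size = round(len(sample) * holdout_fraction)
    cut = len(sample) - holdout_size
    return sample[:cut], sample[cut:]


records = list(range(1_000_000))  # stand-in for a large data set
train, holdout = split_sample(records, sample_size=10_000)
print(len(train), len(holdout))
```

Real big data platforms distribute this kind of sampling across many machines, but the idea is the same: build on one random segment and check the result against another.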

Let’s look at this example:

SMEs are less likely to be able to obtain bank financing than large firms; instead, they rely on internal funds, or cash from friends and family, to launch and initially run their enterprises. The current credit gap for formal SMEs is estimated to be US$1.2 trillion; the total credit gap for both formal and informal SMEs is as high as US$2.6 trillion.

Also, improving SMEs’ access to finance and finding solutions to unlock sources of capital is crucial to enable this potentially dynamic sector to grow and provide jobs.

To break that down in simple terms, let's say that the financial sector wants to know which ads work best for Small and Medium Enterprises (SMEs) on social media. Let's say there are 20,000,000 SMEs that want financing, and each has been served 100 ads from different banks. That's 2,000,000,000 events of interest, and each "event" (an ad being served) contains several data points (features) about the ad: What was the ad for? Did it have a picture in it? Was there a man or woman in the ad? How big was the ad? What was the most prominent color?

Let's say for each ad there are 50 "features". This means you have 100,000,000,000 (a hundred billion) pieces of data to sort through. That's pretty big (though arguably still not quite in "big data" territory), but you get the idea.
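The arithmetic behind this example is easy to check. The variable names below are just labels for the numbers used in the text.

```python
smes = 20_000_000        # SMEs seeking financing
ads_per_sme = 100        # ads served to each SME
features_per_ad = 50     # data points recorded per ad

events = smes * ads_per_sme              # one event per ad served
data_points = events * features_per_ad   # one data point per feature per event

print(f"{events:,} ad events")         # 2,000,000,000 ad events
print(f"{data_points:,} data points")  # 100,000,000,000 data points
```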

The goal is to figure out which features are most effective in getting SMEs that want financing to click the ads. With big data, that is possible. Now, pretty much any company with a significant tech group (Google, Twitter, Facebook, any bank or financial institution, any communications or mobile service, any energy company, etc.) is doing this kind of thing, whether to serve ads, to improve its services or to predict future growth and demand.

In the past, technology platforms were built to address either structured OR unstructured data. The value and means of unifying and/or integrating these data types had yet to be realized, and the computing environments to efficiently process high volumes of disparate data were not yet commercially available.
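Returning to the ad example, one simple way to compare features is to compute the click rate for each value of a feature. The sketch below does this on a handful of made-up events. The field names has_picture and clicked and the function click_rate_by are hypothetical, and a real analysis would run over billions of events on a distributed platform rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical ad events: one feature value plus whether the ad was clicked.
ad_events = [
    {"has_picture": True, "clicked": True},
    {"has_picture": True, "clicked": True},
    {"has_picture": True, "clicked": False},
    {"has_picture": False, "clicked": False},
    {"has_picture": False, "clicked": True},
    {"has_picture": False, "clicked": False},
]


def click_rate_by(events, feature):
    """Return {feature value: clicks / impressions} for one feature."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for event in events:
        value = event[feature]
        shown[value] += 1
        clicked[value] += int(event["clicked"])
    return {value: clicked[value] / shown[value] for value in shown}


rates = click_rate_by(ad_events, "has_picture")
for value, rate in rates.items():
    print(f"has_picture={value}: {rate:.0%} click rate")
```

In this toy data, ads with a picture are clicked twice as often as ads without one. Repeating the comparison for each of the 50 features gives a first, rough ranking of which ones matter.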


How Are Companies Benefiting From Big Data?

Large content repositories house unstructured data such as documents, and companies often store a great deal of structured information in corporate systems like Oracle, SAP and NetSuite. Today’s organizations, however, are utilizing, sharing and storing more information in varying formats, including:

  • Email and Instant Messaging
  • Collaborative Intranets and Extranets
  • Public websites, wikis, and blogs
  • Social media channels
  • Video and audio files
  • Data from industrial sensors, wearables and other monitoring devices

This unstructured data adds up to as much as 85% of the information that businesses store. Regardless of the size of your business or the industry you are in, you have Big Data. The ability to extract high value from this data to enable innovation and competitive gain is the purpose of Big Data analytics.

By conducting analytics on large sets of data, business users and executives are able to see patterns and trends in performance, new relationships between data sets and potentially new sources of revenue. Let’s look at a few scenarios where Big Data solutions have helped companies gain a competitive advantage.

Coca Cola’s Big Data Wins

Coca Cola has been a leader in the consumer packaged goods industry for over a century, and their brands are iconic. They distribute their products to a global network of retailers, have many SKUs, and must be able to predict buyer behavior to ensure they have the right inventory, run the right promotional ads in the marketplace and sponsor the right events worldwide.

Coca Cola has been able to get wins with Big Data analytics by:

  • Selecting the ideal ingredient mix to produce juice products;
  • Creating efficiencies in their warehousing, restaurant and retail supply chain operations;
  • Mining loyalty program, competitive, POS and social media data to understand buyer behavior;
  • Creating digital service centers for procurement and HR processes;
  • Leveraging a new breed of storage media to retain, process and analyze vast amounts of information.

Coca Cola’s customers are in 206 countries, a vastly diverse marketplace with tens of millions of ultimate consumers. 

Effectively managing the information relating to their clients, employees, suppliers and media assets requires effective storage, powerful indexing and search functionality, and innovative solutions to make sure information can be located and used when required. Big Data solutions have provided Coca Cola with this ability.

Netflix And Big Data

Netflix uses big data to improve customer experience. To make sure its clients keep watching its programming, Netflix is constantly analyzing trends in:

  • Program viewership;
  • The content its customers are consuming;
  • The colors of the promotional visuals of its programming;
  • Devices its clients are watching its programming on;
  • Whether a viewer watches a portion of a movie, a season of a series, or a complete series back to back in a weekend binge-watching session.

For many entertainment, technology and media organizations, Big Data analytics is the key to retaining subscribers, securing advertising revenues, and understanding the sort of content to serve as it relates to geographical locations, time of day, demographics, and opinions expressed on social media.

Big Data gives Netflix the ability to deliver the content the customer wants to see, when the customer wants it.

Southern California Edison Gives Its Customers The Power

Providing electricity to over 5.2 million customers means managing a great deal of information on usage patterns and being able to provide actual usage data as opposed to educated guesses. Smart meters and the Big Data storage and analytics systems they integrate with have provided Southern California Edison with the ability to see trends in fifteen-minute intervals instead of blocks of weeks at a time.

Consumers are demanding more transparency in their billing so that they can understand the peaks and valleys of pricing based on utilization, and utilities like SCE want to be prepared to provide a reliable power supply that meets changing demand. SCE is improving its smart grid to make better use of the data it harvests, in order to:

  • Forecast demand and utilization;
  • Identify improvements to be made in infrastructure and service delivery;
  • Provide more visibility to customers.

With Big Data, Southern California Edison gives its customers the power to view and control their electricity spend, while improving their internal ability to meet demand in a cost effective manner.
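To give a feel for what fifteen-minute visibility involves, the sketch below groups individual meter readings into fifteen-minute intervals and totals the usage in each. It is not a description of SCE's systems: the readings, the kWh values and the function usage_by_interval are all hypothetical, and the approach assumes the interval length divides evenly into an hour.

```python
from collections import defaultdict
from datetime import datetime


def usage_by_interval(readings, minutes=15):
    """Sum kWh readings into fixed-length intervals keyed by interval start."""
    totals = defaultdict(float)
    for timestamp, kwh in readings:
        # Round the timestamp down to the start of its interval.
        floored_minute = timestamp.minute - timestamp.minute % minutes
        start = timestamp.replace(minute=floored_minute, second=0, microsecond=0)
        totals[start] += kwh
    return dict(totals)


# Hypothetical meter readings: (timestamp, kWh used since the previous reading).
readings = [
    (datetime(2024, 1, 1, 8, 2), 0.10),
    (datetime(2024, 1, 1, 8, 9), 0.15),
    (datetime(2024, 1, 1, 8, 17), 0.30),
    (datetime(2024, 1, 1, 8, 44), 0.20),
]

intervals = usage_by_interval(readings)
for start, kwh in sorted(intervals.items()):
    print(start.strftime("%H:%M"), round(kwh, 2))
```

The same grouping with a larger interval would reproduce the coarser weekly view, which shows how much detail is lost when usage is only seen in large blocks.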


Big Data For All Companies

Big Data initiatives are rated as “extremely important” or “important” by 93% of companies with revenues over $250M, according to a 2014 Accenture Big Data Study. The opportunity to amass and capitalize on Big Data is available to any organization, large or small. The data likely exists already, distributed amongst a collection of internal repositories and file shares and/or external data sources.

Storing and managing large amounts of data has become more affordable and manageable, enabling organizations to take full advantage of their assets. Leveraging a Big Data analytics solution can help you unlock the strategic value of this information by allowing you to:

  • Understand where, when and why your customers buy;
  • Protect your client base with improved loyalty programs;
  • Seize cross-selling and upselling opportunities;
  • Provide targeted promotional information to your prospects and existing clients;
  • Optimize workforce planning and operations;
  • Reduce inefficiencies in the supply chain;
  • Predict market trends and future needs;
  • Become more innovative and competitive;
  • Discover new sources of revenue.

Why Will Handling Big Data Become A Core Skill?

Whatever an enterprise’s big data plans are, they should definitely be long-term ones. “The ability to handle extremely large data volumes,” predicts Yvonne Genovese, Vice President and analyst at Gartner, “will become a core skill in businesses and organizations.

Increasingly, they will be looking to use new forms of information – such as text, context, and social media – to identify decision-supporting patterns. This is what Gartner calls a Pattern-Based Strategy.” Steve Janata, a consultant with Experton Group, has reported that in 2011 companies worldwide invested 3.38 billion euros in big data projects and services.

Companies are keen to expand their traditional data sets with social media data, browser logs, text analytics and sensor data to get a more complete picture of their customers. The big objective, in many cases, is to create predictive models. You might remember the example of U.S. retailer Target, which is now able to predict very accurately when one of its customers is expecting a baby.

Wal-Mart can also predict what products will sell.

Ski resorts are even using data to understand and target their patrons. RFID tags inserted into lift tickets can cut back on fraud and wait times at the lifts, as well as help ski resorts understand traffic patterns, which lifts and runs are most popular at which times of day, and even help track the movements of an individual skier if he were to become lost.

Imagine being an avid skier and receiving customized invitations from your favorite resort when there's fresh powder on your favorite run, or text alerts letting you know when the lift lines are shortest. 

They've also taken the data to the people, providing websites and apps that display a skier's stats for the day, from how many runs they slalomed to how many vertical feet they traversed, which they can then share on social media or use to compete with family and friends.

Even government election campaigns can be optimised using big data analytics. Some believe Obama's win in the 2012 presidential election was due to his team's superior ability to use big data analytics.

Optimising Business Processes With Big Data

Big data is also increasingly used to optimise business processes. Retailers are able to optimise their stock based on predictions generated from social media data, web search trends and weather forecasts.

One particular business process that is seeing a lot of big data analytics is the supply chain, from manufacturers to consumers. Here, geographic positioning and radio-frequency identification sensors are used to track goods or delivery vehicles and optimise routes by integrating live traffic data.

Armed with insight that big data can provide, manufacturers can boost quality and output while minimizing waste – processes that are key in today’s highly competitive market. More and more manufacturers are working in an analytics-based culture, which means they can solve problems faster and make more agile business decisions. 

Retailers, on the other hand, need to know the best way to market to customers, the most effective way to handle transactions, and the most strategic way to bring back lapsed business. Big data remains at the heart of all of this. HR business processes are also being improved using big data analytics. This includes the optimisation of talent acquisition - Moneyball style - as well as the measurement of company culture and staff engagement using big data tools.

For example, one company, Sociometric Solutions, puts sensors into employee name badges that can detect social dynamics in the workplace. The sensors report on how employees move around the workplace, with whom they speak, and even the tone of voice they use when communicating.

One of the company's clients, Bank of America, noticed that its top performing employees at call centers were those who took breaks together. They instituted group break policies and performance improved 23 percent.

You may have seen the RFID tags you can attach to things like your phone, your keys, or your glasses, which can then help you locate those things when they inevitably get lost. But suppose you could take that technology to the next level and create smart labels that could stick on practically anything. 

Plus, they can tell you a lot more than just where a thing is; they can tell you its temperature, the moisture level, whether or not it's moving, and more. This part of the Internet of Things holds incredible promise for improving everything from logistics to health care.

Suddenly, this unlocks a whole new realm of "small data;" if big data is looking at vast quantities of information and analysing it for patterns, then small data is about looking at the data for an individual product - say, a container of yogurt in a shipment - and being able to know if it's likely to go off before it reaches the store.

Big data is not just for companies and governments but also for all of us individually. We can now benefit from the data generated from wearable devices such as smart watches or smart bracelets. Take the Up band from Jawbone as an example: the armband collects data on our calorie consumption, activity levels, and sleep patterns. While it gives individuals rich insights, the real value is in analysing the collective data.

In Jawbone's case, the company now collects 60 years' worth of sleep data every night. Analysing such volumes of data will bring entirely new insights that it can feed back to individual users. Another area where big data is used effectively is finding love. Most online dating sites apply big data tools and algorithms to help find the most appropriate matches.

Additionally, the computing power of big data analytics enables DNA strings to be decoded in minutes and allows us to find new cures and better understand and predict disease patterns. Just think of what happens when the individual data from smart watches and wearable devices can be combined across millions of people and their various diseases.

The clinical trials of the future won't be limited by small sample sizes but could potentially include everyone. Apple's health research framework, ResearchKit, has effectively turned the iPhone into a biomedical research device. Researchers can now create studies through which they collect data and input from users' phones to compile data for health studies.

An iPhone might track how many steps you take in a day, or prompt you to answer questions about how you feel. It's hoped that making the process easier and more automatic will dramatically increase the number of participants a study can attract as well as the fidelity of the data. 

Some hospitals, like Beth Israel, are using data collected from a cell phone app, from millions of patients, to allow doctors to use evidence-based medicine as opposed to administering several medical/lab tests to all patients who go to the hospital. A battery of tests can be efficient but they can also be expensive and usually ineffective.

Big data techniques are already being used to monitor babies in a specialist premature and sick baby unit. By recording and analyzing every heartbeat and breathing pattern of every baby, the unit is able to develop algorithms that can now predict infections 24 hours before any physical symptoms appear. That way, the team can intervene early and save fragile babies in an environment where every hour counts.

When it comes to health care, everything from babies’ records to prescription information needs to be handled quickly, accurately – and, in some cases, with enough transparency to satisfy stringent industry regulations. When big data is managed effectively, health care providers can uncover hidden insights that improve patient care.

What's more, big data analytics allows us to monitor and predict the development of epidemics and disease outbreaks. Integrating data from medical records with social media analytics enables us to monitor flu outbreaks in real time, simply by listening to what people are saying, e.g. "Feeling rubbish today - in bed with a cold".

Free public health data and Google Maps have been used by the University of Florida to create visual data that allows for faster identification and efficient analysis of healthcare information, used in tracking the spread of chronic disease.

Furthermore, most elite sports have now embraced big data analytics. There is the IBM SlamTracker tool for tennis tournaments; video analytics is used to track the performance of every player in a football or baseball game; and sensor technology in sports equipment such as basketballs or golf clubs helps players get feedback (via smart phones and cloud servers) on their game and how to improve it.

Many elite sports teams also track athletes outside of the sporting environment - using smart technology to track nutrition and sleep, as well as social media conversations to monitor emotional wellbeing.

The NFL has developed its own platform of applications to assist all 32 teams in making the best decisions based on everything from the condition of the grass on the field, to the weather, to statistics about an individual player's performance while in university. It is all in the name of strategy as well as reducing player injuries.

When it comes to personal wellbeing, one of the really cool new things around is a smart yoga mat: sensors embedded in the mat are able to provide feedback on postures, score practice, and even help guide an at-home practice.

Big data analytics is also improving science research and education. Science research is currently being transformed by the new possibilities big data brings. Take, for example, CERN, the nuclear physics lab with its Large Hadron Collider, the world's largest and most powerful particle accelerator. Experiments to unlock the secrets of our universe - how it started and works - generate huge amounts of data. The CERN data center has 65,000 processors to analyse its 30 petabytes of data. However, it uses the computing powers of thousands of computers distributed across 150 data centers worldwide to analyse the data. Such computing powers can be leveraged to transform so many other areas of science research.

The computing power of big data could also be applied to any set of data, opening up new sources to scientists. Census data and other government collected data can more easily be accessed and analyzed by researchers to create bigger and better pictures of our health and social sciences.

Big data is used quite significantly in higher education. For example, the University of Tasmania, an Australian university with over 26,000 students, has deployed a Learning and Management System that tracks, among other things, when a student logs onto the system, how much time is spent on different pages in the system, and the overall progress of a student over time.

In a different use case in education, big data is also used to measure teachers’ effectiveness to ensure a good experience for both students and teachers. Teachers’ performance can be fine-tuned and measured against student numbers, subject matter, student demographics, student aspirations, behavioral classification and several other variables.

On a governmental level, the Office of Educational Technology in the US Department of Education is using big data to develop analytics to help course-correct students who are going astray while using online big data courses. Click patterns are also being used to detect boredom.

Not only that, big data analytics helps machines and devices become smarter and more autonomous. For example, big data tools are used to operate Google's self-driving car. With GPS, self-driving cars safely drive on the road without the intervention of human beings. We can even use big data tools to optimise the performance of computers and data warehouses.


Xcel Energy initiated one of the first ever tests of a "smart grid" in Boulder, Colorado, installing smart meters on customers' homes that would allow them to log into a website and see their energy usage in real time. The smart grid would also theoretically allow power companies to predict usage in order to plan for future infrastructure needs and prevent brownout scenarios.

In Ireland, grocery chain Tesco has its warehouse employees wear armbands that track the goods they take from the shelves, distribute tasks, and even forecast completion time for a job. Big data is also applied heavily in improving security and enabling law enforcement.

The National Security Agency (NSA) uses big data analytics to foil terrorist plots. Others use big data techniques to detect and prevent cyber attacks. Police forces use big data tools to catch criminals and even predict criminal activity, and credit card companies use big data to detect fraudulent transactions.

Further, big data is used to improve many aspects of our cities and countries. For example, it allows cities to optimize traffic flows based on real-time traffic information as well as social media and weather data. A number of cities are currently piloting big data analytics with the aim of turning themselves into Smart Cities, where the transport infrastructure and utility processes are all joined up: a bus would wait for a delayed train, and traffic signals would predict traffic volumes and operate to minimize jams.

The city of Long Beach, California is using smart water meters to detect illegal watering in real time, and the meters have helped some homeowners cut their water usage by as much as 80 percent.

Los Angeles also uses data from magnetic road sensors and traffic cameras to control traffic lights and thus the flow (or congestion) of traffic around the city. The computerized system controls 4,500 traffic signals around the city and has reduced traffic congestion by an estimated 16 percent.

A tech startup called Veniam is testing a new way to create mobile wi-fi hotspots all over the city in Porto, Portugal. More than 600 city buses and taxis have been equipped with wifi transmitters, creating the largest free wi-fi hotspot in the world. Veniam sells the routers and service to the city, which in turn provides the wi-fi free to citizens, like a public utility. 

In exchange, the city gets an enormous amount of data - with the idea being that the data can be used to offset the cost of the wi-fi in other areas. For example, in Porto, sensors tell the city's waste management department when dumpsters are full, so they don't waste time, man hours, or fuel emptying containers that are only partly full. 

When government agencies are able to harness and apply analytics to their big data, they gain significant ground when it comes to managing utilities, running agencies, dealing with traffic congestion or preventing crime.

In the insurance industry, Big Data has been used to provide customer insights for transparent and simpler products, by analyzing and predicting customer behavior through data derived from social media, GPS-enabled devices and CCTV footage. Big data also helps insurance companies improve customer retention.

When it comes to claims management, predictive analytics from big data has been used to offer faster service since massive amounts of data can be analyzed especially in the underwriting stage. Fraud detection has also been enhanced. Through massive data from digital channels and social media, real-time monitoring of claims throughout the claims cycle has been used to provide insights.

Financial trading has not been left out when it comes to Big Data. High-Frequency Trading (HFT) is an area where big data finds a lot of use today. Here, big data algorithms are used to make trading decisions.

Today, the majority of equity trading takes place via data algorithms that increasingly take into account signals from social media networks and news websites to make buy and sell decisions in split seconds. Computers are programmed with complex algorithms that scan markets for a set of customizable conditions and search for trading opportunities.

The programs can be designed to work with no human interaction or with human interaction, depending on the needs and desires of the client. The most sophisticated of these programs are now also designed to change as markets change, rather than being hardcoded. With large amounts of information streaming in from countless sources, financial traders are faced with finding new and innovative ways to manage big data. Big data brings big insights.


How Do Applications Of Big Data Drive Industries?

Industry influencers, academicians, and other prominent stakeholders certainly agree that big data has become a big game changer in most, if not all, types of modern industries over the last few years. As big data continues to permeate our day-to-day lives, there has been a significant shift of focus from the hype surrounding it to finding real value in its use.

While understanding the value of big data continues to remain a challenge, other practical challenges including funding and return on investment and skills continue to remain at the forefront for a number of different industries that are adopting big data.

Generally, most organizations have several goals for adopting big data projects. While the primary goal for most organizations is to enhance customer experience, other goals include cost reduction, better targeted marketing and making existing processes more efficient. In recent times, data breaches have also made enhanced security an important goal that big data projects seek to incorporate.

What Are Some Of The Applications Of Big Data In The Communications, Social Media And Entertainment Industry?

Organizations in this industry simultaneously analyze customer data along with behavioral data to create detailed customer profiles that can be used to:

  • Create content for different target audiences
  • Recommend content on demand
  • Measure content performance

Statistics show that more than 500 terabytes of new data are ingested into the databases of social media site Facebook every day. This data is mainly generated through photo and video uploads, message exchanges, comments and so on.

Since classification is essential for the study of any subject, Big Data is broadly classified into three main types:

Structured Data - It accounts for about 20% of the total existing data, and it’s used mostly in programming and computer-related activities.

There are two sources of structured data- machines and humans. All the data received from sensors, web logs and financial systems are classified under machine-generated data. These include medical devices, GPS data, data of usage statistics captured by servers and applications and the huge amount of data that usually move through trading platforms, to name a few.

Human-generated structured data mainly includes all the data a person inputs into a computer, such as their name and other personal details. When a person clicks a link on the Internet, or even makes a move in a game, data is created. Companies can use this to understand their customers' behaviour and make the appropriate decisions and modifications.

Unstructured Data - The rest of the data created, about 80% of the total, is unstructured. While structured data resides in traditional row-column databases, unstructured data is the opposite: it has no clear format in storage. Most of the data a person encounters belongs to this category, and until recently there was not much to do with it except store it or analyze it manually.

Unstructured data is also classified based on its source, into machine-generated or human-generated. Machine-generated data accounts for all the satellite images, the scientific data from various experiments and radar data captured by various facets of technology.

Human-generated unstructured data is found in abundance across the Internet, since it includes social media data, mobile data and website content. This means that the pictures we upload to our Facebook or Instagram handles, the videos we watch on YouTube and even the text messages we send all contribute to the gigantic heap that is unstructured data.

Semi-structured Data - The line between unstructured data and semi-structured data has always been unclear, since most semi-structured data appears to be unstructured at a glance. Semi-structured data is information that is not in the traditional database format of structured data, but contains some organizational properties that make it easier to process.

For example, NoSQL documents are considered to be semi-structured, since they contain keywords that can be used to process the document easily.
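A small sketch can make the idea of keys in semi-structured documents concrete. The two JSON documents below are invented for illustration, as is the helper extract_field. Notice that the documents do not share the same fields, which is what separates them from rows in a traditional table, yet the keys still make them easy to process.

```python
import json

# Hypothetical customer documents, as a document store might hold them.
raw_documents = [
    '{"name": "Ama", "email": "ama@example.com", "tags": ["sme", "retail"]}',
    '{"name": "Kofi", "phone": "555-0100"}',
]


def extract_field(documents, key, default=None):
    """Parse each JSON document and pull out one key, tolerating missing fields."""
    return [json.loads(doc).get(key, default) for doc in documents]


names = extract_field(raw_documents, "name")
emails = extract_field(raw_documents, "email", default="unknown")
print(names)   # ['Ama', 'Kofi']
print(emails)  # ['ama@example.com', 'unknown']
```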

Big Data analysis has been found to have a definite business value, as its analysis and processing can help a company achieve cost reductions and dramatic growth. So it is imperative that you do not wait too long to exploit the potential of this excellent business opportunity.

Data, in today’s business and technology world, is indispensable. Big Data technologies and initiatives are on the rise, analyzing this data to gain insights that can help in making strategic decisions. The concept evolved at the beginning of the 21st century, and every technology giant is now making use of Big Data technologies.

The Data analytics field in itself is vast. The field of Big Data and Big Data Analytics is growing day by day.
