
Key concepts of Big Data

Author: Mansoor Ahmed
Posted: Nov 05, 2020

What is Big Data?

  • Big data is a field that provides means and methods to analyze, systematically extract information from, and otherwise deal with data sets that are too large or complex to be handled by traditional data-processing application software.

  • Big data analytics means systematically extracting and analyzing facts from volumes of data far too large to be analyzed manually by human beings using pen and paper.

  • US computer scientist and entrepreneur John R. Mashey popularized the term Big Data in the 1990s.

Key Concepts

  • Big data was originally linked with three key concepts: volume, variety, and velocity.

  • Currently, the following 10 Vs are most commonly associated with Big Data:

  1. Velocity: Speed at which data is being generated and transferred to the destination.

  2. Volume: Quantity of collected and stored data.

  3. Variety: Different forms of data, structured and unstructured

  4. Variability: Dynamic, evolving behavior of the data source

  5. Value: Business value derived using data

  6. Veracity: Quality or trustworthiness of the data

  7. Validity: Correctness or accuracy of the data used to derive results

  8. Virality: Rate at which the data is spread by a user and received by different users.

  9. Volatility: Duration of usefulness of data

  10. Visualization: Representation of data to trigger a decision.

  • American businessman, software engineer, and Google CEO (2001-2011) Eric Schmidt described data volume in the era of data centers as follows: "There were 5 exabytes of information created from the dawn of civilization until 2003. Now, in the era of data centers, big data, and digital technologies, 5 exabytes of information are created every 2 days."
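A quick back-of-the-envelope check of the data rate this quote implies (a sketch in Python, assuming the decimal definition of an exabyte):

```python
# Data rate implied by "5 exabytes every 2 days".
EXABYTE = 10**18            # bytes (decimal definition)
data_bytes = 5 * EXABYTE
seconds = 2 * 24 * 3600     # two days

rate = data_bytes / seconds
print(f"{rate / 10**12:.1f} TB/s")  # ~28.9 terabytes per second, sustained
```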

Big Data Analytics Broad Description

Big data analytics includes capturing data, data storage, data analysis, search, visualization, querying and updating data, and using AI software to perform automatic analysis.

Analysis of big data sets is used to find correlations and historical trends, detect unusual anomalies in the data, and act on this information to take corrective measures.
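The article does not name a specific anomaly-detection method; as a minimal illustration, a simple z-score test in Python flags values that sit far from the mean:

```python
import statistics

def find_anomalies(readings, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean.

    A minimal z-score detector; real big-data pipelines would use
    distributed frameworks and more robust statistical methods.
    """
    mean = statistics.fmean(readings)
    stdev = statistics.stdev(readings)
    return [x for x in readings if abs(x - mean) > threshold * stdev]

sensor_data = [20.1, 19.8, 20.3, 20.0, 35.7, 20.2, 19.9]  # one obvious outlier
print(find_anomalies(sensor_data, threshold=2.0))  # -> [35.7]
```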

Big data is now gathered using Industry 4.0 technologies, including IoT sensors, smartphones, aerial drones (remote sensing), cameras, radio-frequency identification (RFID) sensors, and wireless sensor networks.

A Key Difference from Industry 3.0 Vis-à-Vis Real-Time Data Gathering

  • In Industry 3.0, data was gathered over cables. In Industry 4.0, data gathering has changed completely, including the following:

  • Data gathering by cabled sensors as well as wireless (Wi-Fi) sensors

  • Data can be sent in real time to any data center in the world at light speed

  • This data can be analyzed using big data analytics algorithms in a data center

  • Real-time diagnostics by AI software that self-corrects or suggests 2-3 options to the engineers for solving the problem (a minimal pipeline sketch follows this list)
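The article does not describe a concrete implementation. As a minimal sketch, assuming a hypothetical HTTP ingest endpoint (real deployments would more likely use MQTT or another messaging protocol), a sensor reading might be serialized and pushed to a data center like this:

```python
import json
import time
import urllib.request

# Hypothetical data-center ingest endpoint (placeholder URL, not a real service).
INGEST_URL = "https://datacenter.example.com/ingest"

def send_reading(sensor_id: str, value: float) -> None:
    """Serialize one sensor reading as JSON and POST it in (near) real time."""
    payload = json.dumps({
        "sensor": sensor_id,
        "value": value,
        "timestamp": time.time(),
    }).encode("utf-8")
    req = urllib.request.Request(
        INGEST_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # fire-and-forget; add retries in practice

send_reading("temperature-01", 21.7)
```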

Applications

  • Big data has increased the demand for data management specialists so much that Software AG, Oracle Corporation, IBM, Microsoft, SAP, EMC, HP, and Dell have spent more than $15 billion on software firms specializing in data management and analytics.

  • In 2010, this industry was worth more than $100 billion and was growing at almost 10 percent a year: about twice as fast as the software business as a whole. Developed economies increasingly use data-intensive technologies. There are 4.6 billion mobile-phone subscriptions worldwide, and between 1 billion and 2 billion people access the internet.

  • Between 1990 and 2005, more than 1 billion people worldwide entered the middle class, which means more people became literate, which in turn led to information growth. The world's effective capacity to exchange information through telecommunication networks was 281 petabytes in 1986, 471 petabytes in 1993, 2.2 exabytes in 2000, and 65 exabytes in 2007, and predictions put the amount of internet traffic at 667 exabytes annually by 2014 (a growth-rate calculation follows this list).

  • According to one estimate, one-third of the globally stored information is in the form of alphanumeric text and still-image data, which is the format most useful for most big data applications. This also shows the potential of as-yet-unused data (i.e., video and audio content).

  • While many vendors offer off-the-shelf solutions for big data, experts recommend developing in-house solutions custom-tailored to solve the company's problem at hand if the company has sufficient technical capabilities.
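The telecommunication-capacity figures quoted above imply fairly steady exponential growth; a quick check of the compound annual growth rate (a sketch, computed only from the numbers in this article):

```python
# Compound annual growth rate implied by the telecom-capacity figures above.
# All values in petabytes; 1 exabyte = 1,000 petabytes.
figures = {1986: 281, 1993: 471, 2000: 2_200, 2007: 65_000}

years = sorted(figures)
first, last = years[0], years[-1]
cagr = (figures[last] / figures[first]) ** (1 / (last - first)) - 1
print(f"{first}-{last}: {cagr:.1%} per year")  # roughly 30% per year
```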

How Did Data Storage Start?

  • Data storage started with vinyl records storing songs in the 1880s. A record could store about 200 MB of data but could not be rewritten; it was for one-time use only.

  • The first magnetic tape data storage came in 1947. Tape could store about 60 MB of data, and data could be rewritten multiple times on the same tape.

  • The first hard disk drive, the IBM 305 RAMAC, came in 1956. It could store about 4 MB of data and weighed around 900 kg. The hard drive could write and rewrite data in real time.

  • The first solid-state disk drive came in 1991, with a 20 MB storage capacity. It had no moving parts and used only electronic circuits to write and rewrite data an unlimited number of times.

  • Today the biggest available hard disk drive has a capacity of 16,000 GB (16 TB), and the biggest available solid-state drive holds 8,000 GB (8 TB). They measure about 3.5 × 2 inches and weigh only about 500 grams (half a kilogram).

  • The 20 MB of 1991 is only about 0.000125% of the 16,000 GB of 2020; storage capacity has grown roughly 800,000-fold (see the calculation after this list).

  • Such massive data storage capacities are what have made high-speed internet, YouTube videos, Industry 4.0, AI, and machine learning technologies possible today.
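The storage-growth comparison above, worked out explicitly (using decimal megabytes and gigabytes):

```python
# Growth of storage capacity: 20 MB (first SSD, 1991) vs. a 16 TB drive (2020).
MB = 10**6
GB = 10**9

ssd_1991 = 20 * MB
hdd_2020 = 16_000 * GB

print(f"Growth factor: {hdd_2020 / ssd_1991:,.0f}x")   # 800,000x
print(f"1991 share:    {ssd_1991 / hdd_2020:.6%}")     # 0.000125%
```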

Google Data Centres

Google has some 15 data centers globally, with the following features:

  • Each data center uses about 200 MW of electrical power (a rough energy calculation follows this list).

  • Each data center campus covers about 500 acres of buildings.

  • Huge air-conditioning installations are needed for cooling, as the servers produce a lot of heat.

  • For air conditioning, chillers, cooling towers, heat exchangers, water pumps, and RO plants are installed, and all of this equipment is connected to Google's own machine learning systems to optimize its use.

  • UPS systems of 20-50 MW provide backup electrical power.
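To put the 200 MW figure in perspective, a rough calculation of the implied energy use (a sketch, assuming constant full load, which the article does not claim):

```python
# Rough energy use implied by the figures above.
power_mw = 200                   # per data center
hours_per_year = 24 * 365

energy_gwh = power_mw * hours_per_year / 1_000    # MWh -> GWh
print(f"~{energy_gwh:,.0f} GWh per year per data center")  # ~1,752 GWh

total_gw = 200 * 15 / 1_000      # all 15 data centers, in GW
print(f"~{total_gw} GW combined draw across 15 centers")   # ~3 GW
```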

Facebook & Big Data Technology

  • Facebook uses the most advanced technologies: machine learning, AI, advanced software algorithms, and Industry 4.0 technologies.

  • Facebook has 12 data centers globally, which are as big as Google's data centers.

  • Facebook's machine learning software operates at enormous speed (terabits per second) and "learns" the background of every user in the world, including age, profession, Facebook likes, city of residence, and the types of friends and pages each user follows.

  • Facebook even "knows" from which city each user accesses the service and, using big data analytics, which person stays in which area and which person travels to foreign countries.

  • Within a timeline of about 3 months, Facebook's AI software learns the basic habits of every person, including likes and dislikes, the types and backgrounds of friends, the pages each person follows, and the subjects of each person's chats.

  • Based on this massive data, Facebook's big data analytics and machine learning software gives tailor-made, individualized suggestions for each user separately, including friends, new pages, and products (a toy similarity sketch follows this list).

  • Such technology would be impossible to deploy using manual sheets of paper, telephone lines, WhatsApp, SMS, or Excel worksheets.
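Facebook's actual systems are proprietary and far more sophisticated; purely as an illustration of the similarity-based suggestion idea, here is a toy sketch in Python using cosine similarity over page-like vectors (all names and data are made up):

```python
import math

# Toy illustration of similarity-based suggestions (NOT Facebook's actual
# system): users are vectors of page likes; we suggest pages liked by the
# most similar other user.
likes = {
    "alice": {"python": 1, "cooking": 1, "travel": 1},
    "bob":   {"python": 1, "cooking": 1, "gaming": 1},
    "carol": {"fashion": 1, "travel": 1},
}

def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse like-vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values()))
    norm *= math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def suggest_pages(user: str) -> set:
    """Suggest pages the most similar user likes that `user` does not."""
    others = [(cosine(likes[user], likes[o]), o) for o in likes if o != user]
    _, nearest = max(others)
    return set(likes[nearest]) - set(likes[user])

print(suggest_pages("alice"))  # -> {'gaming'} (bob is the most similar user)
```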

About the Author

Mansoor Ahmed, Chemical Engineer and Web Developer
