Introduction to BIG DATA

'Big Data' is data that is huge in size. More precisely, it is a collection of data that is already very large and keeps growing exponentially over time. Such data is so large and complex that traditional data management tools cannot store or process it efficiently.

Characteristics Of Big Data:

(i) Volume – The name 'Big Data' itself refers to enormous size. The size of the data plays a crucial role in determining how much value can be derived from it, and whether a particular data set counts as Big Data at all depends on its volume. Hence, 'Volume' is one characteristic that must be considered when dealing with 'Big Data'.

(ii) Variety – The next aspect of 'Big Data' is its variety.

Variety refers to the heterogeneous sources and nature of the data, both structured and unstructured. In earlier days, spreadsheets and databases were the only sources of data that most applications considered. Nowadays, data in the form of emails, photos, videos, monitoring-device output, PDFs, audio, etc. is also included in analysis applications. This variety of unstructured data poses challenges for storing, mining and analyzing it.

(iii) Velocity – The term 'velocity' refers to the speed at which data is generated. How fast the data is generated and processed to meet demand determines its real potential.

Big Data velocity concerns the speed at which data flows in from sources such as business processes, application logs, networks, social media sites, sensors and mobile devices. The flow of data is massive and continuous.

(iv) Variability – This refers to the inconsistency that the data can show at times, which hampers the ability to handle and manage it effectively.

'Big Data' can be found in three forms:

Structured
Unstructured
Semi-structured

Structured

Any data that can be stored, accessed and processed in a fixed format is termed 'structured' data. Over time, computer science has had considerable success in developing techniques for working with this kind of data (where the format is known in advance) and deriving value from it. However, issues arise when the size of such data grows very large; typical sizes are in the range of multiple zettabytes.

Examples Of Structured Data

An 'Employee' table in a database is an example of structured data:

Employee_ID   Employee_Name   Gender   Department   Salary_In_lacs
2002          Rahul           Male     Admin        420000
2003          Sheena john     Female   Admin        650000
2004          Akhil R         Male     Admin        500000
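As a minimal sketch of what "fixed format" means in practice, the table above can be loaded into a relational database and queried. The choice of Python's built-in sqlite3 module, the in-memory database and the column types are assumptions made for illustration; only the column names and row values come from the table.

```python
import sqlite3

# In-memory database (an assumption for this sketch); the schema is fixed
# before any row is inserted, which is what makes the data "structured".
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Employee (
        Employee_ID    INTEGER PRIMARY KEY,
        Employee_Name  TEXT NOT NULL,
        Gender         TEXT NOT NULL,
        Department     TEXT NOT NULL,
        Salary_In_lacs INTEGER NOT NULL
    )
    """
)

# Rows copied from the example table.
rows = [
    (2002, "Rahul", "Male", "Admin", 420000),
    (2003, "Sheena john", "Female", "Admin", 650000),
    (2004, "Akhil R", "Male", "Admin", 500000),
]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", rows)

# Because every row follows the same schema, filtering and sorting are one query.
top_earner = conn.execute(
    "SELECT Employee_Name FROM Employee ORDER BY Salary_In_lacs DESC LIMIT 1"
).fetchone()[0]
admin_count = conn.execute(
    "SELECT COUNT(*) FROM Employee WHERE Department = ?", ("Admin",)
).fetchone()[0]

print(top_earner, admin_count)
```

The point of the sketch is that the query never has to inspect the data to discover its shape; the table definition already states it.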

Unstructured

Any data whose form or structure is unknown is classified as unstructured data. In addition to its huge size, unstructured data poses multiple challenges when it comes to processing it to derive value. A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, etc. Organizations today have a wealth of data available to them, but unfortunately they often don't know how to derive value from it because the data is in raw, unstructured form.

Examples Of Unstructured Data

Output returned by 'Google Search'
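To illustrate why raw text is harder to work with, the following sketch imposes some structure on free-form text after the fact. The sample text, the "Order <number>" pattern and the variable names are all hypothetical; real unstructured sources are far messier and usually need more than regular expressions.

```python
import re
from collections import Counter

# Hypothetical free-form text, standing in for an email body or a search snippet.
raw_text = (
    "Order 1182 arrived late. Customer asked for a refund. "
    "Order 1190 arrived on time, and the customer was happy."
)

# There is no schema to query, so structure has to be extracted from the text.
words = re.findall(r"[a-z]+", raw_text.lower())
word_counts = Counter(words)

# Pull out anything that looks like an order number (an assumed convention).
order_ids = [int(n) for n in re.findall(r"Order (\d+)", raw_text)]

print(word_counts["customer"], order_ids)
```

Unlike a table lookup, the extraction rules here are guesses about the text and would need to change whenever the wording does.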

Semi-structured

Semi-structured data can contain both forms of data. It looks structured, but its structure is not actually defined by, for example, a table definition in a relational DBMS. Data represented in an XML file is an example of semi-structured data.

Examples Of Semi-structured Data

Personal data stored in an XML file:

<rec><name>rahul</name><sex>Male</sex><age>25</age></rec>
<rec><name>Sheena john</name><sex>Female</sex><age>32</age></rec>
<rec><name>Akhil r</name><sex>Male</sex><age>21</age></rec>
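The records above can be read with Python's standard xml.etree.ElementTree module, as a rough sketch of how semi-structured data carries its own field names. The wrapping <people> root element is an assumption added here, because a well-formed XML document needs a single root; it is not part of the original records.

```python
import xml.etree.ElementTree as ET

# The three records from the example, verbatim.
records = """
<rec><name>rahul</name><sex>Male</sex><age>25</age></rec>
<rec><name>Sheena john</name><sex>Female</sex><age>32</age></rec>
<rec><name>Akhil r</name><sex>Male</sex><age>21</age></rec>
"""

# Wrap the records in a hypothetical <people> root so the text parses as one document.
root = ET.fromstring(f"<people>{records}</people>")

# Each record describes itself through its tags; no table definition is needed.
people = [
    {child.tag: child.text for child in rec}
    for rec in root.findall("rec")
]
ages = [int(person["age"]) for person in people]

print(people[1]["name"], ages)
```

Nothing enforces that every record has the same fields, which is the practical difference from a relational table.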

Benefits of Big Data Processing

Businesses can utilize outside intelligence when making decisions
Improved customer service
Early identification of risk to the product/services
Better operational efficiency
