Structured data, unstructured data, Cassandra, Hadoop, petabytes: are you familiar with this big data lingo? If not, do not worry. This article explains what big data is and why it is becoming an important factor for businesses across the globe.
Big data has three primary characteristics. The first, to no surprise, is the sheer volume of data. A huge amount of data is generated every second. Twitter alone generates more than seven terabytes of data every day. One terabyte equals 1,000 gigabytes and a petabyte is 1,000,000 gigabytes, so Twitter by itself accounts for more than two petabytes a year, and many more petabytes are produced across the globe annually. To put that in perspective, if an MP3 file takes up about 1 megabyte per minute of audio, a petabyte of MP3s could play continuously for roughly 1,900 years.
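The volume figures above are simple unit arithmetic, and a short sketch makes them easy to check. The function names below are illustrative, and the 1 megabyte per minute MP3 rate is the article's own rough assumption rather than a measured bitrate. With the decimal units used in this article (1 petabyte = 1,000,000 gigabytes), a petabyte of MP3s plays for about 1,900 years; the commonly quoted "over 2,000 years" comes out only if binary units (1,024 per step) are used instead.

```python
# Decimal (SI) units, matching the definitions used in this article.
TB_PER_PB = 1_000
MB_PER_GB = 1_000
GB_PER_PB = 1_000_000

# Assumption taken from the article: MP3 audio uses about 1 MB per minute.
MB_PER_MINUTE = 1

MINUTES_PER_YEAR = 60 * 24 * 365


def twitter_petabytes_per_year(terabytes_per_day: float = 7) -> float:
    """Convert a daily volume in terabytes into petabytes per year."""
    return terabytes_per_day * 365 / TB_PER_PB


def playback_years(petabytes: float) -> float:
    """Years of continuous MP3 playback stored in the given number of petabytes."""
    megabytes = petabytes * GB_PER_PB * MB_PER_GB
    minutes = megabytes / MB_PER_MINUTE
    return minutes / MINUTES_PER_YEAR


print(f"Twitter per year: {twitter_petabytes_per_year():.3f} PB")
print(f"One petabyte of MP3s: {playback_years(1):,.0f} years")
```

Changing `MB_PER_MINUTE` to match a real encoding rate shows how sensitive the comparison is to that single assumption.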
Variety is the second characteristic of big data. There are two key types of data: unstructured data, such as social media posts, emails and videos, and structured data, such as spreadsheets with clearly defined columns and rows. Big data analytics platforms are needed to normalize this data so that unstructured data can be compared with structured data.
Velocity, the third characteristic, is the speed at which data arrives from all data feeds. For instance, over 200,000 tweets are sent out every minute. Pair this with numerous other data feeds and you can see that a tremendous amount of data is coming in very quickly.
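To get a feel for what that rate means, the per-minute figure can be converted to other time scales. This is only a back-of-the-envelope sketch: the 200,000 tweets per minute is the article's figure, the function names are made up for illustration, and a real feed would fluctuate rather than arrive at a constant rate.

```python
# Figure quoted in the article; treated here as a constant average rate.
TWEETS_PER_MINUTE = 200_000


def per_second(events_per_minute: float) -> float:
    """Average number of events arriving each second."""
    return events_per_minute / 60


def per_day(events_per_minute: float) -> float:
    """Total number of events arriving in a 24-hour day."""
    return events_per_minute * 60 * 24


print(f"Per second: {per_second(TWEETS_PER_MINUTE):,.0f}")
print(f"Per day:    {per_day(TWEETS_PER_MINUTE):,.0f}")
```

At this rate a single feed delivers a few thousand items every second, or 288 million a day, before any other data source is added.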
One of the biggest challenges with big data is not how to gather it, but how to highlight the pertinent data and gain insight from it. This is where a dynamic platform such as C2M can help.