An Exploration of Techniques Used in Data Analytics to Produce Analysed Data in Graphical Format

Contents

1. Project Proposal

1.1 Introduction

1.2 Background

1.3 Aims

1.4 Objectives

1.5 Intellectual challenge

Why (this project):

1.6 Research Methods

2. Scope

2.1 Big Data

2.2 Data

2.3 Technology

2.4 Scripting Language

2.5 Python

3. Plan

3.1 First Term

3.2 Second Term – Plan

4 Literature Survey

4.1 Overview

4.2 A brief history of Big Data

4.3 Big Data

4.4 Data Analytics

4.4a Machine Learning:

4.5 Data terms used in Data Analytics

1. Data Mining

2. Data Cleansing

3. Clustering/Cluster analysis

4.6 Relationship between Big Data and Data Analytics

4.7 Datasets – Overview

4.7.1 Datasets – File Extensions

4.7.2 Datasets

4.8 Techniques used in Data Analytics/Data Science

4.9.1 Programming Languages used in Data Analytics

4.9.2 R – Programming Language:

4.9.3 Python – Programming Language:

4.9.4 Comparing the Languages:

Python libraries:

5. Software Requirements

5.1 Overview of the project software requirements

5.1 Functional Requirements

5.2 Functional Requirements Diagram:

5.2 Non-Functional Requirements

5.3 Non-Functional Requirements

6.0 Design

6.1 Diagrams illustrating the Overview of the project:

Analytics Flow:

6.2 Overview of Data flow diagram:

6.3 Data flow diagram:

6.4 Use-Case Diagram:

6.4.1 Conclusion:

7.0 Implementation

Introduction

7.1. Software Requirements:

7.2.A Data – Streaming the data:

7.2 A Purpose

Twitter account

Twitter App

Installing Tweepy

Creating Python scripts

7.2.B Data – 1 Food balance:

7.2.B Purpose –

Description

7.3.1 Cleaning and preparing the Food Balance Dataset:

Importing the datasets – Food Balance

Cleaning and preparing – Food Balance

7.4.1 – Food Balance dataset – Analysing and Extracting of the Data:

Extracting the data

Extracted data

7.5.1 Food Balance dataset – Visualization of the results

Ireland – Matplotlib

U.K – Matplotlib

Plotly

Comparing U.K to Ireland

7.2.B Data – 2. Diabetes prevalence (% of population ages 20 to 79):

7.2.B Purpose –

Description -2. Diabetes prevalence

7.3.2 Importing the datasets – Diabetes prevalence

Cleaning and preparing – Diabetes prevalence

7.4.2 Diabetes prevalence

Extracting the data

8.0 Result sections:

8.1.A Data – Streaming the data:

8.1.B.1 Data – Food balance – Ireland output – A

8.1.B.2- 3D plot

8.1.B.3 Data – Food balance – U.K output

8.1.B.4 Data – Food balance – Comparing U.K and Ireland

8.1.B.5 Diabetes for population ages 20 to 79

8.1.B.6 Ireland and U.K

8.2 Testing Result

Functional requirements:

Non-Functional

8.2.A Testing of apparatus:

9.0 Evaluation:

9.1 Result of the requirements (functional and non-functional):

9.1.A Data – Streaming the data:

9.1.B Data – Food Balance:

9.1.B Data – Diabetes for population ages 20 to 79:

Conclusion:

Appendix – Section 1

Demonstration of how to install the relevant package

Appendix – Section 2 – Python scripts used to stream data

A. Python Scripts used to stream sugar tweets:

B. File containing the relevant tweets:

C. Output of file in Notepad:

D. Append script to show split tweets only:

E. Output of split tweets in Notepad:

F. Script used to produce chart:

Appendix – Section 3 – Cleaning and preparing the data

A. Script used to apply the describe command and the results from the command:

B. Python scripts used to produce charts before cleaning and preparing the data:

1.B – Charts -Bootstrap:

Whole CSV file – Box Plot:

Box plots using random numbers:

Scatter plot for groups:

Line charts:

C. Python Code – cleaning and preparing the data

Appendix – Section 4 – Second dataset – world rise in diabetes:

Appendix – Section 5 – Scikit-Learn – Demo File

5.1 General Health file

5.2 Output result of Cluster:

5.3 Comparing cluster:

5.4 Result of the comparing cluster:

10.0 Timetable of Study

References

Figures:

Figure 1-Healthy Ireland Survey 2015 (Damian Loscher, 2015)

Figure 2- Simple steps to illustrate Big Data (Li,2015)

Figure 3- Process of Big Data (Gandomi and Haider,2015)

Figure 4-Diagram of functional requirements

Figure 5-Functional requirements

Figure 6-Diagram of Functional Requirements

Figure 7-Non- functional requirements

Figure 8-Diagram of non-functional requirements

Figure 9-Steps of Design

Figure 10- Outlined of the project process

Figure 11-Overview of Data-flow diagram

Figure 12-Steps of Data-flow diagram

Figure 13- Use-Case diagram

Figure 14- Outlined details of the Implementation stages

Figure 15- Download the Enthought Canopy

Figure 16-Create a new Twitter account.

Figure 17-Shows the creation of a Twitter app

Figure 18-App details

Figure 19- Overview of the file in notepad.

Figure 20- This shows the number of rows and columns

Figure 21-Importing the Food Balance file into Enthought Canopy.

Figure 22-Shows the result after the dataset was cleaned.

Figure 23-Ire.csv file

Figure 24-Create a Plotly account

Figure 25-Imported csv file in Enthought Canopy

Figure 26- Result of the Python scripts

Figure 27- Output of Ireland result

Figure 28-3D output of Ireland result

Figure 29-UK Sugar result

Figure 30-Comparing both countries

Figure 31-Rise of Diabetes

Figure 32-Ireland and U.K

Figure 33- Functional requirement

Figure 34- Result of non-functional

Figure 35-Displaying the Apparatus

Figure 36-download the package tweepy into Enthought Canopy

Figure 37- first script to stream data

Figure 38- The file which contains the tweets

Figure 39- Output tweets in a notepad

Figure 40- This code splits the tweet

Figure 41- The output of the split code

Figure 42- Scripts to produce the chart

Figure 43- Command for describe

Figure 44-Result of describe

Figure 45-Python script creating charts

Figure 46-Bootstrap Chart before clean

Figure 47- Box plot before clean

Figure 48- Box plots

Figure 49- Scatter plot before clean

Figure 50- Line chart before clean

Figure 51- Command to get the max

Figure 52- Print the head

Figure 53-Result of the head

Figure 54-Print of the tail

Figure 55-Result of the tail

Figure 56-Make a Copy

Figure 57- Show the data types

Figure 58-Prints out the types result

Figure 59-Delete the columns

Figure 60- Result of the deleted columns

Figure 61- Renaming a column

Figure 62- Result of figure 59

Figure 63- Removing the na

Figure 64- Result of the nas

Figure 65- Alternation of the file

Figure 66- Output result

Figure 67- Number of rows and columns

Figure 68- Output to file.

Figure 69- Output the diabetes

Figure 70- File import to canopy

Figure 71- Print max

Figure 72- Print out the max

Figure 73- Box chart before clean

Figure 74- Chart to show diabetes

Figure 75- Prints the head

Figure 76- Print out the dataframe

Figure 77- Make a copy

Figure 78- Removes the columns

Figure 79- Result of figure 76

Figure 80- The rows and columns

Figure 81- Print the head

Figure 82- Na is false

Figure 83- Show the false in the result

Figure 84- Remove any

Figure 85- Output of the na

Figure 86- Delete column

Figure 87- Print the result

Figure 88- Save to file

Figure 89- Save in user

Figure 90- k-Means examples

Figure 91- Chart of the general file

Figure 92- Example of Clustering

Figure 93- Result of Clustering

1. Project Proposal

Project Title

Data Analytics – An exploration of techniques used in Data Analytics to produce analysed data in graphical format.

1.1 Introduction

 

This project will examine the techniques of Data Analytics used to cleanse, analyse, and extract data and to produce visual charts.

This project should demonstrate that the Python language can be used as a main element in the process of Data Analytics.

1.2 Background

 

Data Analytics is the term given to the overall process that collects and analyses data and uses machine learning to develop algorithms which can produce predictive data via technology (Waller and Fawcett, 2013). Data Science has become an essential factor of industry due to the massive impact of the Internet. This is mainly due to the rapid advancements in technology and software, which allow people to gain a working knowledge of Data Science. However, Data Science requires the user to know what information they may need before the process begins.

This project will carry out the process of Data Analytics by obtaining datasets, streaming data, and using Python scripts to cleanse and prepare the data, displaying the predicted results in graphical form. This project will generate data on how sugar has become a major focal point of discussion in our lives. Data will be collected to demonstrate how the amount of raw sugar imported to Ireland and the U.K has increased.

In recent times, the rise in sugar consumption has become a concern, since sugar is not only contained within fruit and vegetables but is also added to various other types of food such as cereals, processed foods, and drinks (Waller and Fawcett, 2013).

Increased sugar consumption can have an influence on body weight, which can lead to such illnesses as heart disease, diabetes, metabolic syndrome, and kidney disease (Johnson et al., 2007).

In Ireland, a proposed tax of 10% on SSBs (sugar-sweetened beverages) was addressed in 2011 due to the rise in childhood obesity (Scarborough et al., 2013). In this year’s budget it was confirmed that from April 2018 the SSB tax will be added to all sugar products (Pope, 2016).

A report called “Healthy Ireland Survey 2015” was produced in 2015; it compared sugar intake to snack intake in Ireland between men and women aged 15 to 65. A graph from this report is illustrated in Fig. 1; the graph does not clearly indicate sugar consumption (Damian Loscher, 2015).

Figure 1-Healthy Ireland Survey 2015 (Damian Loscher, 2015)

1.3 Aims

1. To obtain and examine knowledge on Data Science, in particular the area of Big Data and Data Analytics.

2. To investigate websites which have datasets and explore the various techniques to cleanse, extract and illustrate the data.

 

1.4 Objectives

 

1. To obtain an overall understanding of Data Science, mainly the area of Big Data and Data Analytics.

2. To explore various websites, evaluate and locate suitable datasets for this project, and acquire the skills required for streaming data from large websites.

3. To examine various types of Tools and Technology that can be used in the process of Data Analytics.

4. To investigate the different types of scripting languages that can be used in Data Analytics.

5. To examine and understand the working capabilities of Python and the packages that are available within Python.

1.5 Intellectual challenge

 

Overview: The author of this project has no previous knowledge of Data Science, Big Data, or Data Analytics. The author hopes to obtain a working knowledge of the overall concept of Data Science and of the elements contained within this area. The author will also examine and explore the terminology of Big Data and the relevant technology used in relation to Data Science. The author will investigate the process of Data Analytics and how the process involves obtaining data to illustrate the predicted result. The author will also develop the essential skills in using Python and produce scripts to extract the necessary requirements.

Why (this project):

The rise in employment in the Data Science sector, with various companies requiring Analytics skills.

Governments are concerned about the rise in sugar consumption and the adverse effect this is having on people’s lives, and previous research is mainly directed towards various types of food, not particularly the intake of sugar.

1.6 Research Methods

 

Journals, academic papers, books, and various websites (KDnuggets etc.) will be the main source of existing research on Data Science. Data will be required on diabetes, a sugar-related disease, to show the rise in this illness for people aged 20 to 79. Data will also be obtained by streaming and by locating the relevant datasets. Python will be used to process the data by means of extracting, cleansing, and clustering the vital data. Python and Plotly will be utilized to display the extracted data in graphical form, as sketched below.
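
Since Plotly is named here as a display tool (and again in Section 7.5.1), a minimal sketch of rendering extracted data with Plotly’s offline mode follows. The yearly figures are placeholder values invented purely for illustration, not the project’s data.

from plotly.offline import plot
import plotly.graph_objs as go

# Placeholder values for illustration only; the real figures would come
# from the cleaned Food Balance dataset described later in the report.
years = [2010, 2011, 2012, 2013, 2014]
raw_sugar = [90, 95, 99, 104, 110]

trace = go.Scatter(x=years, y=raw_sugar, mode="lines+markers",
                   name="Raw sugar imports")
layout = go.Layout(title="Raw sugar imports (placeholder data)",
                   xaxis=dict(title="Year"),
                   yaxis=dict(title="Tonnes (000s)"))

# Writes an interactive HTML chart to disk and opens it in the browser.
plot(go.Figure(data=[trace], layout=layout), filename="sugar_imports.html")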

2. Scope

Extensive research will be carried out to obtain the relevant information to complete this project. This project will be completed by gaining an extensive working knowledge of the terminology of Data Science, Big Data, and Data Analytics, then streaming and locating datasets, which will be imported into Enthought Canopy, where Python scripts will illustrate the results of the data in graphical form. For this project to succeed, the location of the correct datasets is vital.

The datasets will be collected from world datasets and from U.K and Ireland government websites to examine the importing of raw sugar. Data will also be streamed from Twitter to examine how much people are talking about sugar, and sugar-related diseases such as diabetes, and the rise of that illness across the world, will be investigated.

2.1 Big Data – To obtain an overall understanding of Data Science, mainly the area of Big Data and Data Analytics.

Investigate the main concepts of Data Science and gain an in-depth understanding of Big Data and Data Analytics. Research the technology and terms used within Data Analytics and Machine Learning, identifying the relationship between Big Data and Data Analytics.

 

2.2 Data – To explore the various websites, evaluate and locate suitable datasets for this project, and acquire the skills required for streaming data from large websites.

Research various websites, journals, and white papers to find the location of suitable datasets. Read the various datasets to establish whether the information is suitable for this project. Examine ways to stream data from Twitter, as sketched below. Investigate the file extension of the datasets to ensure that the datasets have not already been cleaned.
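
The scripts actually used to stream sugar-related tweets appear in Appendix Section 2. As a rough sketch of the approach, using the Tweepy 3.x StreamListener interface (the library named later in Section 7), tweets mentioning sugar can be captured to a file as follows; the four credential strings are placeholders that would come from the Twitter App described in Section 7.2.A.

import tweepy

# Placeholder credentials; real values come from the Twitter App (Section 7.2.A).
CONSUMER_KEY = "xxx"
CONSUMER_SECRET = "xxx"
ACCESS_TOKEN = "xxx"
ACCESS_SECRET = "xxx"

class SugarListener(tweepy.StreamListener):
    def on_data(self, data):
        # Append the raw JSON of each matching tweet to a file.
        with open("sugar_tweets.json", "a") as f:
            f.write(data)
        return True

    def on_error(self, status_code):
        # Stop streaming on any error, e.g. rate limiting.
        return False

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

stream = tweepy.Stream(auth=auth, listener=SugarListener())
stream.filter(track=["sugar"])  # filter the live feed for the keyword "sugar"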

 

2.3 Technology – To examine various types of Tools and Technology that can be used in the process of Data Analytics.

Examine the various types of tools and technologies that could be used to clean, extract, and display data from the datasets. For example, can the data be illustrated in Excel? Consider all the tools and technologies, then evaluate which tool is to be used in this project.

 

2.4 Scripting Language – Investigate the different types of scripting languages that can be used in Data Analytics.

Research the different types of scripting languages which can be used in Data Analytics. Compare two programming languages to establish which is going to be used in this project.

2.5 Python – To examine and understand the working capabilities of Python (Enthought Canopy).

Register for a course which has a guided tutorial on Python (Enthought Canopy); these are available on Udemy at beginner’s level.

3. Plan

 

This outlines the time-frame given to each section of the project. The plan offers guidelines for the project and clearly indicates the deadlines that are required to complete the project.

3.1 First Term

Task Name Duration Start Finish Weeks in total
Project Proposal 21 days Mon 19-09-16 Fri 07-10-16 3
   Project Idea – research 11 days Mon 19-09-16 Wed 12-10-16
   Aims 5 days Mon 19-09-16 Fri 23-09-16
   Objectives 5 days Mon 19-09-16 Fri 23-09-16
Project scope & plan 14 days Fri 07-10-16 Fri 21-10-16 2
   Research scope 4 days Fri 07-10-16 Wed 12-10-16
   Outline plan 10 days Wed 12-10-16 Fri 14-10-16
Literature Survey 46 days Mon 26-09-16 Fri 11-11-16 7
   Research Data Analytics 33 days Mon 17-10-16 Mon 07-11-16
   Research python 13 days Tue 01-11-16 Fri 11-11-16
Software Requirements 21 days Mon 31-10-16 Mon 21-11-16 3
   Research software 15 days Mon 31-10-16 Tue 15-11-16
   Software Requirements 6 days Tue 15-11-16 Mon 21-11-16
Design 21 days Mon 21-11-16 Mon 05-12-16 3
   Design research 14 days Mon 21-11-16 Mon 28-11-16
   Design 7 days Mon 28-11-16 Mon 05-12-16
Oral Presentation 14 days Mon 05-12-16 Fri 16-12-16 2
   Presentation Research 14 days Mon 05-12-16 Fri 16-12-16

Table 1- Table showing plan for first term

 

3.2 Second Term – Plan

 

Task Name Duration Start Finish Weeks in total
Software Development 39 days Mon 23-01-17 Fri 03-03-17 6
 Software Development – research 24 days Mon 23-01-17 Wed 15-02-17
   Learning Software 7 days Thu 16-02-17 Thu 23-02-17
   producing software 6 days Fri 24-02-17 Fri 03-03-17
Interim Presentation 11 days Mon 06-03-17 Fri 17-03-17 2
   Interim Presentation 5 days Mon 06-03-17 Fri 10-03-17
   Presentation 6 days Mon 13-03-17 Fri 17-03-17
Draft final Report 34 days Mon 20-03-17 Fri 21-04-17 5
   Draft report 14 days Mon 20-03-17 Mon 03-04-17
   Report 17 days Tue 04-04-17 Fri 21-04-17
Final Report 11 days Mon 03-04-17 Fri 14-04-17 2
   Create report 11 days Mon 03-04-17 Fri 14-04-17
Oral Presentation 11 days Mon 17-04-17 Fri 28-04-17 2
   Presentation preparation 11 days Mon 17-04-17 Fri 28-04-17

Table 2- Table showing the plan for term 2

4 Literature Survey

 

4.1 Overview

 

Technology has become such a significant part of our daily lives with the rise in hand-held devices, which range from the phone to the tablet. All this technology means that information or data is available, and people want to access this data as they need it. Imagine all the data or information from all the sources around the world, floating about with no specific relevance. That data could be extremely relevant for all various types of industries; if the data were extracted correctly and analysed, it could be used as an essential marketing tool.

 

4.2 A brief history of Big Data

 

Questioning information or data to obtain an answer has been going on for centuries. The ability to question data on a larger scale came around the 1960s, when computers started to appear on the market. The ability to compute data opened doors for major companies to obtain information or data on what customers want (Hurwitz et al., 2013).

In the 1970s the data model and the RDBMS created a structure for data accuracy; this meant that the data could be abstracted, cleaned, and queried to generate reports for various types of industry to research the extracted data. Over the last number of years, these methods have advanced to what we know as Big Data. Data modelling has transformed the way companies gather, extract, clean, and use data as a major marketing tool to gain information on their customers’ requirements (Hurwitz et al., 2013).

4.3 Big Data

Around the late 1990s the term “Big Data” was launched at Silicon Graphics Inc., although it did not become a massive buzzword until 2011 (Diebold, 2012).

Big Data can be defined as a term used to describe the huge datasets which consist of both structured and non-structured data. These datasets can be very complex; however, with techniques and various types of tools, the data can be collected, stored, cleansed, and extracted to be analysed. The analysed data can offer great benefits to various types of industry (Sagiroglu and Sinanc, 2013).

There is a massive market for companies of all types of industries to know what people want. For example, a television company might want to know what types of programs people like to watch. This means the company could stream the data from a live feed such as Facebook or Twitter. As the internet has grown, people are now communicating at a fast rate, with large volumes of data being produced. Big Data consists of many attributes, known as the three Vs – Volume, Variety, and Velocity (Russom, 2011).

These attributes are described in detail in the table below:

Name Description
Volume Volume in relation to Big Data means the size of the data, which can range from terabytes to petabytes. A terabyte can store the same amount of data as 1,500 CDs. Volume is the main attribute of Big Data because the size of the datasets can be massive (Gandomi and Haider, 2015).
Variety Variety is the structural context of the dataset. This means that the dataset can be constructed from various types of data, from structured to non-structured data. Structured data is data that is structured correctly and requires no cleansing methods. Non-structured data is data which may contain inconsistent, incorrect, or missing entries. Datasets can contain both types of data. There are various types of software available to cleanse the data and amend any missing or inconsistent data within the datasets (Gandomi and Haider, 2015).
Velocity The speed or frequency at which the data can be generated is the velocity. Collecting Big Data is not necessarily done in real-time; it can also be collected via streaming, for example a live feed on Twitter. In this way the data can be obtained as quickly as possible (Gandomi and Haider, 2015).


Big Data means that larger datasets (Volume), which consist of various types of data (Variety), can be collected at a fast pace (Velocity). There are also additional dimensions of Big Data, which are Veracity, Variability, and Value (Gandomi and Haider, 2015). Fig. 2 illustrates six simple steps to complete Big Data successfully (Gandomi and Haider, 2015).

Figure 2- Simple steps to illustrate Big Data (Li,2015)

The term Big Data refers to the data: its type, size, and rate. However, the data has no relevance until it goes through a process called Data Analytics.

4.4 Data Analytics

 

Analytics is the use of tools and techniques to analyse the data and extract any relevant data from the datasets and streaming data. Data Analytics is a term used to describe the techniques used to examine and transform the data from datasets and streaming data into relevant information which can be used to predict certain future trends.

The data can be used to produce reports by querying it, offering a prediction or interpretation of the data. For example, a dataset is located on the most popular cars bought over the last five years. The dataset is checked for inconsistencies, such as missing or incorrect data, and then cleaned. The cleaned data can be displayed in a bar chart or graph to visualize the cars bought over the last five years; a sketch of such a chart is given below. Basically, Data Analytics turns the cleansed data into actionable data or information (Hilborn and Leo, 2013). There are various types of analytics: text analytics, audio analytics (LVCSR systems and phonetic-based systems), video analytics, social media analytics, social influence analysis, and predictive analytics (Gandomi and Haider, 2015).
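
To make the example concrete, the following minimal sketch (with invented car-sales figures, purely for illustration) shows how such cleaned data could be displayed as a bar chart using Matplotlib, the charting library used later in this report.

import matplotlib.pyplot as plt

# Invented figures for illustration: most popular cars bought over five years.
cars = ["Ford Focus", "VW Golf", "Toyota Corolla", "Nissan Qashqai"]
units_sold = [5200, 4800, 4500, 3900]

positions = range(len(cars))
plt.bar(positions, units_sold)   # draw one bar per car model
plt.xticks(positions, cars)      # label each bar with its model name
plt.title("Most popular cars over the last five years (invented data)")
plt.xlabel("Model")
plt.ylabel("Units sold")
plt.tight_layout()
plt.show()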

4.4a Machine Learning:

 

Machine learning is the element of Data Science that computes the algorithms effectively to construct the data models (Mitchell, 2002). Machine learning is an artificial intelligence that allows the computer to compute without having to be explicitly programmed. Machine learning allows the development of programs that can expand and change when new data is added. Machine learning has the intelligence to predict the patterns in the data and alter the program accordingly (Meng et al., 2016).

Machine learning algorithms are categorized into three different types: supervised, unsupervised, and semi-supervised.

Supervised learning can be described as using algorithms to map input variables to output variables accordingly, and can be sub-divided into two sections: classification and regression. Unsupervised algorithms have input variables with no corresponding output variables, and can also be sub-divided into two categories: clustering and association. Semi-supervised learning is where the data is considered to sit between the supervised and unsupervised cases (Brownlee, 2016). This project will use supervised machine learning algorithms along with unsupervised ones such as clustering.

4.5 Data terms used in Data Analytics

1. Data Mining

Knowledge discovery in databases (KDD) is another name given to data mining. Data mining is a more in-depth method of analysing data from different dimensions (Ryan S J D Baker, 2010).

2. Data Cleansing

Data cleansing is the term given to the cleaning of data within datasets or huge amounts of data. When data is collected or recorded, there is always a margin of error or inconsistency within massive amounts of data. To cleanse the data, each data entry must be checked for missing or incorrect values. This can take a long time to achieve, but there are software programs available to speed up the process of cleaning the data (Maletic and Marcus, 2000). Typical cleansing steps are sketched below.
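
The cleansing steps applied later to the Food Balance dataset (making a copy, deleting columns, renaming a column, and removing NA values; Figures 56-66) follow the usual pandas pattern. A minimal sketch is given below; the file name and column names are placeholders, not necessarily those of the project’s dataset.

import pandas as pd

# Placeholder file and column names, standing in for the Food Balance dataset.
df = pd.read_csv("food_balance.csv")

clean = df.copy()                                  # work on a copy (Figure 56)
clean = clean.drop(["Unit", "Flag"], axis=1)       # delete unneeded columns (Figure 59)
clean = clean.rename(columns={"Value": "Tonnes"})  # rename a column (Figure 61)
clean = clean.dropna()                             # remove rows with missing values (Figure 63)

clean.to_csv("food_balance_clean.csv", index=False)  # save the cleaned data (Figure 68)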

3. Clustering/Cluster analysis

This method involves gathering data of the same group together into one cluster. Basically, it is the grouping of similar items into a group. The groups, or clusters, are then observed and analysed as clusters (Ketchen and Shook, 1996). A sketch of this technique is given below.
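
Appendix Section 5 demonstrates this technique with Scikit-Learn’s k-Means (Figure 90). A minimal sketch of k-Means clustering, using a small invented array of two-dimensional points in place of the project’s General Health file:

import numpy as np
from sklearn.cluster import KMeans

# Invented 2-D points standing in for the General Health file (Appendix Section 5).
points = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0],
                   [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]])

# Group the points into two clusters of similar items.
kmeans = KMeans(n_clusters=2, random_state=0).fit(points)

print(kmeans.labels_)           # the cluster assigned to each point
print(kmeans.cluster_centers_)  # the centre of each cluster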

4.6 Relationship between Big Data and Data Analytics

Every relationship has a bond, and the data is the connecting bond between Big Data and Data Analytics. Data Analytics would not be possible without Big Data, as Big Data is the first stage in the process of Data Science. Big Data, or more importantly the datasets, are not relevant until the data is processed or analysed. The analytics side of the relationship turns the data into useful or important data that can predict future trends. With the correct techniques and tools, this relationship can produce extremely productive information.

Fig. 3 is an illustration of the process between Big Data and Data Analytics (Gandomi and Haider, 2015).

Figure 3- Process of Big Data (Gandomi and Haider,2015)

4.7  Datasets – Overview

Datasets are sets of data which consist of both structured and non-structured data. Several Government Departments, Public Administration or live feeds from Twitter can create these datasets (Ermilov et al., 2013).

4.7.1 Datasets – File Extensions

The data in datasets is displayed in tabular form and is saved with a CSV or Excel file extension. Datasets saved in Excel are normally already cleansed and ready to display results in visual format. CSV files can store extremely large amounts of data, and the data must be cleansed before analysing (Ermilov et al., 2013). Inspecting a CSV dataset is sketched below.
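
Checking the size and summary statistics of a CSV dataset before cleansing (compare Figure 20 and Figures 43-44) takes only a few lines of pandas; a sketch with a placeholder file name:

import pandas as pd

# Placeholder file name; any of the project's CSV datasets could be inspected this way.
df = pd.read_csv("dataset.csv")

print(df.shape)       # number of rows and columns (compare Figure 20)
print(df.describe())  # summary statistics for numeric columns (Figures 43-44)
print(df.head())      # first few rows, to inspect the raw structure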

 

4.7.2 Datasets

The datasets will be collected using two methods:

1. Streaming data from Twitter.

2. Collecting datasets from global datasets and from U.K and Ireland government websites.

4.8 Techniques used in Data Analytics/Data Science

Big Data and Data Analytics are elements of Data Science. To implement these elements, the assumption that extensive knowledge of a programming language was needed acted as a deterrent for people trying to understand the process. Data Science requires a considerable number of algorithms to produce the visual output of predictive information (Witten and Frank, 2005).

Thus, many companies have invested a vast amount of time and money to produce software platforms where people can obtain knowledge of Data Science by following simple steps. These software packages have an easy-to-follow GUI interface that allows the user to gain knowledge of Data Science with ease and confidence (Witten and Frank, 2005).

Below is a brief outline of some software packages that can help people develop Big Data skills with little or no coding experience.

Techniques Overview Open Source
