Do some Excel Work 3

Closed Posted Mar 14, 2015 Paid on delivery

I have a lot of data and I need the following done:

Data set available from Moodle (data originate from UCI repository)

a) Summarise the data

What is the dimensionality of the data? What are the min, median, max, mean, standard deviation and percentage missing data of each feature?
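As a rough illustration of what part a) asks for, here is a minimal Python sketch using only the standard library. It is not the Excel workflow the brief expects; the toy rows and the use of `None` to mark a missing value are assumptions made for the example.

```python
import statistics

def summarise(column):
    """Summarise one feature; None marks a missing value (an assumption)."""
    present = [v for v in column if v is not None]
    return {
        "min": min(present),
        "median": statistics.median(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "std": statistics.stdev(present),  # sample standard deviation (n - 1)
        "pct_missing": 100.0 * (len(column) - len(present)) / len(column),
    }

# Hypothetical toy data: each row is a record, each column a feature.
rows = [
    [0.5, 0.4],
    [0.7, None],
    [0.3, 0.6],
    [None, 0.2],
    [0.5, 0.8],
]

# Dimensionality: number of records by number of features.
n_rows, n_features = len(rows), len(rows[0])

# Transpose rows into columns and summarise each feature.
columns = list(zip(*rows))
summary = [summarise(col) for col in columns]

for index, stats in enumerate(summary):
    print(index, stats)
```

The statistics are computed over the present values only, so the percentage missing should be reported alongside them.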

b) Impute missing values

Use replacement by mean and replacement by median to fill in missing values. Display the min, median, max, mean and standard deviation for the data with imputations. Justify which imputation method is more suitable.
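A small sketch of the two imputation strategies, again in plain Python. The column values are hypothetical and chosen to include one outlier, and `None` is assumed to mark a missing entry.

```python
import statistics

def impute(column, strategy="median"):
    """Fill None entries with the mean or median of the present values."""
    present = [v for v in column if v is not None]
    if strategy == "median":
        fill = statistics.median(present)
    else:
        fill = statistics.mean(present)
    return [fill if v is None else v for v in column]

# Hypothetical feature with one missing value and one outlier (100.0).
column = [1.0, 2.0, None, 3.0, 100.0]

by_mean = impute(column, "mean")      # missing value becomes 26.5
by_median = impute(column, "median")  # missing value becomes 2.5

print(by_mean)
print(by_median)
```

In this toy column the outlier pulls the mean far from the typical values, while the median stays close to them. Whether that matters for the real data depends on how skewed each feature is.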

c) Visualise/transform the data

Use the data as transformed from part 1 (mean centered, median imputed).

a) Cluster the data

Apply your choice of clustering algorithm (out of k-means, FarthestFirst, HierarchicalClusterer, EM) to create 10 clusters and explain the results. Justify your choice of clustering technique. Compare the cluster results to the Class1 attribute and calculate the accuracy. Include screenshots of the clustering options and the clustering results.
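The brief does not say how clustering accuracy should be computed. One common convention, assumed here, is to label each cluster with its majority class and count the records that match. The cluster assignments and class labels below are hypothetical.

```python
from collections import Counter, defaultdict

def cluster_accuracy(cluster_ids, class_labels):
    """Label each cluster with its majority class, then score the matches."""
    members = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, class_labels):
        members[cluster_id].append(label)
    # For each cluster, count how many records carry the majority class.
    correct = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return correct / len(class_labels)

# Hypothetical output of a clustering run and the matching Class1 values.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
classes = ["CYT", "CYT", "NUC", "MIT", "MIT", "NUC", "NUC", "CYT"]

accuracy = cluster_accuracy(clusters, classes)
print(accuracy)  # 0.75
```

Note that this convention lets several clusters map to the same class, so the score tends to rise as the number of clusters grows.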

b) Apply PCA to reduce features

Implement principal component analysis to reduce the number of features. Justify a suitable choice for the number of principal components to use. Implement the same clustering technique as used in a) after PCA and calculate the clustering accuracy. Include screenshots of the PCA options, the PCA results and the clustering results.
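One way to justify the number of principal components is to keep the smallest number that explains a chosen share of the total variance. The sketch below assumes the eigenvalues of the covariance matrix are already available from the PCA output; the eigenvalues and the 90% threshold are hypothetical.

```python
def components_needed(eigenvalues, threshold=0.95):
    """Smallest number of components whose cumulative variance share
    reaches the threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(eigenvalues)

# Hypothetical eigenvalues, one per principal component.
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.3, 0.2]

k = components_needed(eigenvalues, threshold=0.90)
print(k)  # 4
```

The threshold itself is a judgment call, so it should be stated and defended in the write-up rather than taken as given.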

c) Conclusions

Comment on the difference in performance (accuracy) between the clustering in parts a) and b) and explain why this occurred.

Use the data as transformed from part 1 (mean centered, median imputed). Train the classifiers using 2/3 of the data from step 1 and test the classifiers by applying them to the remaining 1/3 of the data from step 1. In this part you will be predicting the Class2 feature of the data (binary classification: CYT or non-CYT) using the first 8 features (mcg-nuc).

a) Classification

Try the following 5 classifiers: Naive Bayes, k-NN (k=5 and k=10), logistic regression and the C4.5 decision tree. What are the algorithms' accuracies on the test data? Explain the results.
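To make the 2/3 and 1/3 split and the accuracy measure concrete, here is a minimal sketch with a hand-written k-NN classifier. The two-feature toy data, the fixed shuffle seed and the function names are assumptions for illustration; the real task uses the 8 features mcg to nuc.

```python
import math
import random
from collections import Counter

def split_two_thirds(rows, seed=0):
    """Shuffle a copy of the rows and cut it into 2/3 train, 1/3 test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]

def knn_predict(train, features, k):
    """Majority label among the k training rows nearest to `features`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k):
    """Share of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

# Hypothetical, well-separated toy data: (features, Class2 label).
rows = [((i * 0.1, i * 0.05), "CYT") for i in range(18)]
rows += [((10 + i * 0.1, 10 + i * 0.05), "non-CYT") for i in range(18)]

train, test = split_two_thirds(rows)
for k in (5, 10):
    print(k, accuracy(train, test, k))
```

On real data the two values of k will usually give different accuracies, which is part of what the brief asks to be explained.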

b) Ensembles

Create a stacker ensemble: Use the output for each of the previous classifiers as features into a new classifier of your choice (this may require changing your train/test split). Illustrate what is being done and give an example of how it works. How does the performance compare with each single classifier?
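The stacking idea can be sketched as follows: each base classifier's prediction becomes one feature, and a second classifier is trained on those predictions using rows the base classifiers were not fitted on. Everything here is hypothetical: the base classifiers are simple threshold rules rather than the five algorithms above, and the meta-classifier is a plain lookup table chosen only to keep the example short.

```python
from collections import Counter, defaultdict

def fit_meta(base_outputs_list, true_labels):
    """Meta-classifier as a lookup table: for each combination of base
    predictions, remember the most frequent true label seen with it."""
    votes = defaultdict(Counter)
    for outputs, label in zip(base_outputs_list, true_labels):
        votes[tuple(outputs)][label] += 1
    table = {combo: c.most_common(1)[0][0] for combo, c in votes.items()}
    # Fall back to the overall majority label for unseen combinations.
    default = Counter(true_labels).most_common(1)[0][0]
    return lambda outputs: table.get(tuple(outputs), default)

# Hypothetical base classifiers (stand-ins for already trained models).
base_classifiers = [
    lambda x: "CYT" if x[0] > 0.5 else "non-CYT",
    lambda x: "CYT" if x[1] > 0.5 else "non-CYT",
    lambda x: "CYT" if x[0] + x[1] > 1.2 else "non-CYT",
]

def base_outputs(x):
    """Predictions of every base classifier for one record."""
    return [clf(x) for clf in base_classifiers]

# Held-out rows, not used to fit the base classifiers, train the stacker.
meta_train = [
    ((0.9, 0.8), "CYT"),
    ((0.7, 0.2), "non-CYT"),
    ((0.2, 0.9), "non-CYT"),
    ((0.1, 0.1), "non-CYT"),
    ((0.8, 0.6), "CYT"),
]
stacker = fit_meta(
    [base_outputs(x) for x, _ in meta_train],
    [y for _, y in meta_train],
)

prediction = stacker(base_outputs((0.95, 0.7)))
print(prediction)  # CYT
```

The held-out set for the meta-classifier is why the brief notes that the train/test split may need to change: reusing the base classifiers' own training rows would leak their fitted labels into the second stage.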

c) Conclusions

What are the potential issues/limitations with stacking?

If you can do this, then bid; otherwise, please don't waste my time.

Data Processing Excel

Project ID: #7306603

About the project

13 proposals Remote project Active Apr 20, 2015

13 freelancers are bidding on average $124 for this job

rajafaizan

Hello sir, I provide very fast work and guarantee 100% satisfaction. My expertise lies in quality service and on-time delivery. If you want high-quality work, kindly contact me as soon as possible. Please gi...

$222 AUD in 3 days
(11 Reviews)
3.2
citijayamala

I am an expert in Excel and an experienced MBA in finance. I can assist with all the points mentioned in your project description. Jayamala from India. Hire me.

$155 AUD in 3 days
(4 Reviews)
2.1
qu3ntin0

A proposal has not yet been provided

$155 AUD in 3 days
(0 Reviews)
0.0
saimapervaiz1987

A proposal has not yet been provided

$88 AUD in 3 days
(0 Reviews)
0.0
miree63

A proposal has not yet been provided

$133 AUD in 3 days
(0 Reviews)
0.0
mala41

Hello sir, I have 5 years of experience and have been working at Innova Database for the last 4 years. I have a lot of experience with Excel and have already completed Excel projects. I can do this easily.

$111 AUD in 3 days
(0 Reviews)
0.0
rahulnegi004

I am a Computer Science Engineer and a certified Microsoft Office Specialist. I will give you the best output. Contact me for more details.

$56 AUD in 1 day
(0 Reviews)
0.0
mamundaimond

Dear sir, I have sufficient skills to match your project and more than 6 years of experience in MS Word, Excel and data processing tasks. I am committed to completing your project in the expected time frame with...

$66 AUD in 3 days
(0 Reviews)
0.0
daudul

A proposal has not yet been provided

$155 AUD in 3 days
(0 Reviews)
0.0
aarwani

A proposal has not yet been provided

$155 AUD in 3 days
(0 Reviews)
0.0
ihebunandu

We are experts in copy typing, banner creation, word processing, web search, case studies, Photoshop, PowerPoint, HTML, data processing, eBook design, logo design, proofreading, world translation, sending traffic to si...

$155 AUD in 3 days
(0 Reviews)
0.0
ismailsakib

A proposal has not yet been provided

$111 AUD in 3 days
(0 Reviews)
0.0