Computer Project (20%): Using Excel for Statistical Analysis

EE 3341: Probability and Statistics

Professor Duncan L. MacFarlane

Assigned 10/17/2005 Due 11/28/2005 (last day of class)

 

Introduction

The purpose of this project is to provide you with practice in using Excel to perform an array of typical statistical calculations on sets of random data.

The data that will be analyzed is stock prices. While we don’t care about these data from an engineering perspective, stock prices provide price movements that are sufficiently random and are easily downloadable. (Let’s face it though, when you get hired you’ll find that most of your co-workers check stock prices daily.)

Therefore, the project in a nutshell is to:

1.      Download history of stock prices for two companies

2.      Put them into Excel

3.      Use this data to practice statistical analysis and probability calculations.

Download stock prices:

You will need to pick two stock symbols. Please pick these independently, I recommend picking two that begin with the same letters as your initials. On the web, go to a site like finance.yahoo.com or www.nasdaq.com to find and enter a symbol. Click on stock chart, then get a 5 year chart. Download the last 1025 closing prices by clicking on the chart. At both the yahoo and the Nasdaq sites there is an option to download directly into an excel file; this is quite useful.

Excel

You may either download the stock prices directly in excel format, or into a text file, which may then be opened by excel (to convert the data).

Make sure that your excel data analysis tools have been activated. These should be found under the “tools” heading. If not, click on “add-ins …” also under the “tools heading.”

For each stock, you will have a worksheet with a column of dates, daily closing prices and volumes of shares traded.

Calculations

For one of the stocks:

1.      In a column next to the original data, calculate the change in stock price from the previous day. This may be negative or positive. From 1025 closing prices you will get 1024 changes in stock price.

2.      In a column next to the change in stock price, calculate the percentage change in stock price from the previous day. Plot a histogram of this column for all 1024 days. Comment on the shape of this graph.

3.      At the bottom of each column, list the: mean, standard error, median, mode, standard deviation, sample variance, kurtosis, skewness, range, minimum, maximum, sum, count, largest(1) smallest(1) and confidence level. Provide a brief definition or description of each of these. (hint: you may use the “descriptive statistics” tool).

4.      Normalize the frequency of occurrence data for the percentage change in stock price in order to obtain a probability mass function (pmf).

5.      Calculate histograms of the percentage change in stock price PMF, and fit these histograms with a Gaussian for: (hint: calcuate the mean and the variance for the data sets and put these into the formula on page 242 of the text). (hint2: there is a correction to consider in accounting for the Gaussian (a continuous distribution) and the excel data (a discrete distribution).

a.       8 arbitrarily selected values

b.      16 arbitrarily selected values

c.       32 arbitrarily selected values

d.      128 arbitrarily selected values

e.       256 arbitrarily selected values

f.        all 1024 values

g.       Calculate the sum of the square of the difference between the data and the fit for each case and plot this sum as a function of the number of value. Comment on the quality of the fit as N increases.

6.      Using the PMF for 1024 daily percentage changes, answer the following questions:

a.       If you invest $1000 what is the probability that you will make $5 in one day?

b.      If you invest $1000 what is the probability that you will lose money in one day?

c.       If you invest $1000 what is the probability that you will make more than $50 in 20 days?

7.      Given a day where every stock on Wall Street rises, what is the probability that your $1000 investment late the day before is worth:

a.       $1005

b.      $1100 or more.

8.      Now weight (multiply the two data sets together) the percentage change in stock price by the volume of shares traded that day.

a.       Calculate the histogram of this derived random variable.

b.      List the: mean, standard error, median, mode, standard deviation, sample variance, kurtosis, skewness, range, minimum, maximum, sum, count, largest(1) smallest(1) and confidence level.

c.       What is the probability of an increase in stock price given the 10 busiest days?

9.      EXTRA CREDIT: Take a Fourier analysis of the 1024 values of the closing price. Plot the magnitude of the results. Comment on this graph.

For the pair of stocks:

1.      Calculate the correlation.

2.      Calculate the covariance.

3.      Comment on these results.

4.      You invest $500 in each of these stocks. After 1 day,

a.       What is your total expected return?

b.      What is your total expected variance on the return?

5.      You invest $500 in each of these stocks. After 10 days,

a.       What is your expected return?

b.      What is your expected variance on the return?

Deliverables

Please email to me (dlm@utdallas.edu) an excel spreadsheet containing your project, including all comments and discussion. Alternatively, you may hand in a properly labeled CD during class on Monday, November 28, 2005.

I will grade for:

1.      Completeness.

2.      Clarity of work. (make sure I can follow what calculations you performcd)

3.      Interpretations.

4.      Format. (A neat professional workbook with properly labeled tabs, etc that is easy to grade will be looked well upon).

Electronic submission is a fact and convenience of modern life. I fully expect this to be a reliable mechanism. You, however, are ultimately responsible for making sure I receive the file, and can open it. I will email you promptly if I receive the file and can not open it, and we can work diligently to resolve any situation. To this end, handing in the project early is welcome. Undue problems may affect the format grade. Importantly, please check your computer for viruses before submission. A computer virus will result in a 0 for the project grade.