Our business runs a stock advisory service. Some of our clients trade our recommended stocks at a loss by buying at a high price and selling at a low price. We need some very simple trading models with comparative results to show clients how they can grow the value of their investment accounts with our stock picks.
Data mining - especially the pre-processing, preliminary data analysis, and reporting steps - can be very helpful for addressing your problem. This tip examines six stocks, such as those that might be recommended by your firm, to present the outcome of three different trading strategies. The stock trading strategies are purposely simple so that they can be readily programmed with T-SQL as well as understood by your stock trading clients. All strategies covered are consistent with showing clients how they can grow the value of their investment accounts.
The source data for the tip are stock prices from the Google Finance site. These data are available for free. The tip illustrates how to download stock prices to csv files. Then, the data are transferred to SQL Server tables.
After transferring the stock price data to a SQL Server database, they are pre-processed to make them suitable for examining how each of the three stock trading strategies perform. One stock trading strategy depends on moving averages for stock prices. This tip includes a simple explanation of moving averages as well as an easy way to compute them for stock price data.
The final tip element compares gains and losses from the three the trading strategies. The comparisons are for the six stocks individually and overall. Comparisons are computed on a per share basis as well share lot basis. A share lot is a set of shares for a stock that are bought and sold as a unit.
Downloading the data from the Google Finance site
You can download historical end-of-day price and volume for a stock from the Google Finance site. Simply enter a URL with parameters, including the stock's symbol as well as the start and end dates for the range of data that you seek. You also need to specify the output format. Your browser, such as Chrome or Internet Explorer, will retrieve the daily stock prices and volumes to a file on your computer.
The following URL specifies the retrieval of historical price and volume data for a stock with the symbol crus. The data starts as of the first trading day on or after January 1, 2008 through to the last trading day for which data are available up until August 31, 2016. The download was taken during the morning on August 24, 2016. The Google Finance site automatically names the downloaded file with the symbol specified in the URL (for example, crus.csv).
[code]http://www.google.com/finance/historical?q=crus&startdate=Jan+1%2C+2008&enddate=Aug+31%2C+2016&output=csv[/code] Here's a screen shot from Excel showing the first 20 rows of data in the csv file named crus.csv. Notice that there is a separate row for each trading day ending on August 23, 2016, the last trading day for which end-of-day data was available. Aside from the Date column, there are four columns specifying money values (Open, High, Low, Close) and a fifth column shows shares traded on a date for the stock designated by the symbol. While Excel automatically transforms the character data in the crus.csv file to date and numeric values, it is important to keep in mind that the data in the csv file are character data.
Data for six stock symbols were downloaded for this tip. The following bullets show and the symbols along with the corresponding company names and short descriptions. You can see from the descriptions that great diversity is readily available.
crus is for Cirrus Logic, Inc. The company supplies circuits for consumer, automotive and professional audio and energy applications
edu is for New Oriental Education & Tech Grp (ADR). This symbol represents an ADR for New Oriental Education & Technology Group -- a company operating in China that is a provider of private educational services. An ADR is a stock that trades in the United States but represents a specified number of shares in a foreign corporation
meet is for MeetMe, Inc. The company makes available social networks for meeting new people in the US and in Latin America
orly is for O'Reilly Automotive Inc, a retailer of automotive parts and accessories
stmp is for Stamps.com, a provider of internet-based mailing and shipping services
ulta is for Ulta Salon, Cosmetics & Fragrance, Inc. This firm sells cosmetics, fragrances, skin and hair care products, appliances, and accessories. The company also offers hair salon services and spa treatments
Migrating the csv files to SQL Server tables
This tip creates the tables to store the csv file contents in a database named stock_history_data. Then, it populates the SQL Server tables from the downloaded csv files - one table per file. The code provided for the tip will not work as is unless you have the stock_history_data database created on your SQL Server instance. If you do not have the database on your server, running the following script can create the database.
IF EXISTS(select * from sys.databases where name='stock_history_data')
DROP DATABASE stock_history_data
CREATE DATABASE stock_history_data[/code] Because the data in the csv files are character-based, you must transform them before you can use them as dates, money, or integers in SQL Server. There are several ways to perform the transformations. A prior tip working with S&P 500 index historical prices ( Creating an SSRS Matrix Sparkline Report with Data from Oracle 11g ) describes an approach in which the data are imported to SQL Server as character data and then transformed via T-SQL functions to date, money, or integer values. This tip demonstrates how to use built-in SSIS transformation features to convert the data to an appropriate data type.
The following screen shot shows a Control Flow view of the SSIS project for data mining the stock price data downloaded from the Google Finance site. Some annotation text and two steps are highlighted. The highlighted content is for importing the downloaded data into SQL Server.
Additionally, there are 7 connection managers displayed below the control flow area.
One is and OLEDB connection manager pointing at the stock_history_data database
Six are for flat file connection managers that point at the downloaded csv files
The following screen shot shows the T-SQL in the create orly ohlcv table Execute SQL Task within the Create stock history ohlcv tables Sequence Container. There are six Execute SQL Task steps in the container - one for each table to receive downloaded data from a csv file. All tables have the same specification except for the symbol name.