Ideas & Talks
StockWatcher
Idea
The StockWatcher is, as its name suggests, a monitoring system for stock exchanges. It displays each company listed on the NASDAQ (or NYSE) as a circle in a two-dimensional grid. The positions of the circles (companies) in the grid capture the similarity among the companies and are computed with a Self-Organizing Map (also known as a Kohonen network). For example, companies in the lower left corner are similar to each other, and companies in the top right corner are similar to each other; they form clusters of similarity. The similarity is computed from the textual descriptions of the companies, which are fed to the Self-Organizing Map. The Self-Organizing Map outputs the clustering described above.
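To make the clustering idea concrete, here is a minimal sketch of a Self-Organizing Map in plain Python. It is not the code behind the StockWatcher (which used R). The company names, their descriptions, the bag-of-words encoding, the 4x4 grid and the learning-rate and radius schedules are all assumptions chosen for illustration.

```python
import math
import random

def bag_of_words(texts):
    """Turn each text into a term-count vector over a shared, sorted vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    return [[t.lower().split().count(w) for w in vocab] for t in texts]

def best_matching_unit(grid, vec):
    """Return the (row, col) of the grid cell whose weight vector is closest to vec."""
    cells = ((r, c) for r in range(len(grid)) for c in range(len(grid[0])))
    return min(cells, key=lambda rc: math.dist(grid[rc[0]][rc[1]], vec))

def train_som(vectors, rows=4, cols=4, epochs=200, seed=0):
    """Train a small SOM; learning rate and neighbourhood radius shrink over time."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac                           # assumed linear decay
        radius = max(rows, cols) / 2 * frac + 0.5
        for vec in rng.sample(vectors, len(vectors)):
            br, bc = best_matching_unit(grid, vec)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood: cells near the winner move more.
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2 * radius ** 2))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (vec[i] - w[i])
    return grid

# Hypothetical companies and descriptions, for illustration only.
descriptions = {
    "BankA": "retail bank loans mortgages deposits",
    "BankB": "commercial bank loans deposits credit",
    "ChipA": "semiconductor chips processors hardware",
    "ChipB": "semiconductor memory chips hardware",
}
vectors = bag_of_words(list(descriptions.values()))
grid = train_som(vectors)
positions = {name: best_matching_unit(grid, vec)
             for name, vec in zip(descriptions, vectors)}
print(positions)
```

Each company ends up at the grid cell of its best matching unit. Companies with similar descriptions tend to land in the same or neighbouring cells, which is what produces the clusters described above.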
The StockWatcher renders a circle red if its stock price decreased (monotonically) over the last N (description. To get an overview of the last trading day, one can specify N=1, set the radio button to “both” and hit “search”. The circles themselves are hyperlinks that, when clicked, forward the user to the Google Finance page of the individual stock.
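The "decreased monotonically over the last N" rule can be sketched as follows. This is an illustration, not the StockWatcher's actual code. It assumes that N counts trading days, that the input is a list of daily closing prices ordered oldest to newest, and that "monotonically" means every single day closed strictly lower than the day before. The function name is hypothetical.

```python
def decreased_monotonically(closes, n):
    """Return True if each of the last n trading days closed lower than the day before.

    closes: daily closing prices, oldest first (assumed input format).
    n:      number of trading days to look back; n=1 checks only the last day.
    """
    if n < 1 or len(closes) < n + 1:
        raise ValueError("need n >= 1 and at least n + 1 closing prices")
    window = closes[-(n + 1):]  # n day-to-day changes need n + 1 prices
    return all(later < earlier for earlier, later in zip(window, window[1:]))

print(decreased_monotonically([10.0, 11.0, 9.0], 1))
print(decreased_monotonically([10.0, 11.0, 9.0], 2))
```

With N=1 only the change on the last trading day matters, which matches the "overview of the last trading day" use case.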
Motivation
The reason I developed this system was a bet with a friend of mine. We each set up a virtual account and started with $10,000 in play money to “buy” stocks. After a certain time we would meet again for a beer and compare our investment results. The worse investor would have to pay for the drinks.
After having to pay the tab three times in a row, I wondered how I could get an overview of the market to aid my investment efforts, something like a bird’s-eye view. I came up with the StockWatcher, and it helped me to identify the “suckers” on the market and become a more efficient day trader. I got the idea of going after the “suckers” from the book The Black Swan by Nassim Taleb.
The fourth time my friend and I met to compare our investment efforts, I could finally enjoy a couple of free drinks. The thing is, I don’t know whether I was simply lucky this time or whether the StockWatcher really helped me to win.
Another motivation for creating the StockWatcher was to observe how the markets react to big unforeseeable events (like the Arab Spring or an earthquake). The idea was that the StockWatcher would show a cluster of red circles in the days after the event. That is the reason why I used a Self-Organizing Map to lay out (cluster) the companies in a two-dimensional grid.
Technologies Used
- R (programming language and software environment for statistical computing)
- Java Servlets (technology to answer HTTP requests)
- Lucene (search engine)
- Processing (JavaScript graphics library)
Please feel free to use the system.
Any feedback would be greatly appreciated (johannes [dot] liegl [at] gmail [dot] com)
Topic Classification in R
A Tutorial on Using Text Mining and Machine Learning Technologies to Classify Documents
The talk starts with a basic introduction to classification before going into the details of Support Vector Machine classification. Large-margin hyperplanes and the kernel trick are discussed; the latter is accompanied by a video. Topic classification is then introduced, together with the Reuters 21578 dataset. The rest of the talk deals with reproducing the results Joachims achieved in his 1998 paper. Document-term matrices are explained, along with different weighting schemes for them. In the second part of the presentation, some R functions and objects of the tm-package and svm-package are explained. This part is accompanied by R source code that tries to reproduce Joachims’s findings. The talk concludes with a short description of the available evaluation measures (recall, precision, confusion matrix) and a presentation of the results achieved with the R-based SVM topic model.
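As a rough illustration of what a document-term matrix and a weighting scheme look like, here is a small sketch in plain Python. It is not the R code from the talk, which uses the tm-package. The example documents are made up, and tf-idf with a natural logarithm is only one common weighting scheme, assumed here for illustration.

```python
import math
from collections import Counter

def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    counts = [Counter(d.lower().split()) for d in docs]
    return vocab, [[c[w] for w in vocab] for c in counts]

def tf_idf(matrix):
    """Weight raw counts by log(N / document frequency) of each term."""
    n_docs = len(matrix)
    n_terms = len(matrix[0])
    # Document frequency: in how many documents does each term occur?
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    return [[row[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
            for row in matrix]

# Made-up documents, for illustration only.
docs = ["wheat price rises", "oil price falls", "wheat harvest"]
vocab, dtm = document_term_matrix(docs)
weighted = tf_idf(dtm)
print(vocab)
print(dtm[0])
```

Terms that occur in few documents receive a higher weight than terms that occur in many. The weighted rows are the kind of feature vectors that would then be passed to a classifier such as an SVM.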
- R Source Code
- Presentation pptx pdf
Topic Classification in R – A Tutorial on Using Text Mining and Machine Learning Technologies to Classify Documents by Johannes Liegl is licensed under a Creative Commons Attribution 3.0 Unported License.
