Friday, August 7, 2015

Sunday, June 7, 2015

Quantifying the Efficacy of Machine Learning Algorithms

There are many articles, books, and talks making claims about machine learning algorithms. Of those, how many actually QUANTIFY THE EFFICACY OF THE ALGORITHM?

Below are some references which will hopefully save people some leg work and help others quantify the performance of their algorithms.
  1. Articles
  2. Books
    1. Assessing and Improving Prediction and Classification by Timothy Masters
    2. Evaluating and Comparing the Performance of Machine Learning Algorithms by Melanie Mitchell
    3. Evaluating Learning Algorithms: A Classification Perspective by Japkowicz, Shah
      1. Amazon
        1. Customer Reviews
          1. Howard B. Bandy on July 7, 2014
            1. ... examples presented are all of stationary data ...
        2. Cambridge.Org
          1. Looking for an examination copy?
            1. If you are interested in the title for your course we can consider offering an examination copy. To register your interest please contact collegesales@cambridge.org providing details of the course you are teaching.
          2. Google
            1. Has table of contents and you can look at some randomly selected pages
            2. MohakShah.Com
              1. Email Address: eval@mohakshah.com
                1. For electronic editions, ...
                  1. Computing Resources
              2. Evaluation and Analysis of Supervised Learning Algorithms and Classifiers by Niklas Lavesson
                1. Machine Learning and Data Mining: 14 Evaluation and Credibility by Pier Luca Lanzi

              Sunday, May 24, 2015

              Python Books / Videos on Algorithms and Math

              Below is a list of references on Python that are related to algorithms or mathematics. The hope is to save others some leg work. For a complete list of Python books and videos, click here.
              1. Algorithms & Data Structures in Python by S Jagannathan, N Sinenian
              2. Annotated Algorithms in Python by M Di Pierro
              3. Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference by C Davidson-Pilon
              4. Building Machine Learning Systems with Python by W Richert, L Pedro Coelho
              5. Building Probabilistic Graphical Models with Python by K R Karkera
              6. Computational Physics by M Newman
              7. Data Structure and Algorithmic Thinking with Python by N Karumanchi
              8. Data Structures and Algorithms Using Python by R D Necaise
              9. Data Structures and Algorithms: Using Python and C++ by D M Reed, J Zelle
              10. Data Structures and Algorithms in Python by M T Goodrich, R Tamassia, ...
              11. Data Structures and Algorithms with Python by K D Lee, S Hubbard
              12. Doing Math with Python by Amit Saha
              13. Equilibrium Statistical Physics: with Computer simulations in Python by Leonard M. Sander
              14. Image Processing and Acquisition using Python by R Chityala, S Pudipeddi
              15. Introduction to Machine Learning with Python by S Guido
              16. Introduction to Numerical Programming: A Practical Guide for Scientists and Engineers Using Python and C/C++ by T A Beu
              17. Learning scikit-learn: Machine Learning in Python by R Garreta, G Moncecchi
              18. Machine Learning in Python: Essential Techniques for Predictive Analysis by M Bowles
              19. Marketing Data Science: Modeling Techniques in Predictive Analytics with R and Python by T W Miller
              20. Mathematics and Python Programming by J C Bautista
              21. Mathematics for the Digital Age and Programming in Python by M Litvin, G Litvin
              22. Modeling Techniques in Predictive Analytics with Python and R by T W Miller
              23. Numerical Methods in Engineering with Python by J Kiusalaas
              24. OpenCV Computer Vision with Python by J Howse
              25. Parallel Programming with Python by J Palach
              26. Primer on Scientific Programming with Python by H P Langtangen
              27. Problem Solving with Algorithms and Data Structures Using Python by B N Miller, D L Ranum
                1. Edition: 2
              28. Programming and Mathematical Thinking: A Gentle Introduction to Discrete Math Featuring Python by A M Stavely
              29. Programming Computer Vision with Python: Tools and algorithms for analyzing images by J E Solem
              30. Python Algorithms: Mastering Basic Algorithms in the Python Language by M L Hetland
              31. Python for Signal Processing: Featuring IPython Notebooks by J Unpingco
              32. Python for Scientists by John M. Stewart
              33. Python Scripting for Computational Science by H P Langtangen
              34. Scientific Computation: Python Hacking for Math Junkies by B E Shapiro
              35. Statistics, Data Mining, and Machine Learning in Astronomy by Z Ivezic, A Connolly, ...
              36. Think DSP - Digital Signal Processing in Python by A B Downey

              Saturday, May 16, 2015

              Regular Expressions

              Below is a list of references related to regular expressions. The hope is to save others some leg work.
              1. Books
                1. Automate the Boring Stuff with Python - Practical Programming for Total Beginners by Albert Sweigart
                  1. O'Reilly
                    1. Part II. Automating Tasks
                      1. Chapter 6: Pattern Matching with Regular Expressions
                  2. Beginning Regular Expressions by Andrew Watt
                  3. Introducing Regular Expressions - Unraveling regular expressions, step-by-step by Michael Fitzgerald
                  4. Mastering Python Regular Expressions by Felix Lopez, Victor Romero
                  5. Mastering Regular Expressions - Understand Your Data and Be More Productive by Jeffrey E.F. Friedl
                    1. Edition: 3
                  6. Oracle Regular Expressions Pocket Reference by Jonathan Gennick, Peter Linsley
                  7. Regular Expression Pocket Reference - Regular Expressions for Perl, Ruby, PHP, Python, C, Java and .NET by Tony Stubblebine
                    1. Edition: 2
                  8. Regular Expressions Cookbook - Detailed Solutions in Eight Programming Languages by Jan Goyvaerts, Steven Levithan
                    1. Edition: 1
                    2. Edition: 2
                2. Videos
                  1. Learning Regular Expressions by Mike McMillan
                3. RegExr.Com (GitHub.Com): An online, real-time regular expression sandbox tool that lets you visually tweak, undo, redo, save, and share directly in the browser by Grant Skinner (Twitter.Com).

                Saturday, May 9, 2015

                Newsvendor Problem / Newsboy Problem

                Below is a list of references related to newsvendor problem / newsboy problem. The hope is to save others some leg work.
                1. Books
                  1. Building Intuition - Insights From Basic Operations Management Models and Principles Editors: Dilip Chhajed, Timothy J. Lowe (2008)
                    1. Chapter 7: The Newsvendor Problem by Evan L. Porteus
                      1. Springer.Com
                      2. Handbook of Newsvendor Problems - Models, Extensions and Applications - Editors: Choi, Tsan-Ming (Jason) (Ed.) (2012)
                      3. Perishable Inventory Systems - Authors: Nahmias, Steven (2011)
                        1. The book’s ten chapters first cover the preliminaries of periodic review versus continuous review and look at a one-period newsvendor perishable inventory model.
                          1. Springer.Com
                          2. Inventory Management and Production Planning and Scheduling (1998)
                          3. Management Science: An Introduction to the Use of Decision Models by Kenneth R. Baker, Dean H. Kropp (1985)
                          4. Principles of Sequencing and Scheduling by Kenneth R. Baker, Dan Trietsch
                        2. Videos
                          1. Newsvendor Problem 1 by Piyush Shah (2014)
                        3. Articles
                          1. Analysis of the multi-product newsboy problem with a budget constraint by Layek L. Abdel-Malek, Roberto Montanari (2005)
                          2. Benchmark solution for the risk-averse newsvendor problem by Baruch Keren, Joseph S. Pliskin - European Journal of Operational Research (2006)
                          3. Binary solution method for the multi-product newsboy problem with budget constraint by Bin Zhang, Xiaoyan Xu, Zhongsheng Hua (2009)
                          4. Capacitated newsboy problem with random yield: The Gardener Problem by Layek Abdel-Malek, Roberto Montanari, Diego Meneghetti (2008)
                          5. Channel coordination in supply chains with agents having mean-variance objectives by Tsan-Ming Choi, Duan Li, Houmin Yan, Chun-Hung Chiu (2008)
                          6. Competitive multiple-product newsboy problem with partial product substitution by Di Huang, Hong Zhou, Qiu-Hong Zhao (2011)
                          7. Comprehensive Analysis of the Newsvendor Model with Unreliable Supply by Yacine Rekik, Evren Sahin, Yves Dallery (2009)
                          8. Distribution-free newsboy problem with resalable returns by Julien Mostard, René de Koster, Ruud Teunter (2005)
                          9. Distribution-free newsboy problem: Extensions to the shortage penalty case by Hesham K. Alfares, Hassan H. Elmorra (2005)
                          10. Encyclopedia of Operations Research and Management Science
                          11. Exact, approximate, and generic iterative models for the multi-product Newsboy problem with budget constraint by Layek Abdel-Malek, Roberto Montanari, Libia Cristina Morales (2004)
                          12. Extended newsboy problem with shortage-level constraints by M.S. Chen, C. C. Chuang (2000)
                          13. Fuzzy models for the newsboy problem by Dobrila Petrović, Radivoj Petrović, Mirko Vujošević (1996)
                          14. Fuzzy multi-product constraint newsboy problem by Zhen Shao, Xiaoyu Ji (2006)
                          15. Fuzzy newsvendor approach to supply chain coordination by Kwangyeol Ryu, Enver Yücesan (2010)
                          16. IE324 Simulation course - Newsvendor Problem by Kagan Gokbayrak
                            1. To go to the newsvendor spreadsheet, click here
                            2. Impact of loss aversion on the newsvendor game with product substitution by Wei Liu, Shiji Song, Cheng Wu (2013)
                            3. Loss-averse newsvendor game by Charles X. Wang - International Journal of Production Economics (2010)
                            4. Loss-averse newsvendor problem by Charles X. Wang, Scott Webster (2009)
                            5. Mean-Variance Newsvendor Model with a Background Risk by Jiang-feng Li, Qiong Wu (2015)
                              1. Springer.Com
                                1. This paper examines the effects of an additive background risk on the optimal order quantity of a risk-averse newsvendor with Mean-Variance utility.
                              2. Mean–variance analysis of the newsvendor model with stockout cost by Jun Wu, Jian Li, Shouyang Wang, T.C.E. Cheng (2009)
                              3. Model and algorithm for bilevel newsboy problem with fuzzy demands and discounts by Xiaoyu Ji, Zhen Shao (2006)
                              4. Multi-period newsboy problem by Keisuke Matsuyama - European Journal of Operational Research (2006)
                              5. Multi-product budget-constrained acquisition and pricing with uncertain demand and supplier quantity discounts by Jianmai Shi, Guoqing Zhang (2010)
                                1. ScienceDirect.Com
                                  1. We consider the joint acquisition and pricing problem where the retailer sells multiple products with uncertain demands and the suppliers provide all unit quantity discounts. The problem is to determine the optimal acquisition quantities and selling prices so as to maximize the retailer’s expected profit, subject to a budget constraint. This is the first extension to consider supplier discounts in the constrained multi-product newsvendor pricing problem. We establish a mixed integer nonlinear programming (MINLP) model to formulate the problem, and develop a Lagrangian-based solution approach. Computational results for the test problems involving up to thousand products are reported, which show that the proposed approach can obtain high quality solutions in a very short time.
                                2. Multi-product constrained newsboy problem with progressive multiple discounts by Moutaz Khouja, Abraham Mehrez (1996)
                                3. Multi-product multi-constraint newsboy problem: Applications, formulation and solution by Hon-Shiang Lau, Amy Hing-Ling Lau (1995)
                                4. Multi-product newsboy problem with limited capacity and outsourcing by Bin Zhang, Shaofu Du (2010)
                                5. Multi-product newsboy problem with supplier quantity discounts and a budget constraint by Guoqing Zhang (2010)
                                6. Multi-product newsboy problem with two constraints by Layek L. Abdel-Malek, Roberto Montanari (2005)
                                7. Multi-stage newsboy problem: A dynamic model by Konstantin Kogan, Sheldon Lou (2003)
                                8. Multiple-item budget-constraint newsboy problem with a reservation policy by Liang-Hsuan Chen, Ying-Che Chen (2010)
                                9. Newsboy problem under progressive multiple discounts by Moutaz Khouja - European Journal of Operational Research (1995)
                                10. Newsboy problem with a simple reservation arrangement by Liang-Hsuan Chen, Ying-Che Chen (2009)
                                11. Newsboy problem with reactive production by Chia-Shin Chung, James Flynn (2001)
                                12. Newsboy problem with resalable returns: A single period model and case study by Julien Mostard, Ruud Teunter (2006)
                                13. Newsstand problem: A capacitated multiple-product single-period inventory problem by Hon-Shiang Lau, Amy Hing-Ling Lau (1996)
                                14. Newsvendor problem: Review and directions for future research by Yan Qin, Ruoxuan Wang, Asoo J. Vakharia, Yuwen Chen, Michelle M.H. Seref (2011)
                                15. Newsvendor Problems by Hayriye Ayhan, Jim Dai, Joe Wu (2003)
                                16. Newsvendor solutions via conditional value-at-risk minimization by Jun-ya Gotoh, Yuichi Takano (2007)
                                17. Optimal feeding buffers for projects or batch supply chains by an exact generalization of the newsvendor model by Trietsch, Dan (2006)
                                18. Portfolio approach to multi-product newsboy problem with budget constraint by Bin Zhang, Zhongsheng Hua (2010)
                                19. Quadratic programming approach to the multi-product newsvendor problem with side constraints by Layek L. Abdel-Malek, Nathapol Areeratchakul (2007)
                                20. Reordering strategies for a newsboy-type product by Hon-Shiang Lau, Amy Hing-Ling Lau (1997)
                                21. Robust multi-item newsboy models with a budget constraint by George L Vairaktarakis (2000)
                                22. Simple formulas for the expected costs in the newsboy problem: An educational note by Hon-Shiang Lau (1997)
                                23. Single-item newsboy problem with dual performance measures and quantity discounts by Chen-Sin Lin, Dennis E. Kroll (1997)
                                24. Single-period (news-vendor) problem: literature review and suggestions for future research by Moutaz Khouja (1999)
                                25. Supply chain coordination with manufacturer's limited reserve capacity: An extended newsboy problem by Jianli Li, Liwen Liu (2008)
                                26. Two-item newsboy problem with substitutability by Moutaz Khouja, Abraham Mehrez, Gad Rabinowitz (1996)
                                27. Using separable programming to solve the multi-product multiple ex-ante constraint newsvendor problem and extensions by Julie A. Niederhoff (2007)
                                28. Wikipedia: Newsvendor model
                                  1. Would a risk-averse newsvendor order less at a higher selling price? by Charles X. Wang, Scott Webster, Nallan C. Suresh (2009)

                                Monday, January 12, 2015

                                Mathematical Optimization Under Uncertainty

                                As part of my self-education on mathematical optimization, I have been working my way through the book Optimization Modeling with Spreadsheets, Second Edition by Kenneth R. Baker (Amazon). I really like the book because it focuses on building models for optimization. Also, it does not bog the reader down with programming languages or fancy solvers. Instead, it uses spreadsheets and their built-in solver, both of which are readily available and familiar to most people. Implementing the models in a spreadsheet ensures that the concepts are actually understood. It is too easy to fall into the trap of "I read it and therefore OF COURSE I understood it."

                                As I was working my way through the book, I came across "just an appendix" titled "Stochastic Programming". Stochastic programming is techno-babble for mathematical optimization under uncertainty. It was like, wow, you can actually perform quantifiable optimization that takes into account the probability of various events happening. After all, in the real world, there are scenarios and associated probabilities. There is no "x is definitely going to happen." Yeah, I know, insert joke about death and taxes.

                                Let's actually walk through the examples in the appendix. The manufacturer has to build two types of refrigerators: standard and deluxe. The goal is to optimize profit given the manufacturing constraints. In scenario 1 there is a known demand of 80 units for the standard model and at least 25 units for the deluxe model. After the optimization is performed, 80 standard models along with 70 deluxe models are produced to yield a profit of $6,100. In scenario 2 there is a known demand of 104 units for the standard model and at least 25 units for the deluxe model. After the optimization is performed, 104 standard models along with 44 deluxe models are produced to yield a profit of $6,520. There is a similar situation for the third scenario.

                                Now, let's take a step back and assume that scenario 1 has a probability of 0.2 and scenario 2 has a probability of 0.5. Since the probabilities must sum to 1, scenario 3 has to have a probability of 0.3. When the optimization is performed taking these probabilities into account, a profit of $6,360 is generated. Notice that this profit is not that much different from what you would get if you had guessed which scenario was going to happen.
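To make the idea of "optimizing with probabilities" concrete, here is a minimal Python sketch. Only the three scenario probabilities come from the text above. The demands, capacity, and margins are hypothetical toy values chosen for illustration; this is not the model from the book's appendix and it does not reproduce the $6,360 figure. It simply tries every production plan and keeps the one with the highest probability-weighted profit.

```python
# Scenario probabilities quoted in the text above.
PROBABILITY = {1: 0.2, 2: 0.5, 3: 0.3}

# Everything below is HYPOTHETICAL toy data, not taken from the book.
STANDARD_DEMAND = {1: 60, 2: 90, 3: 120}  # standard-model demand per scenario
CAPACITY = 150                # total units that can be built
MIN_DELUXE = 25               # minimum number of deluxe units
MARGIN_STANDARD_SOLD = 45     # profit per standard unit that is sold
LOSS_STANDARD_UNSOLD = 20     # loss per standard unit left unsold
MARGIN_DELUXE = 30            # profit per deluxe unit (assumed to always sell)


def scenario_profit(standard, deluxe, demand):
    """Profit of a production plan if the given standard demand materializes."""
    sold = min(standard, demand)
    unsold = standard - sold
    return (MARGIN_STANDARD_SOLD * sold
            - LOSS_STANDARD_UNSOLD * unsold
            + MARGIN_DELUXE * deluxe)


def expected_profit(standard, deluxe):
    """Probability-weighted profit of one plan across all scenarios."""
    return sum(
        PROBABILITY[s] * scenario_profit(standard, deluxe, STANDARD_DEMAND[s])
        for s in PROBABILITY
    )


def best_plan():
    """Brute-force search over every feasible (standard, deluxe) plan."""
    candidates = [
        (standard, deluxe)
        for deluxe in range(MIN_DELUXE, CAPACITY + 1)
        for standard in range(0, CAPACITY - deluxe + 1)
    ]
    return max(candidates, key=lambda plan: expected_profit(*plan))


if __name__ == "__main__":
    plan = best_plan()
    print("best plan (standard, deluxe):", plan)
    print("expected profit:", expected_profit(*plan))
```

A spreadsheet solver does the same job far more efficiently than brute force. The point of the sketch is only that the objective being maximized is the sum of each scenario's profit weighted by its probability.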

                                You can get even fancier and add extra resources once you actually know the scenario. In this case, the optimization tells you to first produce 98 units of the standard model and 52 units of the deluxe model. When you determine the actual scenario, add extra resources and:
                                1. If scenario 1 happens, produce an additional 0 standard models and 15 deluxe models.
                                2. If scenario 2 happens, produce an additional 6 standard models and 12 deluxe models.
                                3. If scenario 3 happens, produce an additional 16 standard models and 0 deluxe models.
                                The above optimization produces a profit of $6,994. This is slightly greater than the other profits because extra resources were added.
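The two-stage plan above can be tabulated with a few lines of Python. This is only a bookkeeping sketch: the quantities are the ones quoted in this post, while the variable and function names are made up for illustration.

```python
# First-stage production, decided before the scenario is known (from the post).
FIRST_STAGE = {"standard": 98, "deluxe": 52}

# Additional production once the scenario is known (from the post).
RECOURSE = {
    1: {"standard": 0, "deluxe": 15},
    2: {"standard": 6, "deluxe": 12},
    3: {"standard": 16, "deluxe": 0},
}


def total_production(scenario):
    """First-stage units plus the scenario-specific additional units."""
    return {
        model: FIRST_STAGE[model] + RECOURSE[scenario][model]
        for model in FIRST_STAGE
    }


if __name__ == "__main__":
    for scenario in sorted(RECOURSE):
        print("scenario", scenario, total_production(scenario))
```

The first-stage quantities are the same in every scenario. Only the additional units differ, which is what lets the plan adapt once the uncertainty is resolved.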

                                      A lot of details have been elided. If you are interested in digging in:
                                      1. For a table summary of the above, click here.
                                      2. If you want the spreadsheet that was used to generate the above numbers, click here.
                                      3. If you want the appendix from which the material came, click here.
                                      In summary, you can perform mathematically quantifiable optimization under uncertainty. All you need to say is that scenario 1 has probability x, scenario 2 has probability y, and scenario 3 has probability z. Create the model in a spreadsheet and then use its built-in solver. If you want to get fancy you can have a model that says:
                                      1. Start by doing a.
                                      2. When you determine which scenario is actually happening, the model will say do b if scenario 1 happens or do c if scenario 2 happens.
                                      For a more mathematically rigorous approach, articles like "Brief Intro to Stochastic Programming (and Financial Modeling Applications)" by Hercules Vladimirou can be found.
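For readers who want a preview of that more rigorous view, the "do a now, then do b or c once the scenario is known" idea is usually written as a two-stage stochastic linear program. The formulation below is the generic textbook form, not something taken from the appendix, and every symbol in it is an assumption introduced here: x is the first-stage decision ("a"), y_s is the decision taken if scenario s happens ("b" or "c"), p_s is the probability of scenario s, and the matrices and vectors stand in for whatever resource constraints a particular model has.

```latex
\begin{aligned}
\max_{x,\; y_1, \dots, y_S} \quad & c^\top x + \sum_{s=1}^{S} p_s \, q_s^\top y_s \\
\text{subject to} \quad & A x \le b, \\
& T_s x + W y_s \le h_s, \qquad s = 1, \dots, S, \\
& x \ge 0, \quad y_s \ge 0, \qquad s = 1, \dots, S, \\
& \sum_{s=1}^{S} p_s = 1 .
\end{aligned}
```

The first-stage decision x is shared by all scenarios, while each scenario gets its own y_s. The objective adds the first-stage profit to the probability-weighted profit of the scenario-specific decisions.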

                                                If there is enough interest, perhaps we should start a Google group on Stochastic Optimization? I realize that individuals will want to keep certain things confidential, but I am sure that we can share lists of interesting articles or books. So, if you are interested in creating a Google group, please say so by leaving a comment on this blog post.

                                                Also, to save people some search time, below are links to some groups and books that are related to stochastic optimization.

                                                To get a list of stochastic optimization groups, click here.

                                                To get a list of stochastic optimization books, click here.

                                                Depending on the browser, you might have to "download" the file after you click "here" in order to actually see a web page.