Friday, August 22, 2014

Generate All Permutations - Heap Algorithm

A common problem that shows up often is to generate all the possible permutations for a set of things. Here permutation is being used in the strict mathematical sense. For example, all the permutations of [ 1, 2, 3 ] are

  1. [ 1, 2, 3 ]
  2. [ 1, 3, 2 ]
  3. [ 2, 1, 3 ]
  4. [ 2, 3, 1 ]
  5. [ 3, 1, 2 ]
  6. [ 3, 2, 1 ]

Notice that the above results in 6 possibilities. This corresponds to 3 x 2 x 1 = 3! = 6. The "!" is mathematical notation for factorial. If you are interested in a general discussion about combinations and permutations and factorials, check out Combinations and Permutations by MathIsFun.Com. For those who want a strict definition of the problem we are going to work on permutations without repetition where we choose all the entries.

If you were in a hurry and the number of entities that had to consider were small and you weren't familiar w/ various algorithms, nested loops could be used to generate all the permutations. However, the nested loop approach does not scale because you don't want to change the code by adding / removing loops each time change the number of entities that have to consider.

Luckly, we can avoid nested loops by using the Heap algorithm. The original paper Permutations by Interchanges by B. R. Heap (\[Oxford Journals - The Computer Journal (1963) 6 (3): 293-298. doi: 10.1093/comjnl/6.3.293)\] can be freely obtained by clicking here. Later on, another article on the topic was Permutation Generation Methods by Robert Sedgewick - 1977 - ACM. Unfortunately, to obtain this article, you are going to have to pay for it. Luckly, Robert Sedgewick posted material on Heap's algorithm at Princeton. To obtain that material, click here.

The above "official" material can be supplemented by the following material found on the web

  1. Stack Overflow: Algorithm to generate all possible permutations of a list?
  2. Wikipedia: Heap's Algorithm

Python Books on Algorithms

I haven't been able to find a list of Python books on algorithms and so here I am creating my own. Hopefully, it will help others

  1. Algorithms by Robert Sedgewick, Kevin Wayne
  2. Annotated Algorithms in Python by Dr Massimo Di Pierro
  3. Annotated Algorithms in Python: with Applications in Physics, Biology, and Finance by Dr Massimo Di Pierro
  4. Data Structures and Algorithms in Python by Michael T. Goodrich, Roberto Tamassia and Michael H. Goldwasser
  5. Data Structures and Algorithms Using Python by Rance D. Necaise
  6. Data Structures and Algorithms: Using Python and C++ by David M. Reed and John Zelle
  7. Image Analysis, Classification and Change Detection in Remote Sensing: With Algorithms for ENVI/IDL and Python, Third Edition by Morton J. Canty
  8. Problem Solving with Algorithms and Data Structures Using Python by Bradley N. Miller and David L. Ranum
  9. Problem Solving with Algorithms and Data Structures Using Python by Bradley N. Miller and David L. Ranum
  10. Problem Solving with Algorithms and Data Structures Using Python SECOND EDITION by Bradley N. Miller, David L. Ranum
  11. Programming Computer Vision with Python: Tools and algorithms for analyzing images by Jan Erik Solem
  12. Python Algorithms: Mastering Basic Algorithms in the Python Language by Magnus Lie Hetland

Monday, August 11, 2014

Deep Learning Mathematical Stack (Updated)

Deep learning seems to be a hot topic these days but I haven't been able to find a reference which states the mathematical stack needed to understand it. Below is my attempt at creating one

  1. Markov Network Models
  2. Ising Models
  3. Variational Methods
  4. Hopfield Networks
  5. Boltzmann Machines
  6. Restricted Boltzmann Machines
  7. Deep Learning

Markov Network Models fall under the category of Probabilistic Graphical Models (PGM).

Below are reference

  1. Books
    1. Information Theory, Inference and Learning Algorithms by David J. C. MacKay
    2. Introduction to the Math of Neural Networks Kindle Edition by Jeff Heaton
  2. Videos
    1. Information Theory of Deep Learning. Naftali Tishby - Aug 3, 2017
    2. Mathematics for Deep Learning - Marc Deisenroth - Sep 13, 2017
    3. Tutorial : Mathematics of Deep Learning - Part 1 - ComputerVisionFoundation Videos - Aug 23, 2017



Probabilistic Graphical Models (PGM) - Topic List

On my last blog post, I forgot to include links to the PGM topic list files. I have now included them below.

To go to the linear alphabetic list of PGM topics, click here.

To go to the hierarchical list of PGM topics, click here.

Probabilistic Graphical Models (PGM)

Recently, I discussed probabilistic graphical Models (PGM) with several people. Everyone has a link here and a link there. So, I decided to put the references in one place. See list below. As expected, people will disagree about what should or should not be included. This is a good thing. So comment away.

Similarly, people will disagree as to what topics belong under the PGM umbrella. Furthermore, some people will want to format the topics in an alphabetic linear list while others will want a hierarchy. Attached to this blog are pdf files that reflect my attempt at both. Disagreemnt is a good thing. So comment away.

By the way, I am at best a novice at this type of stuff.
  1. Books
    1. Advanced Data Analysis from an Elementary Point of View by Cosma Rohilla Shalizi
      1. Algorithms for Graphical Models by James Cussens
        1. To download the book, click here.
        2. Artificial Intelligence: A Modern Approach 2e, by S. Russell & P. Norvig, Prentice-Hall, 2003
          1. Bayesian Artificial Intelligence, Second Edition by Kevin B. Korb, Ann E. Nicholson
          2. Bayesian Networks and Decision Graphs by Thomas Dyhre Nielsen, FINN VERNER JENSEN
          3. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis by Uffe B. Kjærulff, Anders L. Madsen
          4. Bayesian Networks for Probabilistic Inference and Decision Analysis in Forensic Science by Franco Taroni, Alex Biedermann, Silvia Bozza
          5. Bayesian Networks in R: with Applications in Systems Biology (Use R!) by Radhakrishnan Nagarajan, Marco Scutari
          6. Bayesian Networks: A Practical Guide to Applications by Olivier Pourret, Patrick Naïm, Bruce Marcot
          7. Bayesian Networks: With Examples in R by Marco Scutari, Jean-Baptiste Denis
          8. Bayesian Reasoning and Machine Learning (BRML) by David Barber
            1. Author's Web Page for the Book
            2. Amazon
              1. ... only criticism (a mild one) is that, when applying Barber's examples to Bodies of Knowledge and data mining, he skips Prolog, backward chaining, predicate calculus and other techniques that are the foundation of automated inference systems (systems that extend knowledge bases automatically by checking whether new propositions can be inferred from the KB as consistent, relevant, etc.) ...
              2. Cambridge.Org
                1. Git Hub
                2. Building Probabilistic Graphical Models with Python by Kiran R Karkera
                  1. CERN Document Server
                    1. Amazon
                      1. ... All is illustrated with examples and code snippets using mostly the libpgm package. PyMC is used for Bayesian parameter estimation ...
                      2. PacktPub.com
                      3. Causality: Models, Reasoning and Inference by Judea Pearl
                      4. Elements of Statistical Learning by Trevor Hastie
                      5. Graphical Models by Steffen L. Lauritzen
                      6. Graphical Models: Foundations of Neural Computation by Michael I. Jordan, Terrence J. Sejnowski
                      7. Graphical Models: Representations for Learning, Reasoning and Data Mining by Christian Borgelt, Matthias Steinbrecher
                      8. Graphical Models for Machine Learning and Digital Communication by Brendan J. Frey
                      9. Graphical Models in Applied Multivariate Statistics Paperback by Joe Whittaker
                      10. Graphical Models with R (Use R!) by Søren Højsgaard, David Edwards, Steffen Lauritzen
                      11. Graphical Models, Exponential Families, and Variational Inference by Martin J Wainwright, Michael I Jordan
                      12. Inference in Hidden Markov Models by Olivier Cappé, Eric Moulines, Tobias Ryden
                      13. Introduction to Graphical Modelling by David Edwards
                      14. Introduction to Statistical Learning: with Applications in R by Gareth James, Daniela Witten, Trevor Hastie
                      15. Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction by Bradley Efron
                      16. Learning Bayesian Networks by Richard E. Neapolitan
                      17. Learning in Graphical Models by Michael I. Jordan
                      18. Machine Learning: a Probabilistic Perspective by Kevin Patrick Murphy
                      19. Mind's Arrows: Bayes Nets and Graphical Causal Models in Psychology by Clark Glymour
                      20. Modeling and Reasoning with Bayesian Networks by Adnan Darwiche
                      21. Pattern Recognition & Machine Learning, by C.M. Bishop, Springer, 2007.
                        1. Probabilistic Graphical Models: Principles and Techniques by Daphne Koller, Nir Friedman
                          1. Amazon
                            1. ... Her course for probabilistic models is available online, and watching the videos alongside the book really helps sometimes ...
                              1. ... authors painstakingly provided necessary background materials from both probability theory and graph theory ...
                              2. Associated Coursera Course
                                1. Google
                                  1. mitpress.mit.edu
                                    1. Stanford.Edu
                                    2. Probabilistic Networks and Expert Systems - Exact Computational Methods for Bayesian Networks by Cowell, R.G., Dawid, P., Lauritzen, S.L., Spiegelhalter, D.J.
                                    3. Probabilistic Programming & Bayesian Methods for Hackers = Using Python and PyMC
                                      1. Probability on Graphs: Random Processes on Graphs and Lattices by Geoffrey Grimmett
                                    4. Classes
                                    5. Web Pages
                                      1. BAYES NET BY EXAMPLE USING PYTHON AND KHAN ACADEMY DATA by DeRandomized.com
                                        1. Bayes Net Toolbox for Matlab
                                        2. Brief Introduction to Graphical Models and Bayesian Networks by Kevin Murphy
                                          1. Building Probabilistic Graphical Models with Python by Steve Lott
                                            1. Carmen: An open source project for probabilistic graphical models by Manuel Arias and Francisco J. D´ıez
                                              1. Comparison between igraph and networkx by Google Group: NetworkX Discussion
                                                1. Data Structures and Algorithms with Object-Oriented Design Patterns in Python by Bruno R. Preiss - Graphs and Graph Algorithms
                                                  1. Daft: Python package that uses matplotlib to render pixel-perfect probabilistic graphical models
                                                    1. ECE 175B: Prob. Reasoning & Graphical Models by Ken Kreutz-Delgado, Spring 2014
                                                      1. FACTORIE
                                                        1. FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala.
                                                          1. FACTORIE aims to provide a full-featured framework for probabilistic graphical models (both directed and undirected)
                                                            1. SciKit-Learn provides an extensive set of tools for classification, regression, clustering, dimensionality reduction and model selection. It is more mature than FACTORIE, and it is easily integratable with matplotlib, a graphics package, so it can produce high-quality graphs directly. However, it does not support graphical models. SciKit-Learn is implemented in Python; in many cases it is less efficient than FACTORIE, which is JVM-based.
                                                            2. gPy
                                                              1. graph-tool: Graph-tool is an efficient Python module for manipulation and statistical analysis of graphs (a.k.a. networks). Contrary to most other python modules with similar functionality, the core data structures and algorithms are implemented in C++, making extensive use of template metaprogramming, based heavily on the Boost Graph Library. This confers it a level of performance that is comparable (both in memory usage and computation time) to that of a pure C/C++ library.
                                                                1. igraph: The network analysis package igraph is a collection of network analysis tools with the emphasis on efficiency, portability and ease of use. igraph is open source and free.
                                                                  1. Implementing a Principal Component Analysis (PCA) in Python step by step by Sebastian Raschka on April 13, 2014
                                                                    1. Daft: Python package that uses matplotlib to render pixel-perfect probabilistic graphical models
                                                                      1. NetworkX: NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
                                                                        1. Modular toolkit for Data Processing (MDP): Python data processing framework
                                                                          1. Pebl
                                                                          2. pgmpy
                                                                          3. Probabilistic Modeling Toolkit for Matlab/Octave
                                                                          4. Probabilistic Programming & Bayesian Methods for Hackers by Cam Davidson-Pilon
                                                                          5. PyDot: Python interface to Graphviz's Dot language. Pydot allows to easily create both directed and non directed graphs from Python.
                                                                            1. pymc
                                                                            2. PyNFG: PyNFG is designed to make it easy for researchers to model strategic environments using the Network Form Game (NFG) formalism developed by David Wolpert with contributions from Ritchie Lee, James Bono and others. The main idea of the NFG framework is to translate a strategic environment into the language of probabilistic graphical models.
                                                                              1. python-graph: Library for working with graphs in Python. This software provides a suitable data structure for representing graphs and a whole set of important algorithms.
                                                                                1. scikit-learn
                                                                                  1. Software by Kevin Murphy and students
                                                                                    1. Software Packages for Graphical Models by Kevin Murphy
                                                                                      1. Two Node Directed Graphical Model Example by Jascha Sohl-Dickstein
                                                                                        1. Understanding your data w/ Bayesian networks (in python) by Bartek Wilczunski (bartek@mimuw.edu.pl)
                                                                                        2. Wikipedia: Bayesian Network
                                                                                          1. Wikipedia: Graphical Model