Thursday, July 4, 2019

Python Identity Comparison (is / is not) Can Produce Unexpected Results

The purpose of this blog post is to make the reader aware of some of the unexpected results that can occur when using the identity comparison operators (is / is not).

The blog post assumes the reader is familiar with the following:
  1. Identity comparison (is / is not).
  2. Built-in function id().
Let's start with some examples.

Example 1

>>> x = 300.0

>>> y = 300.0

>>> x is y
False

Example 2

>>> x=300.0; y=300.0

>>> x is y
True

>>> id(x)
1833827542696

>>> id(y)
1833827542696

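People are often surprised that "x is y" evaluates to True in Example 2 when it evaluated to False in Example 1. The difference is that in Example 2 both assignments are on the same line, so the interactive interpreter compiles them as a single unit, and the CPython bytecode compiler merges identical literals within one unit into a single constant object. This is confirmed by id(x) and id(y) returning the same value. In Example 1 each line is compiled separately, so two distinct float objects are created.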
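The effect can also be reproduced outside the interactive prompt. The following is a minimal sketch, assuming CPython; the helper name literal_objects is made up for illustration. It compiles the two assignments once as a single unit and once as two separate units, then compares the resulting objects. The identity results are implementation details and may differ on other interpreters or builds.

```python
import sys


def literal_objects(source):
    """Compile and run `source` as one unit; return (namespace, code object)."""
    code = compile(source, "<example>", "exec")
    namespace = {}
    exec(code, namespace)
    return namespace, code


# Both literals in one compilation unit, as in Example 2.
ns, code = literal_objects("x = 300.0; y = 300.0")
same_unit_identical = ns["x"] is ns["y"]
# Number of times 300.0 appears in the unit's constant table.
constant_count = code.co_consts.count(300.0)

# Each literal in its own compilation unit, as in Example 1.
ns_x, _ = literal_objects("x = 300.0")
ns_y, _ = literal_objects("y = 300.0")
separate_units_equal = ns_x["x"] == ns_y["y"]
separate_units_identical = ns_x["x"] is ns_y["y"]

print("implementation:", sys.implementation.name)
print("same unit, identical:", same_unit_identical)
print("occurrences of 300.0 in co_consts:", constant_count)
print("separate units, equal:", separate_units_equal)
print("separate units, identical:", separate_units_identical)
```

Whatever the identity lines print, the equality line is the only result the language guarantees.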

Even the Python documentation warns about the unexpected behavior. It states "Due to automatic garbage-collection, free lists, and the dynamic nature of descriptors, you may notice seemingly unusual behaviour in certain uses of the is operator, like those involving comparisons between instance methods, or constants. Check their documentation for more info."

As long as we are on the subject of Python optimizations, it should be noted that CPython, the reference implementation, keeps a preallocated array of integer objects for all integers from -5 to 256 inclusive. So, when you create an int in that range you actually just get back a reference to the existing object instead of a newly created one. This is an implementation detail, not a guarantee of the language. It is demonstrated by the following example, where a and b are identical even though they are assigned on separate lines.
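A small script can probe the cache without involving literals in the comparison. This is a sketch that assumes CPython; the helper name built_twice_is_same is made up for illustration. The integers are built from strings at run time so that the compiler's merging of identical literals cannot influence the result.

```python
import sys


def built_twice_is_same(text):
    """Parse `text` into an int twice at run time and compare identities."""
    first = int(text)
    second = int(text)
    return first is second


# Values inside the documented cache range.
small_one_shared = built_twice_is_same("1")
small_256_shared = built_twice_is_same("256")

# A value far outside the cache range; equality still holds regardless.
large_shared = built_twice_is_same("1000000")
large_equal = int("1000000") == int("1000000")

print("implementation:", sys.implementation.name)
print("1 shared:", small_one_shared)
print("256 shared:", small_256_shared)
print("1000000 shared:", large_shared)
print("1000000 equal:", large_equal)
```

The "shared" lines report what the interpreter happens to do; only the "equal" line reflects behavior that code should rely on.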

Example 3

>>> a = 1

>>> b = 1

>>> a is b
True

As stated in PEP 8, comparisons to singletons like None should always be done with is or is not, never with the equality operators. That is the intended use case for the identity comparison operators.
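One reason to prefer is for None checks is that == can be overridden by a class. The sketch below uses a hypothetical class, AlwaysEqual, and a hypothetical function, describe, to show an equality test against None being answered by user code while the identity test stays reliable.

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim equality with everything."""

    def __eq__(self, other):
        return True


def describe(value=None):
    """Report whether the caller supplied a value, using an identity check."""
    if value is None:
        return "no value given"
    return "got a value"


obj = AlwaysEqual()

equal_to_none = (obj == None)   # answered by AlwaysEqual.__eq__
identical_to_none = obj is None  # cannot be overridden

print(equal_to_none)
print(identical_to_none)
print(describe())
print(describe(obj))
```

Because None is a singleton, the identity check in describe only succeeds when the argument really is None.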

In summary, don't use the is operator unless you are comparing against a singleton like None. To compare values, use == and != instead.