SPL Iterators against the performance

loop

This topic’s stayed in my mind for a while. Inspired by Joshua Thijssen’s presentation from PHP UK about (re)discovering the SPL in PHP, I decided to investigate this more carefully. I have to admit that took me some time to understand how the things work and how to not misunderstood the purposes of each iterator and because of lack of documentation it wasn’t that easy. I did a couple of mistakes and probably I will do more, but as Joshua said in his presentation:

The documentation of SPL is completely useless. What can we do? Blog about it!

So, brave yourself. Here my blogpost comes!

Today, I’d like to focus only on one part of documentation – the iterators, because it’s quite attractive how you can solve problems with using them. The main advantage to use the iterators in PHP is an easy way to achieve reusability, single responsibility and nice looking architecture with design patterns. I’d like to show you how quick some of common problems can be solved by using iterators and then, check if that does any impact on the performance.

Design Patterns

Connecting iterators – Decorator

The decorator pattern allows to add a behaviour to individual object, without affecting behaviour of other objects from the same class. As an example we can consider the price calculation for the billing in e-commerce shop where you need to add some taxes, ship price or duty to the final price.

Untitled Diagram

We can split our requirements and create a single responsibility classes for each prize calculation. The third party object will decide which calculations should be included to the final price. The code will look like this:

1
2
3
4
5
6
7
8
$price = new RegularPrice(120);
$price = new TaxPrice($price);
$price = new ShipmentPrice($price);
if ($dutyRequired) {
    $price = new DutyPrice($price);
}

echo $price->getPrice();

The previous object is passed as an argument to the next one. When the getPrice method is called then this should call the same method from the passed object. In the result, we have a price calculated by all the objects.

The iterators work the same way in out case. We create a main object which is called ArrayIterator and the add a custom Filters, Iterators, Parsers etc.

1
2
3
4
$arrayIterator = new ArrayIterator(['apple', 'orange', 'beans', 'pumpkin', 'onion']);
$arrayIterator = new RandomizeArrayIterator($arrayIterator);
$arrayIterator = new FruitIterator($arrayIterator);
$arrayIterator = new LimitIterator($arrayIterator, 0, 3);

FilterIterator

As you can see, it’s not so hard to deliver objects responsible only for one thing. One of the most useful iterator is FilterIterator in my opinion which is extended by FruitIterator in the example. The code might look like:

1
2
3
4
5
6
7
8
9
class FruitIterator extends FilterIterator
{
    protected $acceptedFruits = ['apple', 'orange'];

    public function accept()
    {
        return in_array($this->getInnerIterator()->current(), $this->acceptedFruits);
    }
}

All necessary things there are placed in accept method which returns a bool value determines if the record can be attached or not. On the other hand, we do not need to implement a new class every time. We can use the CallbackFilterIterator and just attach the closure.

1
2
3
$fuits = new CallbackFilterIterator($arrayIterator, function ($current, $key, $iterator) use ($acceptedFruits) {
    return in_array($current, $acceptedFruits)
});

Aggregate Pattern

An Aggregate pattern can refer to concepts in either statistics or computer programming. Both uses deal with considering a large case as composed of smaller, simpler, pieces. This rather refers to an object such as a list, vector, or generator which provides an interface for creating iterators.

So, let’s create a class Salad which can contain ingredients (from previous case). A standard code will look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
class Salad
{
    protected $ingredients;

    public function __construct(array $ingredients)
    {
        $this->ingredients = $ingredients;
    }

    public function getIngredients()
    {
        return $this->ingredients;
    }
}

$greekSalad = new Salad(['Romaine lettuce', 'Onion', 'Black Olives', 'Green bell pepper', 'Red bell pepper', 'Tomatoes', 'Cucumber', 'Feta cheese', 'Olive Oil', 'Oregano', 'Lemon']);

foreach ($greekSalad->getIngredients() as $ingredient) {
    echo $ingredient;
}

But, we can use a standard component from SPL library as IteratorAggregate and our code will change a little bit:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
<?php
class Salad implements IteratorAggregate
{
    protected $ingredients;

    public function __construct(array $ingredients)
    {
        $this->ingredients = $ingredients;
    }

    function getIterator()
    {
        return new ArrayIterator($this->ingredients);
    }
}

$greekSalad = new Salad(['Romaine lettuce', 'Onion', 'Black Olives', 'Green bell pepper', 'Red bell pepper', 'Tomatoes', 'Cucumber', 'Feta cheese', 'Olive Oil', 'Oregano', 'Lemon']);

foreach ($greekSalad as $ingredient) {
    echo $ingredient;
}

This can be quite useful, but only when your classes are kept small – so remember about the SOLID principles there.

Performance

Everything looks promising until now. Two examples above show us how to change ugly foreach or for loops into smart looking object oriented architecture. But the main question is – is that so clever? I have checked and the result surprised me.

I have created two pieces of code to check which method is faster – using full object iteration with SPL, or ugly loops.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
$array = array();
for ($i = 0; $i < $argv[1]; $i++) {
    $array[] = rand(0, 100);
}

foreach ($array as $i => $a) {
    if (0 == $a % 2) {
        unset($array[$i]);
    }
}

return $array;

foreach ($array as $a) {}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class RandArrayIterator extends ArrayIterator
{
    public function current()
    {
        return rand(0, 100);
    }
}

class OddIterator extends FilterIterator
{
    public function accept()
    {
        return 0 == $this->getInnerIterator()->current()%2;
    }
}

$arrayIterator = new ArrayIterator(['']);
$arrayIterator = new RandArrayIterator($arrayIterator);
$arrayIterator = new InfiniteIterator($arrayIterator);
$arrayIterator = new LimitIterator($arrayIterator, 0, $argv[1]);
$arrayIterator = new OddIterator($arrayIterator);

foreach ($arrayIterator as $a){}

Both do exactly the same. Generate a big array with random values and choose only odd results. I run them for 1 000 000 records and the execution time surprised me. I wasn’t prepared for such difference:

Loop: 1,42 s

SPL: 7,82 s

Nothing says more than raw numbers. How pretty SPL Iterators may be, they cannot be such fast as the standard loop. The difference is just astonishing, so I will be careful with using Iterator since now.

The frosting on the cake

Read more

[1] PHP Documentation – SPL Iterators
[1] PHP Iterator Design Pattern I: Access to Aggregates
[2] SitePoint – Using SPL Iterators, Part 1
[3] SitePoint – Using SPL Iterators, Part 2
[4] The Standard PHP Library – SPL
[5] SPL iterators with closure context switching
[6] Introduction to SPL

Do not miss my new blog post



  • Angelina

    Shenzhen Sunper Opto Co.,ltd is a professional led high bay light manufacturer with ten years’ experience. We use Cree led and Meanwell driver, with the strength of top quality and competitive price, and with UL DLc CE RoHs Certificates,our products have a good reputation in North America and Europe.

    Products Feature:
    1.use Cree led and Meanwell driver specialize designed in led high bay light
    2.3-5 years warranty
    3.The appearance treatment in Black
    a.Black demontrates elegance, fashion and distinction.
    b.Black radiator with good heat dissipation, higher heat dissipation efficiency than others,reduce the lumens depreciation of the lamp, extend the
    life-span of lamp.
    4.The structure of lamp adopts round layered design, manifesting the uniform beauty of lamps.

    if you need more, please feel free to contact me(angelina@sunper.net) or click our website for details:http://www.sunper.net/

  • Adrian Cardenas

    Nice over view of how to use iterators & their benefits.

    However, you skipped over probably the best reason to use iterators, which is the efficiencies gained in memory.

    I tested your code (with some slight modifications for running online vs the command line) on PHP 5.6.1 & added measurements for memory usage.

    Code using loops:
    100,000 iterations – http://codepad.viper-7.com/k68YS6
    Duration: 0.095134973526001
    Memory: 0.296875 KB

    1,000,000 iterations – The script ran out of memory in the REPL environment while generating the first array

    Code using Iterators
    100,000 iterations – http://codepad.viper-7.com/ptauX3
    Duration: 0.21263003349304
    Memory: 2.15625 KB

    1,000,000 iterations – http://codepad.viper-7.com/m3cNIy
    Duration: 2.1547520160675
    Memory: 2.15625 KB

    As you can see, you see the same marked increase in time from using loops to using iterators for the 100k iteration test. Memory usage is also more. However, in the iterators test going increasing the iterations by an order of magnitude increased the duration by an order of magnitude as well, but the memory usage remained constant. Increasing another order of magnitude would increase the duration as well, but memory usage wouldn’t go up much more than the 2.2 KB being used by the script.

    • http://piotrpasich.com/ Piotr Pasich

      Hi Adrian,

      nice answer. I can tell you what happen if you run 1,000,000 iterations based on loop – this will take about 72 MB in 1.13 s.

      I have investigated the memory usage too, but the differences was too small to talk about this and was caused by using objects against simple arrays. Why the SPL Iterator’s code shows you so low memory usage? Because the memory is freed just after the last foreach. It works in the same way as in the case where you put all the code from http://codepad.viper-7.com/k68YS6 into function like this: https://gist.github.com/piotrpasich/857cd0e4a0988e11a5f4 . At the result, the memory usage is 0.58746337890625 KB.

      So, the place where you measure the memory matters.

      Thank you for the comment,
      Piotr