The Storyteller

removing empty elements from an array, the php way :)

September 16, 2009 · 23 Comments

removing empty elements from an array is a very common task in everyday programming, and different people go about it differently. some run a loop N times, others find the offsets of the empty elements and then unset them. let me show you some common approaches and possibly the best way too :) , needless to say, in a php way – most of the time a few microseconds do not matter, but well, it's still good to find out the best solution :)

first of all, let's write a function which will create an array with randomly placed empty elements

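function generateRandomArray($count=10000)
{
    // start with the values 0 .. $count-1
    $array = range(0,($count-1));
    // blank out up to 1000 random positions (the same offset can be hit twice)
    for($i = 0;$i<1000;$i++)
    {
        // mt_rand() is inclusive on both ends, so stop at the last valid index
        $offset = mt_rand(0,$count-1);
        $array[$offset] = "";
    }
    return $array;
}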

now let's see some different approaches to get it done.
probably the most common approach:

$array = generateRandomArray();
$len = count($array);
$start = microtime(true);
for ($i = 0; $i < $len; $i++)
{
    // strict comparison: before php 8, a loose == also treats the integer 0 as equal to ""
    if ("" === $array[$i]) unset($array[$i]);
}
$end = microtime(true);
echo ($end-$start);

the output varies from 0.13 to 0.14 seconds for an array with 10000 elements on a 2.66 GHz Core 2 Duo machine running Mac OS X 10.5.8 with 4 GB RAM.
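
one thing to keep in mind with unset(): it removes the entries but does not renumber the remaining keys. here is a minimal sketch on a tiny hand-made array (the sample values are made up for illustration, they are not part of the benchmark) that shows the gaps and how array_values() closes them:

```php
<?php
// sample data, made up for illustration
$array = array("a", "", "b", "", "c");

// remove the empty strings; the surviving entries keep their old keys
foreach ($array as $key => $value) {
    if ($value === "") {
        unset($array[$key]);
    }
}

// the keys now have gaps where the empty elements used to be
print_r(array_keys($array));

// array_values() returns the same values with fresh consecutive keys
$reindexed = array_values($array);
print_r($reindexed);
```

if the rest of your code walks the array with a counting for loop, reindexing like this is probably what you want; with foreach the gaps do no harm.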

here is another better approach using array_diff()

$array = generateRandomArray();
$start = microtime(true);
$empty_elements = array("");
$array = array_diff($array,$empty_elements);
$end = microtime(true);
echo ($end-$start);

this one takes 0.052-0.059 seconds, which is a significant improvement over the last one.
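
a word on what array_diff() actually removes: per the php manual it compares elements by their string representation, so besides "" it also drops null and false, while 0 and "0" survive. a small sketch with made-up sample values, just to show the behaviour:

```php
<?php
// sample values, made up for illustration
$values = array(1, "", null, false, 0, "0", " ");

// array_diff() treats two elements as equal when (string)$a === (string)$b,
// so null and false (both cast to "") are removed together with ""
$result = array_diff($values, array(""));

// the original keys are preserved
var_dump($result);
```

so for arrays that can contain null or false, this approach removes a bit more than just empty strings.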

here is another improved version using array_filter()

$array = generateRandomArray();
$start = microtime(true);
$array = array_filter($array);
$end = microtime(true);
echo ($end-$start);

it takes around 0.002 seconds to complete :) – pretty good one, eh? (thanks damdec, for the reminder)

and this is the last one, my favorite, using array_keys() and taking advantage of its optional search_value parameter :)
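
be aware that array_filter() without a callback removes every value that evaluates to false, not only "". the sketch below uses made-up sample values and an illustrative callback (not part of the benchmark above) to show the difference between the default behaviour and a strict empty-string check:

```php
<?php
// sample values, made up for illustration
$values = array(1, "", 0, "0", null, false, " ", "x");

// no callback: everything falsy goes away, including 0, "0", null and false
$loose = array_filter($values);

// with a callback: only real empty strings are removed
$strict = array_filter($values, function ($v) {
    return $v !== "";
});

var_dump($loose);
var_dump($strict);
```

the callback version costs an extra function call per element, so it will likely be slower than the plain call; whether that matters depends on your data.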

$array = generateRandomArray();
$start = microtime(true);
// third argument true = strict match, so the integer 0 is never taken for ""
$empty_elements = array_keys($array, "", true);
foreach ($empty_elements as $e)
{
    unset($array[$e]);
}
$end = microtime(true);
echo ($end-$start);

this is an amazing improvement over the previous solutions: it takes only around 0.0008 – 0.0009 seconds for an array of 10000 elements.
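
since every snippet above generates its own random array, the timings are not strictly comparable. a fairer setup generates the array once and hands each approach its own copy. the following is only a sketch under a few assumptions: makeTestArray() and timeIt() are hypothetical helper names, the seed 42 is arbitrary, and the array_filter() variant uses a strict callback so that all four approaches remove exactly the same elements.

```php
<?php
// hypothetical helper: same idea as generateRandomArray() in this post
function makeTestArray($count = 10000, $blanks = 1000)
{
    $array = range(0, $count - 1);
    for ($i = 0; $i < $blanks; $i++) {
        $array[mt_rand(0, $count - 1)] = "";
    }
    return $array;
}

// hypothetical helper: time one approach; $data is passed by value,
// so each approach works on its own copy of the same input
function timeIt($label, $callback, array $data)
{
    $start = microtime(true);
    $result = $callback($data);
    $elapsed = microtime(true) - $start;
    printf("%-14s %.6f s, %d elements left\n", $label, $elapsed, count($result));
    return $result;
}

$approaches = array(
    'for + unset' => function (array $a) {
        $len = count($a);
        for ($i = 0; $i < $len; $i++) {
            if ($a[$i] === "") unset($a[$i]);
        }
        return $a;
    },
    'array_diff' => function (array $a) {
        return array_diff($a, array(""));
    },
    'array_filter' => function (array $a) {
        return array_filter($a, function ($v) { return $v !== ""; });
    },
    'array_keys' => function (array $a) {
        foreach (array_keys($a, "", true) as $k) unset($a[$k]);
        return $a;
    },
);

mt_srand(42);            // arbitrary fixed seed, so reruns see the same data
$data = makeTestArray(); // generated once, shared by all approaches

$results = array();
foreach ($approaches as $label => $fn) {
    $results[$label] = timeIt($label, $fn, $data);
}
```

note that php copies arrays lazily, so for the two unset() based approaches the copy itself happens inside the timed region. the numbers you get will depend on your php version and machine, so treat them as relative, not absolute.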

i hope you enjoyed this post with micro benchmarks :D – happy phping

Categories: PHP · Tricks · howto · performance

23 responses so far ↓

  • uzzal // September 16, 2009 at 9:28 pm

    I think there is something wrong in the second code example, on line 6.
    if(emptyempty($v)) unset($array[$i]);
    where did $v come from?

  • damdec // September 16, 2009 at 9:33 pm

    What about array_filter() ?

  • hasin // September 16, 2009 at 9:33 pm

    @uzzal that was a silly copy+paste+edit mistake :P

    and emptyempty comes from a syntax highlighter bug. removed it :)

  • hasin // September 16, 2009 at 9:39 pm

    @damdec – added, thanks :)

  • uzzal // September 16, 2009 at 9:52 pm

    hasin vai, i know that is a copy paste mistake ;-) another thing, in the last code example on line 3
    i think replacing this line
    $empty_elements = array_keys($array,"");
    with this
    $empty_elements = array_keys($array,'');
    will improve performance a little ;-)

  • hasin // September 16, 2009 at 9:58 pm

    @uzzal – yup, slight improvement because of the single quotes – it's now around 0.0015

  • anonymous // September 17, 2009 at 12:53 am

    You can’t compare the results, because you always create a different array for each test…

  • hasin // September 17, 2009 at 1:02 am

    that is why i’ve mentioned the result in a range :)

  • anonymous // September 17, 2009 at 1:11 am

    That will not help, because you set the empty elements randomly.
    There could be the (extreme) case, that only one element is empty, because mt_rand could produce always the same number in 1000 runs.
    You should generate the array once and duplicate it for each test…

  • Jeff // September 17, 2009 at 8:09 am

    @anonymous, that’s left as an exercise for the reader: write one bit of code that calls generateRandomArray() ONCE; use something like http://us3.php.net/manual/en/ref.array.php#71119 to create new, pristine copies of that array which can then be run against the individual benchmarks. That way, you’ll be guaranteed that each benchmark is running against the same data. Remember to turn off your caching first…

  • Safique Ahmed Faruque // September 17, 2009 at 11:52 am

    After deleting cache and using the same array for a machine of 1GB RAM and DUAL CORE 2.20 GHz each

    1. N times loop : 0.00297999382019
    2. Using Array diff : 0.108612060547
    3. Using Array filter : 0.0038890838623
    4. Using Array Keys : 0.000926971435547

  • Dynom // September 17, 2009 at 12:03 pm

    Not entirely sure what this blog is really about..
    But I would stick with array_filter(), it is fast (and in my tests faster than in your tests) and easier. The only thing to be aware of is that array_filter doesn’t only filter '' but anything evaluating to false [1], so it might not do exactly what you want.

    Thanks for the laugh on '' vs "" btw !

    [1] http://docs.php.net/manual/en/language.types.boolean.php#language.types.boolean.casting

  • Jórg // September 17, 2009 at 2:39 pm

    Yeah, I don’t really see the point either. array_filter() is designed for the job, and it’s clear what it does. Even if your alternative is faster (the jury is out on that) it isn’t worth compromising the legibility of your code for a bazillionth of a nanosecond.

    http://www.c2.com/cgi/wiki?PrematureOptimization

  • hasin // September 17, 2009 at 3:17 pm

    @Jórg, @Dynom – thanks for your comments. the code was initially written to remove both " " (that's a blank space) and an empty element, where array_filter doesn't work without a callback. and for an array with 100000 elements or bigger, yeah, the time does matter.

    i’ve written that “most of the time such microseconds” don't matter, but hey – optimization is all about microseconds, eh?

    :)

    Thanks for your comments :)

  • Tiwehc // September 17, 2009 at 8:39 pm

    I really don’t understand how you can compare the times. I get that you would like the user to go off and do their own benchmarking so these tests run on the same data set but then, what is the reason for the microsecond timers at all?

    Given that you are referring to the methods as “improvements”, you are misleading readers.

    Tiwehc

  • Jeff // September 17, 2009 at 11:22 pm

    Guys, I do believe that we’re past the point where we’ve put lipstick on the pig while watching the rubble bounce. +1 for Jórg’s comment…

  • arifulnr // September 19, 2009 at 12:24 am

    boss, three posts in 3 days. how long are we going to keep up this rate (you are not starting Leevio )

  • Max’ Lesestoff zum Wochenende | PHP hates me - Der PHP Blog // September 19, 2009 at 10:57 am

    [...] tutorial author – David Barnes @ Packt: the missing author guidelines, summarized here. removing empty elements from an array, the php way « The Storyteller: various ways to delete empty array elements, with a small micro benchmark. CSS3 [...]

  • Alexandr // September 20, 2009 at 2:24 am

    I think you need to pay more attention to the manual section about arrays. As you have already been told about array_filter, you can also use:

    array_walk()
    array_map()

    so we can just do:
    $array = generateRandomArray();
    $start= microtime(true);
    $array = array_map('empty', $array);
    $end = microtime(true);
    echo ($end-$start);

    or use custom function to walk/map array, also you are free to use strlen or is_null as parameter for function name. Don’t forget that you can pass parameters to walk/map

  • maSnun // September 24, 2009 at 10:22 am

    But using a callback function would take more resources. So, array_map() or array_walk() might not come to any help :(

  • PHP Array // September 26, 2009 at 7:10 pm

    If you don't need to be selective on the key, for example if you only want to remove the first or last element, you could also use the native php functions array_shift() and array_pop() – I have a feeling these would provide faster results.

  • PHP Development // November 21, 2009 at 7:59 pm

    array_map() and array_walk() will both slow down the cycle by a lot.

  • Alexandr // November 22, 2009 at 1:19 am

    “a lot” isn’t a test result.
    please provide your test results and a script example, and include information about the PHP version.
