I have over 4000 posts, will querying some of them cause performance issues?

The question:

I am wondering if querying posts like this:

    <?php $args = array('numberposts' => 19, 'orderby' => 'rand' ); 
    $posts = get_posts( $args );
    foreach($posts as $post) {  
        $url = get_the_title($post->ID);
                echo '<span class="url">'.$image_url.'</span>';
                echo '<span class="ID">'.$post->ID.'</span>';  
    } 
    ?>

will cause any performance issues if I have over 4000 posts? I am not sure how the query works, but I imagine that it works by retrieving all 4000 posts, sorting them all into random positions and then lastly picks the first 19. It sounds kinda heavy hence my question.

P.S. 4000+ posts might sound a lot, but they are actually just link posts with empty content and href’s in the title.

EDIT: Final code from the answer (untested)

<?php

  $randposts = $wpdb->get_results( 
     "
     SELECT SQL_CALC_FOUND_ROWS wp_posts.*
     FROM wp_posts 
     WHERE 1=1 
         AND wp_posts.post_type = 'post' 
         AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
     ORDER BY RAND() DESC 
     LIMIT 0, 19
     "
 );

 foreach($randposts as $post) {  
     $url = get_the_title($post->ID);
     echo '<span class="url">'.$image_url.'</span>';
     echo '<span class="ID">'.$post->ID.'</span>';  
 } 
 ?>

The Solutions:

Below are the methods you can try. The first solution is probably the best. Try others if the first one doesn’t work. Senior developers aren’t just copying/pasting – they read the methods carefully & apply them wisely to each case.

Method 1

The “Order By Rand() Alternative” method

You can check out this article that discuss how to optimize SQL queries when selecting N random rows. It’s called Order By Rand() Alternative method.

I’ve used it in one of my previous answers.

As far as I understand it, the trick is to introduce a special WHERE condition to cut down the rows before the slow ORDER BY RAND() kicks in. In your case it could be something like:

WHERE RAND() < ( SELECT ( ( 19 / COUNT(1) ) * 10 ) FROM wp_posts )

where COUNT(1) should be faster than COUNT(*) on InnoDB.

One can use this trick by modifying the SQL generated from the WP_Query(), through the posts_request filter.

This might not be relevant for only 4000 posts, but this might be useful when dealing with hundreds of thousands of posts.

To address your question, I would recommend you to profile your queries, with – and without the random ordering, just as @Wyck proposed in a comment.

If you use numberposts equal to 19, then you will only fetch 19 rows from the posts table, but not 4000. On the other hand if numberposts is equal to -1, your query will fetch the whole table.

The generated SQL from your get_posts( $args ) query will probably look something like this:

SELECT SQL_CALC_FOUND_ROWS wp_posts.*
    FROM wp_posts 
    WHERE 1=1 
        AND wp_posts.post_type = 'post' 
        AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
    ORDER BY RAND() DESC 
    LIMIT 0, 19

where the parameters numberposts (or posts_per_page) and paged control the limit part of the generated query.

If you want to test the trick, try:

SELECT SQL_CALC_FOUND_ROWS wp_posts.*
    FROM wp_posts 
    WHERE 1=1 
        AND wp_posts.post_type = 'post' 
        AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
        AND RAND() < ( SELECT ( ( 19 / COUNT(1) ) * 10 ) )
    ORDER BY RAND() DESC 
    LIMIT 0, 19

You could, for example, test the restriction explicitly with:

SELECT wp_posts.*
    FROM wp_posts 
    WHERE wp_posts.post_type = 'post' 
        AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')
        AND RAND() < 0.0475
    ORDER BY RAND() DESC 
    LIMIT 0, 19

where ( 19 / 4000 ) * 10 ) = 0.0475. But this is just an example that you can play with further and adjust to your needs.

Method 2

get_posts is executed as a single SELECT query on WordPress DB. So pulling 19 random posts from 4000+ posts will not cause any performance issues, it’s almost similar to pulling 19 posts from 100 posts blog.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

Leave a Comment