How does WordPress generate URL slugs?

The question:

Is there a page somewhere that details exactly how WordPress generates slugs for URLs? I’m writing a script that needs to generate URL slugs identical to the ones WordPress generates.

The Solutions:


Method 1

As @SinisterBeard rightly pointed out in a comment on the question a couple of years ago, this answer is long outdated, and the function mentioned below has since been superseded by a newer API:

See wp_unique_post_slug.

Original Answer

Off the bat, I can’t point you to a page, tutorial or piece of documentation on how WP slugs are generated, but take a look at the sanitize_title() function.

Don’t be misled by the function name: it is not meant to sanitize a title for further use as a page/post title. It takes a title string and returns a version of it suitable for use in a URL:

  • strips HTML & PHP
  • strips special chars
  • converts all characters to lowercase
  • replaces whitespace and periods with hyphens/dashes (underscores are kept as they are)
  • reduces multiple consecutive dashes to one

There might be edge cases where core does something additional (you’d have to look at the source to verify that sanitize_title() always produces exactly the slug you expect), but this should cover at least 99%, if not all, cases.
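The steps listed above can be sketched in plain PHP for a script that runs outside WordPress. This is only an approximation under stated assumptions: approximate_wp_slug() is a made-up name, not a WordPress function; it keeps underscores, which matches my reading of core’s sanitize_title_with_dashes(); and it simply drops non-ASCII characters, where core may percent-encode or transliterate them. Check the core source before relying on it for exact parity.

```php
<?php
// Hypothetical helper, NOT a WordPress function: a rough standalone
// approximation of the steps listed above.
function approximate_wp_slug(string $title): string
{
    // Strip HTML and PHP tags.
    $slug = strip_tags($title);

    // Lowercase (multibyte-safe when the mbstring extension is available).
    $slug = function_exists('mb_strtolower')
        ? mb_strtolower($slug, 'UTF-8')
        : strtolower($slug);

    // Periods become hyphens.
    $slug = str_replace('.', '-', $slug);

    // Assumption: anything that is not an ASCII letter, digit, space,
    // underscore or hyphen is dropped (core handles non-ASCII differently).
    $slug = preg_replace('/[^a-z0-9 _-]/', '', $slug);

    // Each run of whitespace becomes a hyphen.
    $slug = preg_replace('/\s+/', '-', $slug);

    // Collapse consecutive hyphens into one.
    $slug = preg_replace('/-+/', '-', $slug);

    // No leading or trailing hyphens.
    return trim($slug, '-');
}

echo approximate_wp_slug("<b>Hello</b>   World. It's 2024!"), "\n"; // hello-world-its-2024
```

If the script can load WordPress at all, calling sanitize_title() directly is the safer route, since it stays in step with whatever core does.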

Method 2

You can use this function:

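// Belongs inside a class; drop "public static" to use it as a plain function.
public static function slugify($text)
{
  // replace every run of characters that are not letters or digits with -
  $text = preg_replace('~[^\pL\d]+~u', '-', $text);

  // transliterate to ASCII
  $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);

  // remove anything that is not a hyphen or a word character
  $text = preg_replace('~[^-\w]+~', '', $text);

  // trim leading and trailing -
  $text = trim($text, '-');

  // collapse duplicate -
  $text = preg_replace('~-+~', '-', $text);

  // lowercase
  $text = strtolower($text);

  // fall back to a placeholder if nothing is left
  if (empty($text)) {
    return 'n-a';
  }

  return $text;
}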

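It’s roughly how the WP URL sanitize function works, but the output is not always identical: for example, this function turns an apostrophe into a hyphen, whereas sanitize_title() drops it.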

Method 3

Core at your service

There’s no developer mode built into WordPress aside from WP_DEBUG, which doesn’t help you much in this case. Basically, WP uses the “Rewrite API”, a function-based, low-level wrapper for the WP_Rewrite class, which you can read about in the Codex. The global $wp_rewrite object is there for you to inspect or to interact with the class.

Plugins that help you look into it

Toscho’s “T5 Rewrite” plugin and Jan Fabry’s “Monkeyman Rewrite Analyzer” plugin will guide you on your way. I’ve written a small extension for “T5 Rewrite” that integrates it smoothly with the “Monkeyman Rewrite Analyzer”; you can find it in the “T5 Rewrite” repository’s wiki on GitHub.

The “Monkeyman” plugin adds a new page to the admin UI menu under Tools. The “T5 Rewrite” plugin adds a new help tab to the Settings > Permalinks page. My extension adds the help tabs to the Tools page mentioned above as well.

Here’s a screenshot of what the “T5 Rewrite” plugin’s help tab content looks like.

[Screenshot: “T5 Rewrite” help tab]

The column headings in the screenshot are in German: Vorlage = Pattern | Beschreibung = Explanation | Beispiele = Examples

Notes

The “T5 Rewrite” plugin does a wonderful job of helping you inspect the rewrite object. And it does even more: it adds new possibilities. That’s why it’s part of my basic plugin package (at least in my installations).

Method 4

Actually, if you look at the core function wp_insert_post() (in post.php), you will see that it does the following:

$data['post_name'] = wp_unique_post_slug( sanitize_title( $data['post_title'], $post_ID ), $post_ID, $data['post_status'], $post_type, $post_parent );

$wpdb->update( $wpdb->posts, array( 'post_name' => $data['post_name'] ), $where );

The key thing to note is that it uses both wp_unique_post_slug and sanitize_title:

wp_unique_post_slug( sanitize_title( 
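For a script that runs outside WordPress, the second stage (making the sanitized slug unique) can be sketched as below. This is a minimal sketch under stated assumptions, not the core implementation: approximate_unique_slug() and the $taken array are hypothetical stand-ins for the database lookup that wp_unique_post_slug() performs, and the real function applies further checks (post type, parent, reserved names) that are not reproduced here. The part it illustrates is the numeric suffix, starting at -2, that WordPress appends when a slug is already in use.

```php
<?php
// Hypothetical helper, NOT WordPress code: $taken stands in for the
// slugs that already exist in the database.
function approximate_unique_slug(string $slug, array $taken): string
{
    // A slug that is not in use yet is returned unchanged.
    if (!in_array($slug, $taken, true)) {
        return $slug;
    }

    // Otherwise try slug-2, slug-3, ... until a free one is found.
    $suffix = 2;
    do {
        $candidate = $slug . '-' . $suffix;
        $suffix++;
    } while (in_array($candidate, $taken, true));

    return $candidate;
}

echo approximate_unique_slug('hello-world', ['hello-world', 'hello-world-2']), "\n"; // hello-world-3
```

To match a live site exactly, the script needs the real list of existing slugs, so an external script can only predict the suffix if it can see the same data WordPress sees.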

Method 5

Forgive me for reviving an old question, but I had the same need and found that this method works perfectly for me:

$some_string = "DON'T STOP ME NOW!";
$slug = sanitize_title(sanitize_title($some_string, '', 'save'), '', 'query');
echo $slug; // dont-stop-me-now

This method sanitizes the string twice.

The first call uses the save mode, in which HTML and PHP tags are stripped and accents are removed (accented characters are replaced with their unaccented equivalents).

The second call, in query mode, ensures that all spaces are replaced with dashes and that other punctuation is removed.

Hope this helps someone! 🙂


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
