The question:
Is there a page somewhere that details exactly how WordPress generates slugs for URLs? I’m writing a script that needs to generate URL slugs identical to the ones WordPress generates.
The Solutions:
Method 1
As @SinisterBeard rightly pointed out in a comment on the question a couple of years ago, this answer has long been outdated, and the function(s) mentioned below have since been superseded by a newer API: see wp_unique_post_slug().
Original Answer
Off the bat, I can’t point you to a page, tutorial or piece of documentation on how WP slugs are generated, but take a look at the sanitize_title() function.
Don’t be misled by the function name: it is not meant to sanitize a title for further use as a page/post title. It takes a title string and returns a version of it that can be used in a URL:
- strips HTML & PHP
- strips special chars
- converts all characters to lowercase
- replaces whitespace and periods with hyphens/dashes (as far as I can tell from the core source, underscores are left as they are)
- reduces multiple consecutive dashes to one
There might be edge cases where core does something additional (you’d have to look at the source to verify that sanitize_title() will always generate exactly what you expect), but this should cover at least 99%, if not all, cases.
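If your script runs outside WordPress, the steps listed above can be approximated in plain PHP. This is only a rough sketch: approximate_slug() is a hypothetical name, not a core function, and it skips what core does for accented characters, HTML entities and percent-encoded octets. It also keeps underscores, which is my reading of the core source rather than something the list above states.

```php
<?php
// Hypothetical helper: a standalone approximation, NOT WordPress core code.
function approximate_slug(string $title): string
{
    // Strip HTML and PHP tags.
    $slug = strip_tags($title);
    // Lowercase ASCII letters (accented characters are not transliterated here).
    $slug = strtolower($slug);
    // Periods become hyphens.
    $slug = str_replace('.', '-', $slug);
    // Drop everything except a-z, 0-9, spaces, underscores and hyphens.
    $slug = preg_replace('/[^a-z0-9 _-]/', '', $slug);
    // Runs of whitespace become a single hyphen.
    $slug = preg_replace('/\s+/', '-', trim($slug));
    // Collapse consecutive hyphens, then trim them from both ends.
    $slug = preg_replace('/-+/', '-', $slug);
    return trim($slug, '-');
}

echo approximate_slug("DON'T STOP ME NOW!"), "\n";      // dont-stop-me-now
echo approximate_slug('  My <b>Post</b> v2.0  '), "\n"; // my-post-v2-0
```

For anything beyond plain ASCII titles, compare the output against a real WordPress install before relying on it.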
Method 2
You can use this function:
function slugify(string $text): string
{
    // replace every run of characters that are not letters or digits with "-"
    $text = preg_replace('~[^\pL\d]+~u', '-', $text);
    // transliterate to ASCII (the result depends on the iconv implementation and locale)
    $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
    // remove anything that is not a hyphen or a word character
    $text = preg_replace('~[^-\w]+~', '', $text);
    // trim hyphens from both ends
    $text = trim($text, '-');
    // collapse duplicate hyphens
    $text = preg_replace('~-+~', '-', $text);
    // lowercase
    $text = strtolower($text);
    // strict comparison, because empty('0') is true in PHP
    if ($text === '') {
        return 'n-a';
    }
    return $text;
}
It’s roughly how the WP URL sanitize function works.
Method 3
Core at your service
There’s no developer mode built into WordPress aside from WP_DEBUG, which doesn’t help you much in this case. Basically, WP uses the “Rewrite API”, a function-based, low-level wrapper for the WP_Rewrite class, which you can read about in the Codex. The global $wp_rewrite object is at your service if you want to inspect it or interact with the class.
Plugins that help you look into it
Toscho’s “T5 Rewrite” plugin and Jan Fabry’s “Monkeyman Rewrite Analyzer” plugin will help you find your way. I’ve written a small extension for “T5 Rewrite” that integrates it smoothly with the “Monkeyman Rewrite Analyzer”; you can find it in the “T5 Rewrite” repo’s wiki on GitHub.
The “Monkeyman”-plugin adds a new page, filed in the admin UI menu under Tools. The “T5 Rewrite”-plugin adds a new help tab to the Settings > Permalinks page. My extension adds the help tabs to the mentioned Tools-page too.
Here’s a screenshot of what the “T5 Rewrite” plugin’s help tab content looks like. The German labels in it translate as follows: Vorlage = Pattern | Beschreibung = Explanation | Beispiele = Examples
Notes
The “T5 Rewrite” plugin does a wonderful job of helping you inspect the rewrite object. And it does even more: it adds new possibilities. That’s why it’s part of my basic plugin package, at least on my installations.
Method 4
Actually, if you look at the core function wp_insert_post (in post.php), you will see that it does the following:
$data['post_name'] = wp_unique_post_slug( sanitize_title( $data['post_title'], $post_ID ), $post_ID, $data['post_status'], $post_type, $post_parent );
$wpdb->update( $wpdb->posts, array( 'post_name' => $data['post_name'] ), $where );
The key thing to note is that it uses both wp_unique_post_slug and sanitize_title:
wp_unique_post_slug( sanitize_title(
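The reason both calls matter for a script that has to reproduce WordPress slugs: sanitize_title only cleans the string, while wp_unique_post_slug, as far as I know, appends a numeric suffix (-2, -3, and so on) when the slug is already taken. Below is a minimal sketch of that second step. It assumes an in-memory list of existing slugs in place of the database lookup core performs, and unique_slug() and $taken are hypothetical names. Core also applies further rules (post type, parent, reserved names) that are not modelled here.

```php
<?php
// Hypothetical helper: $taken stands in for the database lookup core performs.
function unique_slug(string $slug, array $taken): string
{
    // A slug nobody uses yet is returned unchanged.
    if (!in_array($slug, $taken, true)) {
        return $slug;
    }
    // Otherwise try "-2", "-3", ... until a free candidate is found.
    $suffix = 2;
    while (in_array($slug . '-' . $suffix, $taken, true)) {
        $suffix++;
    }
    return $slug . '-' . $suffix;
}

echo unique_slug('hello-world', ['hello-world', 'hello-world-2']), "\n"; // hello-world-3
```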
Method 5
Forgive me for reviving an old question, but I had the same need and found that this method works perfectly for me:
$some_string = "DON'T STOP ME NOW!";
$slug = sanitize_title(sanitize_title($some_string, '', 'save'), '', 'query');
echo $slug; // dont-stop-me-now
This method uses a double sanitization.
The first call uses the save mode, where HTML and PHP tags are stripped and accents are removed (accented characters are replaced with their non-accented equivalents).
The second call, in query mode, ensures that all spaces are replaced with dashes (-) and other punctuation is removed.
Hope this helps someone! 🙂
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.