We we're in the similar situation as you James. After a lot of digging this is what I came up with:
The core_url_rewrite
table is now deprecated, instead Magento EE 1.13 now stores the rewrites in enterprise_url_rewrite
.
Tables:
enterprise_*_category_rewrite
use catalog_*_entity_url_key
tables to rebuild the two rewrite tables when you run php indexer.php --reindex catalog_url_*
When you add a 'URL Redirect' in admin Catalog->URL Redirects for a custom URL it gets added to the enterprise_url_rewrite_redirect
table and the flag for Magento that the index is now outdated is entered into enterprise_url_rewrite_redirect_cl
table which when running php indexer.php --reindex url_redirect
rebuilds the enterprise_url_rewrite_redirect_rewrite
table.
Quick note, any table ending in _cl is safe to truncate, the 'CL' stands for Change Log and is used by Magento to check if re-indexing is required.
As far as the URL Key tables go, I'm still a bit clueless as to why there two URL Key entries one in catalog_*_entity_url_key
and one in catalog_*_entity_varchar
(attribute id 90), but I assume this is what happens:
When you create a new product/category Magento uses the name to generate a url_key which is placed in catalog_*_entity_url_key
AND in the catalog_*_entity_varchar
, but the primary table used by Magento is the catalog_*_entity_url_key
because if you truncate it and run php indexer.php --reindex catalog_url_*
your enterprise_*_category_rewrite
tables will be empty and products/categories in the frontend will display ugly urls i.e. http://example.com/catalog/product/view/id/123/etc/etc
(not SOE friendly) I believe the two tables are related and are used to build the enterprise_url_rewrite
table because this table stores a 'request_path' most likely the url_key inside the catalog_*_entity_varchar
table and an 'identifier' which is the primary URL Key from the catalog_*_entity_url_key
table. I could be completely wrong about url_key and varchar tables, so I'm just thinking out loud.
Anyway to successfully truncate and rebuilt all rewrite tables you can execute:
SET FOREIGN_KEY_CHECKS = 0;
TRUNCATE TABLE `core_url_rewrite`;
TRUNCATE TABLE `enterprise_catalog_category_rewrite`;
TRUNCATE TABLE `enterprise_catalog_product_rewrite`;
TRUNCATE TABLE `enterprise_url_rewrite`;
TRUNCATE TABLE `enterprise_url_rewrite_category_cl`;
TRUNCATE TABLE `enterprise_url_rewrite_product_cl`;
TRUNCATE TABLE `enterprise_url_rewrite_redirect_cl`;
TRUNCATE TABLE `enterprise_url_rewrite_redirect_rewrite`;
SET FOREIGN_KEY_CHECKS = 1;
and then run:
sudo php indexer.php --reindex catalog_url_product
sudo php indexer.php --reindex catalog_url_category
sudo php indexer.php --reindex url_redirect
If you also truncate enterprise_url_rewrite_redirect
then you will loose all of your custom redirects that you see in your admin panel, perhaps this is your goal since you were left with a ton of useless URLs. As long as you DO NOT truncate the '*_entity_url_key' tables you'll be fine.
Our story was a little different, because we had duplicate URL Keys and major problems with product names from excel imports after upgrading to 1.13 from 1.11 so I wrote this quick script to reset the catalog_product_entity_url_key
table and the URL keys and URL paths in the catalog_product_entity_varchar
table using product names. I attached the code below, but if you use it, use it at your own risk.
<?php
include_once('app/Mage.php');
Mage::app();
$dbHandle = Mage::getSingleton('core/resource')->getConnection('core_write');
$productCounter = 0;
$nameFixCounter = 0;
$vUrlKeyFixCounter = 0;
$urlPathCounter = 0;
$urlKeyCounter = 0;
$productCollection = $dbHandle->query("SELECT entity_id, sku FROM catalog_product_entity");
while($product = $productCollection->fetch()) {
$dataString = null;
$oldProductName = $dbHandle->query("SELECT value FROM catalog_product_entity_varchar WHERE entity_id = '".$product['entity_id']."' AND attribute_id = 65")->fetch();
$oldVarcharUrlKey = $dbHandle->query("SELECT value FROM catalog_product_entity_varchar WHERE entity_id = '".$product['entity_id']."' AND attribute_id = 90")->fetch();
$oldUrlPath = $dbHandle->query("SELECT value FROM catalog_product_entity_varchar WHERE entity_id = '".$product['entity_id']."' AND store_id = 0 AND attribute_id = 91")->fetch();
$oldUrlKey = $dbHandle->query("SELECT value FROM catalog_product_entity_url_key WHERE entity_id = '".$product['entity_id']."'")->fetch();
$newProductName = preg_replace('/\s+/', ' ', trim(preg_replace('/[^\x20-\x21\x23-\x2B\x2D-\xE7]/', ' ', $oldProductName['value'])));
$newUrlKey = preg_replace('/\s+/', '-', trim(preg_replace('/[^\x30-\x39\x61-\x7A]/', ' ', strtolower($newProductName))));
if (strcmp($oldProductName['value'], $newProductName)) {
echo "-[".$oldProductName['value']."]\n";
echo "+[".$newProductName."]\n";
$dbHandle->query('UPDATE catalog_product_entity_varchar SET value = "'.$newProductName.'" WHERE entity_id = "'.$product['entity_id'].'" AND attribute_id = 65');
++$nameFixCounter;
}
if (strcmp($oldVarcharUrlKey['value'], $newUrlKey)) {
echo "-[".$oldVarcharUrlKey['value']."]\n";
echo "+[".$newUrlKey."]\n";
if ($oldVarcharUrlKey['value'] === null) {
$dbHandle->query("INSERT INTO catalog_product_entity_varchar (entity_type_id, attribute_id, store_id, entity_id, value) VALUES ('4', '90', '0', '".$product['entity_id']."', '".$newUrlKey."')");
} else {
$dbHandle->query("UPDATE catalog_product_entity_varchar SET value = '".$newUrlKey."' WHERE entity_id = '".$product['entity_id']."' AND attribute_id = 90");
}
++$vUrlKeyFixCounter;
}
if (strcmp($oldUrlPath['value'], $newUrlKey.'.html')) {
echo "-[".$oldUrlPath['value']."]\n";
echo "+[".$newUrlKey.".html]\n";
if ($oldUrlPath['value'] === null) {
$dbHandle->query("INSERT INTO catalog_product_entity_varchar (entity_type_id, attribute_id, store_id, entity_id, value) VALUES ('4', '91', '0', '".$product['entity_id']."', '".$newUrlKey.".html')");
} else {
$dbHandle->query("UPDATE catalog_product_entity_varchar SET value = '".$newUrlKey.".html' WHERE entity_id = '".$product['entity_id']."' AND store_id = 0 AND attribute_id = 91");
}
++$urlPathCounter;
}
if (strcmp($oldUrlKey['value'], $newUrlKey)) {
echo "-[".$oldUrlKey['value']."]\n";
echo "+[".$newUrlKey."]\n";
if ($oldUrlKey['value'] === null) {
$dbHandle->query("INSERT INTO catalog_product_entity_url_key (entity_type_id, attribute_id, store_id, entity_id, value) VALUES ('4', '90', '0', '".$product['entity_id']."', '".$newUrlKey."')");
} else {
$dbHandle->query("UPDATE catalog_product_entity_url_key SET value = '".$newUrlKey."' WHERE entity_id = '".$product['entity_id']."'");
}
++$urlKeyCounter;
}
$report = "[".++$productCounter."] ";
$report .= "NAME: [".(strcmp($oldProductName['value'], $newProductName)?'!=':'==')."] ";
$report .= "V_KEY: [".(strcmp($oldVarcharUrlKey['value'], $newUrlKey)?'!=':'==')."] ";
$report .= "PATH: [".(strcmp($oldUrlPath['value'], $newUrlKey.'.html')?'!=':'==')."] ";
$report .= "KEY: [".(strcmp($oldUrlKey['value'], $newUrlKey)?'!=':'==')."]\n";
echo $report;
}
echo 'Total Products: ['.$productCounter.'] Names: ['.$nameFixCounter.'] V_Keys: ['.$vUrlKeyFixCounter.'] Paths: ['.$urlPathCounter.'] Keys: ['.$urlKeyCounter.']';
The code can be tweaked to use Magentos formatKey method here: http://www.magentocommerce.com/wiki/3_-_store_setup_and_management/seo/url_key_characters_conversion unfortunately I came across the wiki after I updated all the keys so I didn't bother reupdating everything again.
Hope that helps :)!
Best Answer
I agree with Tim. It's generally good practice to avoid uppercase letters in URLs anyway. Luckily, if you're using Apache you should be able to use mod_rewrite to redirect all the the incoming uppercase URLs to lowercase versions by adding the following rules to your .htaccess file:
This will handle the redirect before the request even gets to Magento. For any incoming request containing upper case letters, Apache will respond with a 301 redirect to the same URL but with all letters converted to lowercase. Here's a more in depth explanation of how this works: http://www.chrisabernethy.com/force-lower-case-urls-with-mod_rewrite/
4 lines the .htaccess file is much nicer than having to create 9500 redirect rules -- one for each url! After a while Google will remove the uppercase urls from it's index and replace them with the new lowercase urls. If you are using a different web server than Apache I'm sure there's a similar functionality available, but you'll have to do some more research.