Yes. I have rolled my own Metaweblog API that programmatically manages wiki pages in Sharepoint 2010 and 2007.
My sources:
The service code for both SP 2010 and 2007 is pretty much identical, but there are a few caveats:
- In 2010, don't need to worry about managing wiki link markup (e.g. [[brackets]]).
- In 2007, wiki markup is converted on your request, so you have to re-convert it to Wiki markup before posting back. On posting back, you cannot use UpdateListItems, you must use the Copy service. This is because UpdateListItems will escape any wiki markup, effectively making your efforts useless.
- In our environment, we require RecordType to be filled in before checking in. Maybe this is standard? If you don't set this field, your page will remain checked out to you. So, I have a conditional that sets this field for SP2007.
- In 2010, SP adds a bunch of markup in the raw WikiField value, and if it's missing it could mess up layouts. I just insert it around the value WLW is posting, then strip it out on getting. See below.
I use the Copy service as in the first link to create AND update the wiki pages. In 2010, you can use the Lists service to update, but not to add.
I use the Imaging service to upload images automatically to a picture library.
Here is a function to replace the "ms-wikilinks" to wiki markup:
Note: I use the HTMLAgilityPack in case the markup returned is malformed. You could use Regex to do this too. I also use Microsoft Anti-XSS 4.1 library to sanitize markup.
Note 2: My UrlDecode function does not take a dependency on System.Web, taken from here.
/// <summary>
/// Sharepoint 2007 is mean and converts [[wiki links]] once the page is saved in the Sharepoint editor.
/// Luckily, each link is decorated with class="ms-wikilink" and follows some conventions.
/// </summary>
/// <param name="html"></param>
/// <returns></returns>
private static string ConvertAnchorsToWikiLinks(this string html)
{
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var anchorTags = (from d in htmlDoc.DocumentNode.Descendants()
where d.Attributes.Contains("class") && d.Attributes["class"].Value == "ms-wikilink"
select d).ToList();
foreach (var anchor in anchorTags)
{
// Two kinds of links
// [[Direct Link]]
// [[Wiki Page Name|Display Name]]
var wikiPageFromLink = UrlDecode(anchor.Attributes["href"].Value.Split('/').LastOrDefault().Replace(".aspx", ""));
var wikiPageFromText = anchor.InnerText;
HtmlNode textNode = null;
if (wikiPageFromLink == wikiPageFromText)
{
// Simple link
textNode = HtmlTextNode.CreateNode("[[" + wikiPageFromText + "]]");
}
else
{
// Substituted link
textNode = HtmlTextNode.CreateNode(String.Format("[[{0}|{1}]]", wikiPageFromLink, wikiPageFromText));
}
if (textNode != null)
{
anchor.ParentNode.ReplaceChild(textNode, anchor);
}
}
return htmlDoc.DocumentNode.InnerHtml;
}
The function to strip SharePoint's HTML is:
/// <summary>
/// Gets editable HTML for a wiki page from a SharePoint HTML fragment.
/// </summary>
/// <param name="html"></param>
/// <returns></returns>
public static string GetHtmlEditableContent(string html)
{
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
HtmlNode divNode = (from d in htmlDoc.DocumentNode.Descendants()
where d.Attributes.Contains("class") && d.Attributes["class"].Value == "ms-rte-layoutszone-inner"
select d).FirstOrDefault();
HtmlNode divNode2 = (from d in htmlDoc.DocumentNode.Descendants()
where d.Attributes.Contains("class") && d.Attributes["class"].Value.StartsWith("ExternalClass")
select d).FirstOrDefault();
if (divNode != null)
{
// SP 2010
return divNode.InnerHtml;
}
else if (divNode2 != null)
{
// SP 2007 or something else
return divNode2.InnerHtml.ConvertAnchorsToWikiLinks();
}
else
{
return null;
}
}
And finally, the function that adds that markup all back:
/// <summary>
/// Inserts SharePoint's wrapping HTML around wiki page content. Stupid!
/// </summary>
/// <param name="html"></param>
/// <returns></returns>
public static string InsertSharepointHtmlWrapper(string html, SharePointVersion spVersion)
{
// No weird wrapper HTML for 2007
if (spVersion == SharePointVersion.SP2007)
return Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(html);
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(@"<table id='layoutsTable' style='width:100%'>
<tbody>
<tr>
<td>
<div class='ms-rte-layoutszone-outer' style='width:99.9%'>
<div class='ms-rte-layoutszone-inner' style='min-height:60px;word-wrap:break-word'>
</div>
</div>
</td>
</tr>
</tbody>
</table>
<span id='layoutsData' style='display:none'>false,false,1</span>");
HtmlNode divNode = (from d in htmlDoc.DocumentNode.Descendants()
where d.Attributes.Contains("class") && d.Attributes["class"].Value == "ms-rte-layoutszone-inner"
select d).FirstOrDefault();
divNode.InnerHtml = Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(html);
return htmlDoc.DocumentNode.InnerHtml;
}
This works great.
- Pages still retain last modified and correct user
- Pages will retain all their history
- Pages are easier to manage
I am thinking of publishing my API, it's not a lot of code I think is super helpful for those of us that want to better manage our Sharepoint wikis. With WLW I get auto-image upload, better HTML editing support, and support for plugins like PreCode Snippet. It's awesome!
A colleague ended up figuring this one out, and the process ended up being generally-applicable to other web content we wanted to pull into Confluence as well. In broad strokes, the process involved:
- Using wget to suck all of the content out of FogBugz (configured so that images and attachments were downloaded properly, and links to them and to other pages were properly relativized).
- Using a simple XSLT transform to strip away the "template" content (e.g. logos, control/navigation links, etc) that surrounded the body of each page.
- (optionally) Using a perl module to convert the resulting HTML fragments into Confluence's markup format
- Using the Confluence command line interface to push up all of the page, image, and attachment data.
Note that I said "optionally" in #3 above. That is because the Confluence CLI has two relevant options: it can be used to create new pages directly, in which case it's expecting Confluence markup already, or it can be used to create new pages using HTML, which it converts to Confluence markup itself. In some cases, the Confluence CLI converted the HTML just fine; for other data sources, we needed to use the perl module.
Best Answer
Have you considered our Confluence Sharepoint connector? It sounds like that's what your after, it will allow you to use Sharepoint for all the other features, but use Confluence as your wiki http://www.atlassian.com/sharepoint/.
Seeing that you already use Jira, you could use Confluence and leverage off some level of integration between the two products.
Let me know if you have any questions.
Sherif Mansour (@sherifmansour) Confluence Product Manager