One of the most frequent requests we get from customers and prospects goes something like this: "Hey, we love SamePage and its cool WYSIWYG editor. But we already have hundreds of pages in our existing wiki. Manually copy-pasting the content into SamePage would be a lot of work. Can we magically get this content into SamePage?"
Well, the Universal Import Tool is the answer. It is a simple but powerful command-line utility that imports standard HTML content from the file system into SamePage. Since it works against standard HTML, it can be used to import content from any source: open-source or commercial wikis as well as non-wiki websites.
Features
The basic features of the Universal Import Tool (UIT) are as follows:
- It is a command-line utility that can recursively read through folders from the file system to import HTML files as pages of a project within SamePage.
- The UIT not only imports the HTML page content, it also imports embedded images and attached files. All internal and external links are preserved.
- The UIT also preserves the site structure, mapping it to the page hierarchy within the SamePage project. The page layout is also preserved as long as the layout is not determined by CSS.
- The UIT strips out any JavaScript and CSS from the original content. The content within SamePage will be rendered with the default look and feel of SamePage.
- The UIT does not carry over any wiki syntax from the source. The final output in SamePage is always standard HTML. This is in keeping with the general philosophy of SamePage that wikis should be editable without end users having to know wiki syntax.
- The UIT has built-in logic to parse the HTML and extract only the relevant content, removing headers, navigation items, etc. This logic can be configured for any wiki or website with help from the SamePage support team or professional services.
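To make the script and CSS stripping described above more concrete, here is a minimal sketch using only Python's standard `html.parser` module. It is not the UIT's actual implementation; the class and function names (`ScriptStyleStripper`, `strip_scripts_and_styles`) are hypothetical, and the choice to also drop inline `style` and `on*` attributes is an assumption about what stripping "JavaScript and CSS" might reasonably include.

```python
from html import escape
from html.parser import HTMLParser


class ScriptStyleStripper(HTMLParser):
    """Re-emit HTML without <script>/<style> elements or inline style/on* attributes.

    Illustrative only: comments and doctype declarations are dropped as well,
    because this sketch does not override those handlers.
    """

    SKIPPED = {"script", "style"}

    def __init__(self):
        # Keep entity references as written so they can be re-emitted unchanged.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.skip_depth = 0

    def _render(self, tag, attrs, closing=">"):
        kept = []
        for name, value in attrs:
            if name == "style" or name.startswith("on"):
                continue  # inline CSS or an inline event handler
            kept.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
        return "<" + " ".join([tag] + kept) + closing

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif not self.skip_depth:
            self.parts.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag not in self.SKIPPED and not self.skip_depth:
            self.parts.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&#{name};")


def strip_scripts_and_styles(html_text):
    """Return html_text with scripts, stylesheets, and inline styling removed."""
    parser = ScriptStyleStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

Running a crawled page through a function like this gives a rough idea of what the content will look like once its original styling is gone and SamePage's default look and feel takes over.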
Current Limitations
These are some of the known limitations of the Universal Import Tool (UIT). Some of these limitations may be eliminated in future releases.
- The UIT imports only the content from other wiki sites. User information and permissions are not imported from the source.
- The UIT also does not import page comments from other wiki sites.
- The UIT imports only the current version of the content from other wikis. Previous versions are ignored.
- The UIT can import content only into Projects (wikis) within SamePage. It cannot import content into Forums or Blogs.
Content Import Process
The overall import of content from an existing wiki or non-wiki website is a two-step process:
Step 1 - Crawling
Crawling the website is an optional step that you need to carry out only if the content is not already available as HTML files and images in the file system. This is often the case with wiki sites, which may store their content in a proprietary format in a database or in the file system. For static HTML websites, however, the content is already available as HTML files and images on the web server.
You can use any website crawler to crawl your current wiki to get static HTML and images. However, we have used HTTrack extensively and found it to be pretty reliable in most cases. Another copier we have used to some extent is WebSite Ripper Copier (commercial software with free evaluation) from Tensons Software.
Here is a Quick Guide on HTTrack with specific instructions on how to crawl using HTTrack.
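For anyone who prefers to script the crawl, a basic HTTrack run can be wrapped in a few lines of Python. This is only a sketch under stated assumptions: it assumes the `httrack` command-line client is installed and on the `PATH`, it uses only HTTrack's `-O` option (the output path for the mirror), and the function names and the example URL are hypothetical. The Quick Guide remains the reference for the options recommended for a SamePage import.

```python
import shutil
import subprocess


def build_httrack_command(site_url, output_dir):
    """Build a basic HTTrack invocation that mirrors site_url into output_dir."""
    # -O sets the directory where HTTrack writes the mirrored files.
    return ["httrack", site_url, "-O", output_dir]


def crawl(site_url, output_dir):
    """Run the crawl; raises if HTTrack is missing or exits with an error."""
    if shutil.which("httrack") is None:
        raise RuntimeError("httrack was not found on PATH")
    subprocess.run(build_httrack_command(site_url, output_dir), check=True)


if __name__ == "__main__":
    # Hypothetical wiki URL and output folder, for illustration only.
    print(" ".join(build_httrack_command("http://wiki.example.com/", "wiki-mirror")))
```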
Step 2 - Import
Once you have the content to be imported as static HTML files in the file system, the UIT can be run to import the content into SamePage.
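Because the UIT maps the folder structure onto the page hierarchy, it can be useful to preview that structure before importing. The sketch below is not part of the UIT; it simply walks a folder recursively with Python's standard library and lists the HTML files it finds along with their nesting depth. The function names and the assumption that pages end in `.html` or `.htm` are illustrative.

```python
from pathlib import Path

# Assumption: crawled pages are saved with one of these extensions.
HTML_SUFFIXES = {".html", ".htm"}


def preview_page_hierarchy(root):
    """Return (depth, relative path) for every HTML file found under root."""
    root = Path(root)
    pages = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in HTML_SUFFIXES:
            rel = path.relative_to(root)
            # Depth 0 means the file sits directly in the root folder.
            pages.append((len(rel.parts) - 1, rel.as_posix()))
    return pages


def print_preview(root):
    """Print the HTML files under root, indented by folder depth."""
    for depth, rel in preview_page_hierarchy(root):
        print("  " * depth + rel)
```

Calling `print_preview` on the folder produced by the crawl gives a quick sanity check that the expected pages are present and nested as intended.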
Please refer to the Step by Step Instructions for Import to download the UIT and import the content.
Here are some Sample Screenshots of Imported Content.