Breaking KnowledgeTree

(For my own benefit also) In trying to aggregate some of the web servers I have deployed in VMs, I managed to break KnowledgeTree and had to re-install. On Ubuntu, it’s a real P.I.T.A. At one point, I couldn’t get past the following error:

Warning: include_once(DB/.php) [function.include-once]: failed to open stream: No such file or directory in /usr/share/knowledgetree-ce/thirdparty/pear/DB.php on line 371

The fix for the above is easy. It’s usually an indication that there’s extra kruft in the config-path file. To fix it, delete all of the lines in /usr/share/knowledgetree-ce/config/config-path except for /etc/knowledgetree-ce/config.ini.
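For anyone who would rather script that cleanup than do it by hand, here is a minimal Python sketch. The two paths come straight from this post; the function names are made up for illustration, and the assumption that the file should end up holding exactly the one original line (line ending untouched) is mine, not something from the KnowledgeTree docs.

```python
from pathlib import Path

# Paths as given in the post; adjust them if your install differs.
CONFIG_PATH_FILE = Path("/usr/share/knowledgetree-ce/config/config-path")
WANTED_LINE = "/etc/knowledgetree-ce/config.ini"


def keep_only_config_line(text, wanted=WANTED_LINE):
    """Return just the line that points at config.ini, dropping everything else.

    The matching line is returned exactly as it was found (including its
    line ending), so nothing is added that was not already in the file.
    """
    kept = [line for line in text.splitlines(keepends=True)
            if line.strip() == wanted]
    if not kept:
        # Assumption: if the expected line is missing, stop instead of guessing.
        raise ValueError("expected line not found: " + wanted)
    return kept[0]


def fix_config_path(path=CONFIG_PATH_FILE):
    """Hypothetical helper: save a .bak copy, then rewrite the file."""
    original = path.read_text()
    path.with_name(path.name + ".bak").write_text(original)
    path.write_text(keep_only_config_line(original))
```

Calling `fix_config_path()` (most likely as root) leaves a `config-path.bak` copy beside the original, which is handy if the extra lines turn out to matter after all.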

Also, if you’re working with the Chrome browser, you’ll probably get into the condition where “control.php” downloads instead of executing on the server. The point to remember is that even if you fix it, this condition will continue until you erase your browser history.

Yeah, it’s been that kind of day.

KnowledgeTree/Alfresco revisited

With apologies for the following, long ramble…

Mike Hatfield (from Alfresco) commented on my post about comparing KnowledgeTree and Alfresco. To be fair, I’d like to revise part of it here. Please keep in mind that I’m a total amateur with Document Management Systems (DMSs) and the CMIS protocol. Much of my previous post was based on first impression.

Defining my needs. I need a document management system to manage (store) and search other people’s documents, regardless of their format (doc, pdf, etc.). Both KnowledgeTree and Alfresco do this nicely, and both have more or less the same set of open source tools in their back ends. These include:

  • OpenOffice for file conversion
  • Lucene for indexing/searching
  • MySQL for storage of metadata

Both tools are designed for collaboration and development of documents. I need neither of these features. The feature set that I want/need is only a minuscule part of what either tool provides (the word “overkill” might be used here). However, because both Alfresco and KnowledgeTree meet my requirements so well, I’m still unable to decide which tool better fits my needs.

Document descriptions. Alfresco has this feature built-in. It shows up in search results. A document description can be added to KnowledgeTree’s metadata via “Document Fieldsets” under “Document Metadata and Workflow Configuration”. I haven’t yet figured out how to get this description to show up as part of the search results, like it does in Alfresco.

File storage. Neither tool stores the file using the original filename. This practice has both its pros and cons. The big advantage is that filename collisions (a very bad condition for DMS tools) are avoided. A minor shortcoming (if it is that) is that you always have to access the file via your tool of choice (e.g., KnowledgeTree or Alfresco).
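To show why dropping the original filename sidesteps collisions, here is a toy Python sketch. It is not how either Alfresco or KnowledgeTree actually stores files; the `TinyStore` class and its method names are invented for this example. The idea is simply that each upload gets a generated ID, and the original name lives on only as metadata.

```python
import uuid


class TinyStore:
    """Toy document store: content is keyed by a generated ID, not by filename."""

    def __init__(self):
        self.blobs = {}      # storage ID -> file content
        self.filenames = {}  # storage ID -> original filename (metadata only)

    def add(self, filename, data):
        """Store data under a fresh ID and remember the uploaded filename."""
        storage_id = uuid.uuid4().hex
        self.blobs[storage_id] = data
        self.filenames[storage_id] = filename
        return storage_id

    def find(self, filename):
        """Return every storage ID whose original filename matches."""
        return [sid for sid, name in self.filenames.items() if name == filename]
```

Two different uploads that are both named `report.pdf` end up under two different IDs, so neither overwrites the other. The flip side is the shortcoming mentioned above: you need the store (or its metadata) to find a file again.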

User interface. I’d confused Mike Hatfield by stating “demo built into installed software”. This statement was based on my first impressions of Alfresco’s interface. The tabs “My Alfresco”, “Company Home”, “My Home”, and “Guest Home”, along with the “Demonstration” and “Feature Tour” links, give the impression that the front end is intended more for a demo presentation than as a working interface.

Keeping in mind that Alfresco and KnowledgeTree both have much more capability than what I needed (i.e., just document storage/search), it struck me that I’d need to put a bit of extra work on Alfresco’s web interface.

After using the tool for a bit, I found that this impression was erroneous. It’s just that the interface takes a little time before it becomes “comfortable”.

Backups. Alfresco is written in Java. Alfresco strikes me as being easier to back up because, other than the requisite external tools (OpenOffice, dot, etc.), you only have to copy the MySQL database and the folder in which Alfresco is stored. In short, the web server is incorporated into Alfresco. It “plays well” (ignores/doesn’t conflict) with any other Apache install.

KnowledgeTree is written in PHP and relies upon an Apache instance: it either uses your existing Apache install or installs one if you lack it. It takes a bit of work to wedge KnowledgeTree into a separate instance of Apache (though you really don’t need to). The end result is that backing up KnowledgeTree becomes part of your process of backing up your web server, while backing up Alfresco means copying a folder (both require backing up MySQL).
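As a rough illustration of the “copy the folder plus the MySQL database” idea, here is a Python sketch. The function names, the default `root` user, and the output layout are all my own assumptions, and the password is deliberately left out (it is assumed to come from something like `~/.my.cnf`). You would probably also want to stop the server before copying; I haven’t verified what either project officially recommends.

```python
import subprocess
import tarfile
from pathlib import Path


def mysqldump_command(database, out_file, user="root"):
    """Build the mysqldump call that writes one database to a .sql file."""
    return ["mysqldump", f"--user={user}", f"--result-file={out_file}", database]


def archive_folder(folder, archive):
    """Pack a whole folder (e.g., the install directory) into a .tar.gz."""
    folder = Path(folder)
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(folder, arcname=folder.name)
    return Path(archive)


def backup(folder, database, dest_dir):
    """Hypothetical wrapper: dump the database, then archive the folder."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    subprocess.run(mysqldump_command(database, dest / f"{database}.sql"), check=True)
    return archive_folder(folder, dest / f"{Path(folder).name}.tar.gz")
```

For KnowledgeTree the same two steps apply, but the folder to archive is part of the web server’s tree, which is why it ends up folded into the web server backup.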

Unneeded features. By this, I mean features that I don’t need, not unneeded features in the tool. Both tools have many more features (workflow, collaboration, etc.) than what I need. It’s just that, for what I do need, both tools work so wonderfully well.

Problem (if you can view it as such): Mashups of toolsets, with Alfresco or KnowledgeTree, are possible via the CMIS protocol. This means that you can connect Alfresco or KnowledgeTree to other tools such as Drupal, SugarCRM, or ProcessMaker. I already have a large time investment in specific tools (e.g., MediaWiki) and neither works well with what I have. This means that I tend to treat both as stand-alone tools. (At one point, Alfresco did have an interface for MediaWiki. Because both tools are actively updated, this interface didn’t survive an upgrade.)

Mike, I did look at the Share interface and do see a lot of nice features there. It looks quite interesting and I’ll probably grow into it. At the moment, it’s much more than what I need.

To tell the truth, I still haven’t decided on which tool to go with. The differences (for me) boil down to a couple of minor differences in search results. Alfresco includes the document description as part of the search results. KnowledgeTree includes an excerpt of each document’s text (à la Google) as part of the search results. Neither provides both; I want both. (Poor me!)

KnowledgeTree

A friend commented on my recent post about Alfresco, stating that I needed to try out KnowledgeTree to be fair/objective in the comparison, so I built a stripped down Ubuntu 8.10 VM (I removed Apache, MySQL, and OpenOffice) and loaded the open source version of KT DMS v3.6.1.
My friend was correct. The little bit of extra effort required to run the full install of KnowledgeTree was worth it. It’s convinced me to attempt to install the less easy “source code only” version (i.e., force it to employ the Apache, MySQL, and OpenOffice instances that I’m already running).
For the test VM, installation of KnowledgeTree was straightforward. I only had to make the installer executable. The installer asked only a handful of questions (e.g., what port? what password?) before installing. Keep in mind that this is not a small toolset. Installation did take a few minutes. After that, I loaded a few of the more obnoxious documents that I have on hand: Asterisk – The Future of Telephony (15 MB!), one of my point papers in Word, an ODP presentation, a PowerPoint with a large graphic, and a PNG graphic file.
One thing I noticed with both: PDFs with special formatting for trademarks tend to throw both programs. This is caused by their dependency on the same set of data extraction tools. There were no searches where one tool was more dependable than the other. Both failed on searching for “PostgreSQL” in “Asterisk – The Future of Telephony”.
The differences between the two pieces of software don’t appear to be technical. I don’t have the resources to immediately test how each performs with large quantities of documents. You can be sure that I’ll gripe about it if the tool I end up using acts up.
It’s the web interface features that set the two apart. Each has features that I like, and features/issues that I don’t:

  • Alfresco Pros
    • allows you to add a description of the document
    • indexes content and makes it searchable
  • Alfresco Cons
    • doesn’t excerpt content as part of a search result
    • document uploads are indexed immediately (delays caused by large documents)
    • doesn’t allow for multiple categories to be associated with one document
    • file stored without uploaded filename
    • demo built into installed software
    • heavy customization expected from user
    • deleting a document, deletes the document (no recovery)
    • some plugins require manual addition of a plugin manager
    • displays the filename instead of the title as part of search results
    • runs on top of Java (I’m an old fart. I remember the Linux + Java issues.)
  • KnowledgeTree Pros
    • indexes content and makes it searchable
    • uploaded documents are placed in a queue for indexing (allows for indexing in the background)
    • allows for multiple metadata (“cloud tags”) to be associated with individual documents
    • light customization expected from user
    • allows filename change via the web interface
    • has a number of built-in tools for database and archive repair/management
    • deleting a document moves it to a hidden queue where it can later be expunged
    • built-in plugin manager
    • skinnable
    • runs on top of PHP (not sure if this is actually a plus)
  • KnowledgeTree Cons
    • doesn’t allow for a description of the document (might be countered by the “Discussion” feature for each document)
    • file stored without uploaded filename
    • (at least for version 3.6.1) The “Search and Indexing” menu (only visible to the Admin) has a number of double entries (tolerable but ugly)
  • Other
    • Both tools are also slightly different in their purposes. KnowledgeTree strikes me as striving to be more of a front-end tool than Alfresco. Of late, Alfresco has been attempting to integrate with other tools (MediaWiki, Drupal, Joomla, etc.). This would be nice, but the integration didn’t carry over into the newest version. (MediaWiki integration may cause me to return to Alfresco.)
    • For those that require commercial support, both tools have it, if you’re willing to pay for it.
    • KnowledgeTree has a much cleaner front end (it feels less Web 2.0-ish even though it uses metatags)
    • Both tools provide for open source versions, while formally declaring them unsupported (it’s something that I can live with)
    • For the full-blown installs, both tools have nice interfaces

All in all, it looks like I prefer the KnowledgeTree tool. Much depends on how easy it will be to install the source code version. My only other requirement is that I need to be able to declare the storage location. Alfresco allows you to state this up front. KnowledgeTree requires installation first (I think) and then allows you to change the storage directory.
I’m still uncertain as to which tool I plan on using in the long run. Right now it’s a balance of “how hard is the source code version of KnowledgeTree to install?” vs. “how hard is it to change Alfresco’s interface and remove the demo from the install?”. I’ll keep you posted.

Alfresco

I finally buckled down and installed a document management system. I was looking at having to buy yet another hard drive to contain all of the I-gotta-save-this-it-might-be-useful stuff that builds up as a result of over 20 years of messing with computers.
I played with installs of Epiware, KnowledgeTree, and Alfresco. Of the three, Alfresco was the least evil (it was the easiest to install and does what I need). Epiware had issues with extra large PDFs and KnowledgeTree had install issues (it insisted that more instances of JDK, Tomcat, and MySQL were needed on my system).
In any case, I’ll be working my way back through all of my kruft, deleting what’s not needed, and cataloging (with Alfresco) the rest.
As always, notes for installing Alfresco are in the wiki.