OneNote 2007 - The HTML Importer
Most recently I've worked on an HTML importer for OneNote. Originally, I planned to build a Firefox extension that would let you right-click a web page and select a "Send to OneNote" option. After some initial research I decided that was going to take more effort than I could spare right now. Instead I compromised and built a tool that works with both Firefox and IE; the downside is that it is a two-step process.
The code I've written should be easy to adapt into a Firefox extension if I ever get around to that part of the project. I believe the key will be to take the "save complete page" code from chrome://browser/content/contentAreaUtils.js in the core of Firefox and adapt it to the needs of the extension.
Note: In order to use this application you will need to copy JPHOneNoteUtilities.dll into C:\Windows\assembly, or ensure the file is in the same directory as OneNoteHTMLImporter.
Let's look at how OneNoteHTMLImporter.cs works.
OneNote has limited support for HTML pages, but it seems to understand some formatting directives from style sheets. (Does anyone know what the HTML specifications are for OneNote?) My objective was to get all of the content from a web page onto the local computer. That way you don't depend on external web sites to render the page correctly, and you keep a local copy of the information in case the page ever goes away. To do this, use the "Save Page As" option in your browser with the type "Web Page, complete". On the browser's top menu, click "File" and select the "Save Page As" option.

This will bring up a dialog box, which prompts you for where to save the web page. Once you've selected the location for the files, you need to make sure the "Save as type:" is set to "Web Page, complete". Then just click the save button.

Now you can double-click the OneNote HTML Importer application. It will present you with a dialog to pick the file you just saved from the browser. Note: You can include a directory path in the shortcut, and the dialog will look there for files by default.

In the dialog box you will see the filename that you saved, and also a directory. Select the .htm or .html file that you saved. You can ignore the directory; it contains all of the additional files (images, style sheets, etc.) that were found in the web page, and the HTML importer handles them automatically.
Click the image to the left to see a larger version of an imported web page. You can view the actual page here.
When the HTML Importer runs, it creates a directory called HTML File Storage in OneNote's Default Notebook Location. If you aren't sure where this is, go to Tools -> Options -> Save in OneNote. Inside that directory it creates a unique sub-directory for the saved files (the name comes from a call to System.Guid.NewGuid()), then moves the files from their saved location into it. The main HTML file is parsed and modified so that all of its links point to the files in their new location. The importer also looks for the <title> tag in the HTML and uses its contents as the title of the page. Finally, the HTML page is inserted and embedded into a OneNote page.
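The file-handling steps described above can be sketched roughly as follows. This is an illustration in Python, not the importer's actual C# code. The function names, the "_files" suffix for the companion directory, and the use of absolute file:// links are all assumptions made for the example, and the step that embeds the page into OneNote is left out entirely.

```python
import re
import shutil
import uuid
from pathlib import Path
from typing import Tuple


def extract_title(html: str, default: str = "Untitled") -> str:
    """Return the text of the first <title> element, or a default."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    if match is None:
        return default
    # Collapse runs of whitespace and newlines inside the title.
    title = " ".join(match.group(1).split())
    return title or default


def relocate_saved_page(html_file: Path, notebook_root: Path) -> Tuple[Path, str]:
    """Move a saved page into a unique storage directory and fix its links.

    Returns the new location of the HTML file and the page title.
    """
    # Unique sub-directory, playing the role of System.Guid.NewGuid().
    storage = notebook_root.resolve() / "HTML File Storage" / str(uuid.uuid4())
    storage.mkdir(parents=True)

    html = html_file.read_text(encoding="utf-8", errors="replace")

    # Assumption: the browser saved the extra files in "<name>_files".
    companion = html_file.with_name(html_file.stem + "_files")
    if companion.is_dir():
        moved = storage / companion.name
        shutil.move(str(companion), str(moved))
        # Point src/href attributes at the new absolute location.
        new_prefix = moved.as_uri() + "/"
        pattern = re.compile(
            r"((?:src|href)\s*=\s*[\"'])(?:\./)?" + re.escape(companion.name) + "/",
            re.IGNORECASE,
        )
        html = pattern.sub(lambda m: m.group(1) + new_prefix, html)

    new_html = storage / html_file.name
    new_html.write_text(html, encoding="utf-8")
    html_file.unlink()  # the original copy is no longer needed
    return new_html, extract_title(html)
```

Calling relocate_saved_page(Path("page.html"), notebook_dir) on a page saved as "Web Page, complete" would leave the page and its page_files directory under a fresh GUID-named folder. A real implementation would also need to handle URL-encoded directory names and the localized suffixes that some browsers use for the companion directory.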
The one downside to maintaining the files outside of OneNote is that they are not tied to the OneNote page that is created. If you delete the page from OneNote, the directory with the extra files will stick around.
OneNote takes all of the external information it can use and embeds that into the page. The only reason to maintain an external copy of the information is so that you can render the page in a web browser.
Maybe the thing to do would be to add a checkbox to the open dialog box that lets you pick whether or not you want the page to be accessible outside of OneNote.
This software is distributed on an "AS IS" basis, without warranties or conditions of any kind, either express or implied.


Comments [ 9 ]
"Clip to OneNote" is a FF Extension that will Send To. It's available on http://www.OneNotePowerToys.com
– AdminID @ 10:45 PM on Dec 3, 2007
hi there! just what I was looking for... first steps go all right, but then I get an error message:
Error
Directory move failed: D:\tramdsm\bureaublad\onenote-2007-the-importer-files -> D:\tramdsm\tekst\OneNote Notebooks\HTML File Storage\eal bdf28-4baO-48ab-b729-38fc4de54edS\onenote-2007-the-importer_files
when I click ok the page text gets inserted -but without the pics of course. Any idea?
the onenote importer doesn't run at all, so maybe it's just something I did wrong with the placement of the dll? (couldn't get it to be placed in \assembly)
thanks for your work!
– Mitchke @ 6:53 AM on Dec 4, 2007
I have uploaded new versions of the program to see if I can get some more information about the error you are running into.
Re-download the following files:
http://www.stratusnine.com/cgi-bin/download.cgi/JPHOneNoteUtilities.dll
http://www.stratusnine.com/cgi-bin/download.cgi/OneNoteHTMLImporter.exe
Try the HTML Importer again, and send me any error output you get.
As for the OneNoteImporter, it is a no-frills program and does not produce any screen output. You must tell it which directory to look in, either by modifying the shortcut that calls it or by supplying the directory on the command line:
onenoteimporter c:\place\to\find\dirs
Also note that the location you specify (c:\place\to\find\dirs in the example above) is expected to contain only sub-directories. Files at the top level are ignored. Each sub-directory is processed individually and should become an Unfiled OneNote page.
I know these programs are rough around the edges -- no one ever showed interest before, so I never took the time to improve them ...
--Jamie
– Jamie @ 10:22 AM on Dec 4, 2007
Hello, I just wanted to send my thanks for the HTML Importer for OneNote. It works great for all of the web pages I have tried so far. Pages imported to OneNote look much better than with the "clip to onenote" FF extension, and they retain links and options that are not available if you simply print a web page to OneNote. Thanks!!
– Jason @ 7:17 PM on Dec 5, 2007
Thank you for developing this. I have been looking for something like this. I am having problems getting it to work, but I suspect that is because I am running the older version of OneNote. Will this work with that version, or do I need to upgrade?
– niki kircher @ 11:31 AM on Feb 29, 2008
I have just run across this item and am very intrigued by it. The problem is that when I run the installer it crashes. No error message or anything, the program just crashes. I am running Vista Ultimate 64 bit and OneNote 2007 and have complete administrator privileges. I'd love to work with this add-in, so if you have any suggestions they would be greatly appreciated.
– Terry Allan Bennett @ 5:14 PM on Jul 7, 2008
@Terry Allan Bennett
I've seen that happen based on a couple of different conditions.
1. The JPHOneNoteUtilities.dll is not located in the same directory as OneNoteHTMLImporter.exe.
2. The program is being run from a network drive or some place other than the "My Computer" security zone.
I hope this helps.
– Jamie @ 5:29 PM on Jul 7, 2008
@niki kircher
These utilities only work with OneNote 2007 RTM. The API was not available for OneNote 2003. Also, the API, or more specifically the XML namespace, changed between 2007 beta and 2007 RTM, so these utilities do not work with the 2007 beta either.
– Jamie @ 5:34 PM on Jul 7, 2008
I downloaded the newest version and tried to import a website saved with Opera's "save as", which results in an empty notebook. Importing from FF results in this:
http://62.69.200.53/blad_importer.png
can you help me?
– Robert @ 7:19 AM on Sep 5, 2008