How to load HTML into mshtml.HTMLDocumentClass with UCOMIPersistFile and my ignorance

Scott Hanselman

Scott Hanselman is the Chief Architect at Corillian Corporation and the Microsoft RD for Oregon.
Posted: Jun 25, 2004 9:11 PM

This post originated from an RSS feed registered with .NET Buzz by Scott Hanselman.
Original Post: How to load HTML into mshtml.HTMLDocumentClass with UCOMIPersistFile and my ignorance
Feed Title: Scott Hanselman's ComputerZen.com
Feed URL: http://radio-weblogs.com/0106747/rss.xml
Feed Description: Scott Hanselman's ComputerZen.com is a .NET/WebServices/XML Weblog. I offer details of obscurities (internals of ASP.NET, WebServices, XML, etc) and best practices from real world scenarios.

What a weird one.  I'm looking at the source for NDoc.Document.HtmlHelp2.Compiler.HtmlHelpFile.  It uses the Microsoft.mshtml interop Assembly to load an HTML file into the HTMLDocumentClass for easy parsing.

Its code looks like this (DOESN'T WORK):

private HTMLDocumentClass GetHtmlDocument( FileInfo f )
{
  HTMLDocumentClass doc = new HTMLDocumentClass();
  UCOMIPersistFile persistFile = (UCOMIPersistFile)doc;
  persistFile.Load( f.FullName, 0 );
  int start = Environment.TickCount;
  // Spins at full CPU; nothing in this loop pumps messages while it waits.
  while( doc.body == null )
  {
    if ( Environment.TickCount - start > 10000 )
    {
      throw new Exception( string.Format( "The document {0} timed out while loading", f.Name ) );
    }
  }
  return doc;
}

I went searching as it was taking up 100% CPU for an hour and never completed.  Now I know why! :)

What's weird is that the only way I could get it to work (IPersistFile.Load appears to return before the document has finished loading) was with this change (NOW IT WORKS):

// Needs: using System; using System.IO; using System.Runtime.InteropServices; using mshtml;
private HTMLDocumentClass GetHtmlDocument( FileInfo f )
{
  HTMLDocumentClass doc = new HTMLDocumentClass();
  UCOMIPersistFile persistFile = (UCOMIPersistFile)doc;
  persistFile.Load( f.FullName, 0 );
  int start = Environment.TickCount;
  while( doc.readyState != "complete" )
  {
    // Run the message pump so the load can make progress.
    System.Windows.Forms.Application.DoEvents();
    if ( Environment.TickCount - start > 10000 )
    {
      throw new Exception( string.Format( "The document {0} timed out while loading", f.Name ) );
    }
  }
  return doc;
}

When I look at DoEvents() in Reflector I can see that it's doing more than a Sleep(0) (yield); it's actually running the message pump.  Am I missing something?  Apparently IPersistFile needs the message pump?  Well, it works, but it's gross.
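
One way to make the workaround a little less gross might be to pull the pump-and-poll loop out into a small helper. The sketch below is only an illustration under stated assumptions: PumpingWait and Until are hypothetical names (not part of NDoc, mshtml, or the .NET Framework), and it relies on Func<bool>, Action, and Stopwatch, which need a newer framework than the 2004-era code above. The caller supplies both the condition to poll and the callback that pumps messages, so the helper itself has no Windows Forms or COM dependency.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (an assumption for illustration, not an existing API):
// polls a condition and calls a caller-supplied "pump" callback between checks.
public static class PumpingWait
{
    // Returns true once isDone() reports true, or false if the timeout
    // elapses first. The caller decides how to report a timeout.
    public static bool Until(Func<bool> isDone, Action pump, TimeSpan timeout)
    {
        if (isDone == null) throw new ArgumentNullException("isDone");
        if (pump == null) throw new ArgumentNullException("pump");

        // Stopwatch measures elapsed time without wall-clock adjustments.
        Stopwatch elapsed = Stopwatch.StartNew();
        while (!isDone())
        {
            if (elapsed.Elapsed > timeout)
            {
                return false;
            }
            // For the mshtml case this would be
            // System.Windows.Forms.Application.DoEvents.
            pump();
        }
        return true;
    }
}
```

With this in place, the loop in GetHtmlDocument could presumably shrink to a single call along the lines of PumpingWait.Until(() => doc.readyState == "complete", System.Windows.Forms.Application.DoEvents, TimeSpan.FromSeconds(10)), throwing the timeout exception when it returns false. It still pumps messages on the calling thread, so it tidies the code rather than removing the need for DoEvents().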
