Hi there
I need to convert a Word doc into XML format.
MS Word has its own conversion to XML. I don't think it's possible to
write your own filter for Word's proprietary format, so the only option
I see is to use Word automation. Be prepared for bugs and undocumented
syntax quirks.

Here's some code that might help you, but I warn you: this whole thing
never actually made it to my desktop ...
// Don't forget to add a reference to the Word interop assembly
// (Microsoft.Office.Interop.Word).
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// The members below go inside your own class.

[DllImport("User32.dll")]
public static extern int ShowWindowAsync(IntPtr hWnd, int swCommand);

// Stand-in for the optional parameters Word expects
private object missing = Type.Missing;

// Kept as a field so that CleanUpWordResources() can reach it
private Word.Application wordApp;

private void SaveAsXML(string fileName)
{
    wordApp = new Word.Application();

    // For reasons of performance and to keep the user from
    // messing things up:
    wordApp.Visible = false;
    wordApp.ScreenUpdating = false;

    // Word needs a lot of parameters that are optional in VB
    // but need to be passed in C#
    object xmlFormat = Word.WdSaveFormat.wdFormatXML;
    object f = fileName;
    // Save under a new name so the original document is not overwritten
    object xmlFile = Path.ChangeExtension(fileName, ".xml");

    Word.Document doc = wordApp.Documents.Open(
        ref f,
        ref missing, ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing);

    // Here comes the actual saving part
    doc.SaveAs(
        ref xmlFile,
        ref xmlFormat,
        ref missing, ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing);

    doc.Close(
        ref missing,
        ref missing,
        ref missing);

    CleanUpWordResources();
}
[DllImport("User32.dll")]
public static extern bool SetForegroundWindow(IntPtr hWnd);

// Name of the Word process as the OS reports it (without ".exe")
private const string PROCESS_WORD = "WINWORD";

// Assumes wordApp and missing are fields of the enclosing class.
private void CleanUpWordResources()
{
    // BUG: the COM proxy's reference (wordApp) seems to remain available
    // after the Word application has been closed. To keep the
    // wordApp != null check from leading to an error, it is safer to
    // first look for running processes at OS level.
    Process[] processWord = Process.GetProcessesByName(PROCESS_WORD);
    if (processWord.Length != 0)
    {
        if (wordApp != null)
        {
            // Bring the application to the front and close any open
            // Word windows
            foreach (Process p in processWord)
            {
                ShowWindowAsync(p.MainWindowHandle,
                    (int)ShowWindowConstants.SW_SHOWMAXIMIZED);
            }
            SetForegroundWindow(processWord[0].MainWindowHandle);
            wordApp.Quit(ref missing, ref missing, ref missing);
            wordApp = null;
            GC.WaitForPendingFinalizers();
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }
}
private enum ShowWindowConstants : int
{
    SW_HIDE = 0,
    SW_SHOWNORMAL = 1,
    SW_NORMAL = 1,
    SW_SHOWMINIMIZED = 2,
    SW_SHOWMAXIMIZED = 3,
    SW_MAXIMIZE = 3,
    SW_SHOWNOACTIVATE = 4,
    SW_SHOW = 5,
    SW_MINIMIZE = 6,
    SW_SHOWMINNOACTIVE = 7,
    SW_SHOWNA = 8,
    SW_RESTORE = 9,
    SW_SHOWDEFAULT = 10,
    SW_FORCEMINIMIZE = 11,
    SW_MAX = 11
}
There's some redundant code in there that brings the Word application
to the foreground (should you decide to make it visible by setting its
Visible property to true) and then closes it, to notify the user that
something is happening. I left it in because I thought you might need
it once you start messing around with Office automation.
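
If you don't need the window handling, a leaner cleanup could look
roughly like the sketch below. It is untested on my side; the class and
method names (WordCleanup, QuitAndRelease) are made up for illustration,
and it assumes the caller holds the only reference to the Word.Application
object.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the code above.
public static class WordCleanup
{
    public static void QuitAndRelease(ref Word.Application app)
    {
        if (app == null)
        {
            return; // nothing to clean up
        }

        object missing = Type.Missing;
        try
        {
            // Ask Word to shut down
            app.Quit(ref missing, ref missing, ref missing);
        }
        finally
        {
            // Release the COM wrapper explicitly instead of waiting
            // for the garbage collector, then clear the reference
            Marshal.ReleaseComObject(app);
            app = null;
        }
    }
}
```

You would call it as WordCleanup.QuitAndRelease(ref wordApp) instead of
CleanUpWordResources(). Whether this alone makes the WINWORD process go
away depends on what other COM references are still alive, so treat it
as a starting point.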
Storing the XML in a database shouldn't be too difficult.
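
A minimal sketch of the database part, assuming SQL Server and a table
Documents with the columns FileName (nvarchar) and Content
(nvarchar(max)). The table, the column names, the class name WordXmlStore
and the connection string are all assumptions of mine, so adapt them to
your schema.

```csharp
using System.Data;
using System.Data.SqlClient;
using System.IO;

// Hypothetical helper; table and column names are assumptions.
public static class WordXmlStore
{
    // Builds a parameterized INSERT so the XML text is never
    // concatenated into the SQL string.
    public static SqlCommand BuildInsertCommand(string fileName, string xml)
    {
        SqlCommand cmd = new SqlCommand(
            "INSERT INTO Documents (FileName, Content) " +
            "VALUES (@name, @content)");
        cmd.Parameters.Add("@name", SqlDbType.NVarChar, 260).Value = fileName;
        // A size of -1 maps to nvarchar(max)
        cmd.Parameters.Add("@content", SqlDbType.NVarChar, -1).Value = xml;
        return cmd;
    }

    // Reads the XML file that Word wrote and stores it as one row.
    public static void StoreXml(string xmlPath, string connectionString)
    {
        string xml = File.ReadAllText(xmlPath);
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd =
            BuildInsertCommand(Path.GetFileName(xmlPath), xml))
        {
            cmd.Connection = conn;
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

I went for a plain text column here. If you would rather use a typed
XML column, check how your database handles the encoding declaration at
the top of the file Word produces.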
Good luck!
Joachim Van den Bogaert