decompression problem

  • Thread starter Frederik Vanderhaegen

Frederik Vanderhaegen

Hey,

I've written an HTTP sniffer that monitors the traffic on port 80.
The application works as it should, but there is one small problem.
Suppose I run the application and surf to www.google.be. I capture
all the HTTP messages (requests and responses),
but the content of the response is gzip encoded (HTTP compression).

Here you see the original message:

HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Server: GWS/2.1
Content-Length: 1763
Date: Sun, 01 Oct 2006 09:22:42 GMT

< ÿWërÛ¸þï§@èF×'(ÛqÖµDf'l.înÓNsíL[·<$±
-{=~zþècì<õ? %1-´òX"Às¿|8~¦äѼ sFó %.1Õ-5»
½D
ÂOÍMiW¡gàÚ-qF',*
&üùÓÛñTÍ
3¢wRææ[͵¹±?OÆã½X¦7#"Zè¨MSÛ EZ3Z2~sN£|¤©Ðc
Sew{f¿Âùñ´ºÆ­åm"¹TçûÓir·7GóÀ?ß>ëD±ÊàfU"Õ"1L
¢³¡>ʤ.ÑöI6YN2\è¡?»Û%|1¬I6Jh:JÌ(IG:÷oY6\ó²'æ ýÛ+ª¬~Håj"')üüñâµ,+)'¼
;6Ï è"V0³üuèyÍfdY+ªhÙ­QýffeÄså[ò>"Þ!Ø?,SÓ?Áþä»`äy¾¿Ùº< ò'÷ôø.ç7$ó·´ $k¤ànóõùÛJøÜØ3+raÝú­'Ð Pû MÃOÞa'ópÍ}^ÜÆ1ß.ZI©XãXøLæÙGúýïÿ¾þñmýþÃÉûàê|v§ÀÔJ£jÀÄÁÔîan»o®Dmý8oj
ÜÏs±õbAØáL,Ü"IÈ.[zѳ~P·Ê²?PSY$újëfY.TåL"'Ä=ÀòÂ"'Ñ<ÁäÂÒ¦1K.,
§$ÎuE&òvUÑ4u«KMM§O'Ͳ¦Vw.Be.!WSVÑÜ7±ÅZ¢9¶z,,ôlêó X­V"¼i¤I"Ë?&?¬.ÑÁOÍ{a>'?BKÞ§Z!<Ü<^RQOAÌSbâIí-µK'ÖÏgÖÈhÎÊ-ÅzÁÑ6_ÐÄ¢ Iß÷~\ºpíOÃ.[ÄÒY:å¶È\YE5äX¢­!§g)ßY9Êx­a(ûkòZ¶äN¿÷?:Þôüz]^?ÄÄ"S­Ãb-ûX{X×Hß3
yuOûÉN7NZr¥túÛskÛçyfÿHfYáizW×Ö¢WÀsöÛoa_-NW±Âæ̤* m
4 4P.DÐ.F*÷Ay©??°vk¨.Ñ¿¹®È`ð0®"CfYØfis?Ë{¤Ãâc
j°x7¬a,ÒáúOÙÐ[?zÿ[Ï¿mÅ.ø¶fQ<|Î$Kø
~÷Ïï<dT-(ë¸?-FÄ7÷jv×aÙÑ
ûvy¶®ÝpGïÁ <MOMÒ@ĺsÝÿFdqé_¶`IÕ<#hP¸bÂákÎ'EèµÖcSLÁ´?CÉb@A«|³¢>r%ëJoÁS®ª¾þüQý?0'¶f?,
úÿlÀ|n[ VºoxÔ, ê.~To¼Að? ÃÁYÙj×·¶)*T˧è²-3?̵Øe8oÖ.â0áøôikæ
t'Um^êÂ,¥x:¸f-8icÒÑ"ôsfÈ­Ðé³3W|§§Z|ÙRãYÑÌy "_%,@x
6lëÒu\2Óqõ?ÈØ^w°\,H* oËÅ?çb'乶ÙSÉvG_M"¹?¦WT

When I copy the encoded content into Notepad, save it as a .gz file
and try to decompress it, I get "CRC is invalid".
I've tried several things but nothing seems to work.

Does anybody know how I can decompress the content of this message to retrieve
the HTML code?

Thx in advance

Frederik
 

Jon Slaughter

Frederik Vanderhaegen said:
When I copy the encoded content into Notepad, save it as a .gz file
and try to decompress it, I get "CRC is invalid".
I've tried several things but nothing seems to work.


Because when you copy and paste, you are not copying all the bytes. The body
is binary data, and many of its bytes are control characters that don't show up
or get altered in a text editor, such as tabs (0x09), line feeds (0x0A), etc.

If you stored the message as binary, you can use a hex editor to remove
the text header and it might work. (Or just save the body as a binary
file with a .gz extension, since the data is gzip and not zip, and then open it.)
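
To illustrate the point above, here is a minimal sketch in Python (used only for illustration; it is not the poster's .NET code). It assumes the sniffer can hand over the raw bytes of one complete response. The helper name split_http_message and the sample response are made up for the example. It splits the header from the body at the first blank line, decompresses the body, and then simulates the copy-and-paste route to show that the bytes no longer match.

```python
import gzip

def split_http_message(raw: bytes):
    """Split a raw HTTP message into (header_bytes, body_bytes) at the first blank line."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no header/body separator found")
    return head, body

# Stand-in for a captured response (hypothetical data, not the real reply from the thread).
html = b"<html><body>hello</body></html>"
body = gzip.compress(html)
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Encoding: gzip\r\n"
       b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n\r\n" + body)

# Working on bytes end to end keeps the gzip data intact.
head, captured_body = split_http_message(raw)
recovered = gzip.decompress(captured_body)

# Simulate pasting into a text editor: bytes that are not plain ASCII get lost.
# The second byte of every gzip stream is 0x8b, so something is always dropped.
pasted = captured_body.decode("ascii", errors="ignore").encode("ascii")
text_roundtrip_intact = (pasted == captured_body)
```

Writing captured_body to disk in binary mode (for example open("body.gz", "wb")) gives a file that a normal gzip tool should be able to open, whereas the pasted version fails the integrity check.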
 

Joerg Jooss

Thus wrote Chris,
You can use the GZipStream class to decompress the response stream:
http://msdn2.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx

Or take a look at Fiddler:
http://www.fiddlertool.com/fiddler/
This is an HTTP sniffer with built-in support for gzip decompression.
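
GZipStream is a .NET class; the same stream-style idea can be sketched in Python, again only as an illustration. A sniffer usually sees the body spread over several TCP segments, so this sketch feeds the pieces to a decoder one at a time. The function name decompress_chunks, the sample page and the 7-byte segment size are assumptions made for the example.

```python
import gzip
import zlib

def decompress_chunks(chunks):
    """Decompress a gzip body that arrives as an iterable of byte chunks."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    out = bytearray()
    for chunk in chunks:
        out += decoder.decompress(chunk)
    out += decoder.flush()
    # eof is only set once the gzip trailer (CRC and length) has been read.
    if not decoder.eof:
        raise ValueError("gzip stream ended early; some segments are missing")
    return bytes(out)

# Hypothetical body, cut into small pieces to mimic separate segments.
page = b"<html>" + bytes(range(256)) * 20 + b"</html>"
compressed = gzip.compress(page)
segments = [compressed[i:i + 7] for i in range(0, len(compressed), 7)]
result = decompress_chunks(segments)
```

If a segment is missing or the body is cut short, the decoder never reaches the trailer and the function raises instead of returning partial HTML.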

In .NET 2.0, HttpWebResponse can decompress HTTP messages automatically,
if you set HttpWebRequest.AutomaticDecompression to an appropriate value
(i.e. anything other than DecompressionMethods.None).
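
For a sniffer that handles the raw bytes itself, a rough equivalent of automatic decompression is to look at the Content-Encoding header and decode accordingly. The following Python sketch is an illustration only, not how HttpWebResponse works internally; parse_headers and decode_body are hypothetical helper names, and the header block is taken from the response shown earlier in the thread.

```python
import gzip
import zlib

def parse_headers(head: bytes) -> dict:
    """Parse a header block into a dict with lower-cased names (status line skipped)."""
    headers = {}
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.decode("latin-1").partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def decode_body(headers: dict, body: bytes) -> bytes:
    """Return the body with a gzip or deflate Content-Encoding removed."""
    encoding = headers.get("content-encoding", "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        # Assumes the zlib-wrapped form; some servers send raw deflate instead.
        return zlib.decompress(body)
    if encoding == "identity":
        return body
    raise ValueError("unsupported Content-Encoding: " + encoding)

head = (b"HTTP/1.1 200 OK\r\n"
        b"Cache-Control: private\r\n"
        b"Content-Type: text/html; charset=UTF-8\r\n"
        b"Content-Encoding: gzip\r\n"
        b"Server: GWS/2.1")
headers = parse_headers(head)
plain = decode_body(headers, gzip.compress(b"<html></html>"))
```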

Cheers,
 
