<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Fast deserialization in Python</title>
	<atom:link href="http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/</link>
	<description>building machine learning and natural language processing tools</description>
	<lastBuildDate>Sun, 28 Feb 2010 23:20:52 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Nir</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-36</link>
		<dc:creator>Nir</dc:creator>
		<pubDate>Tue, 10 Nov 2009 08:41:47 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-36</guid>
		<description>It seems that Bob Ippolito has fixed simplejson&#039;s slowness.
Try again with the latest version.</description>
		<content:encoded><![CDATA[<p>It seems that Bob Ippolito has fixed simplejson&#39;s slowness.<br />
Try again with the latest version.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Millikin</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-9</link>
		<dc:creator>John Millikin</dc:creator>
		<pubDate>Tue, 24 Mar 2009 13:59:48 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-9</guid>
		<description>(reposting a comment from Hacker News, at Joseph Turian&#039;s request)&lt;br&gt;&lt;br&gt;I&#039;m the author of jsonlib, and I registered specifically to post this message. Please, please, please do not use cjson!&lt;br&gt;&lt;br&gt;First, it is unmaintained. The latest version available was posted on August 24, 2007. When you encounter one of its myriad bugs, you&#039;ll either have to patch it yourself or pick another JSON library. Just skip the intermediate step and use another library to begin with.&lt;br&gt;&lt;br&gt;Second, it is buggy. In some cases, parsing text it just generated will return a different value from what you passed in! It&#039;s almost entirely ignorant of Unicode, and what little it tries to parse it gets wrong.&lt;br&gt;&lt;br&gt;Third, it&#039;s exceedingly non-compliant. The text it parses and generates bears only a passing resemblance to JSON. There are varying degrees of conformance to the spec between libraries, based on personal preference of the authors -- I prefer strict conformance, others less strict -- but cjson is so different as to be simply unusable.&lt;br&gt;&lt;br&gt;Yes, it&#039;s fast. I know. I wrote jsonlib partly because I was unsatisfied with simplejson&#039;s performance, and one goal (never truly achieved) was always to surpass cjson. However, speed isn&#039;t everything. As the saying goes, &quot;if I want my math performed fast and wrong I&#039;ll ask my cat&quot;.&lt;br&gt;&lt;br&gt;In my opinion, the only Python JSON libraries worth considering are:&lt;br&gt;&lt;br&gt;* simplejson -- it&#039;s in the standard library, and should therefore be considered first and most thoroughly.&lt;br&gt;&lt;br&gt;* jsonlib -- it&#039;s fast, well-tested, and standards-compliant.&lt;br&gt;&lt;br&gt;* demjson -- has several options for reliable parsing of invalid input.&lt;br&gt;&lt;br&gt;Last time I checked, jsonlib and simplejson&#039;s C extensions are neck-and-neck performance-wise. 
In some quick, unscientific tests, jsonlib reads faster and simplejson writes faster. However, simplejson&#039;s extensions are only used for certain subsets of input -- if you want to use an uncommon feature, performance will degrade. jsonlib has an implementation in pure C, which avoids this problem at the cost of complexity.&lt;br&gt;&lt;br&gt;Apologies for the brain-dump, but even if you skip right over it, please remember: don&#039;t use cjson.</description>
		<content:encoded><![CDATA[<p>(reposting a comment from Hacker News, at Joseph Turian&#39;s request)</p>
<p>I&#39;m the author of jsonlib, and I registered specifically to post this message. Please, please, please do not use cjson!</p>
<p>First, it is unmaintained. The latest version available was posted on August 24, 2007. When you encounter one of its myriad bugs, you&#39;ll either have to patch it yourself or pick another JSON library. Just skip the intermediate step and use another library to begin with.</p>
<p>Second, it is buggy. In some cases, parsing text it just generated will return a different value from what you passed in! It&#39;s almost entirely ignorant of Unicode, and what little it tries to parse it gets wrong.</p>
<p>Third, it&#39;s exceedingly non-compliant. The text it parses and generates bears only a passing resemblance to JSON. There are varying degrees of conformance to the spec between libraries, based on personal preference of the authors &#8212; I prefer strict conformance, others less strict &#8212; but cjson is so different as to be simply unusable.</p>
<p>Yes, it&#39;s fast. I know. I wrote jsonlib partly because I was unsatisfied with simplejson&#39;s performance, and one goal (never truly achieved) was always to surpass cjson. However, speed isn&#39;t everything. As the saying goes, &#8220;if I want my math performed fast and wrong I&#39;ll ask my cat&#8221;.</p>
<p>In my opinion, the only Python JSON libraries worth considering are:</p>
<p>* simplejson &#8212; it&#39;s in the standard library, and should therefore be considered first and most thoroughly.</p>
<p>* jsonlib &#8212; it&#39;s fast, well-tested, and standards-compliant.</p>
<p>* demjson &#8212; has several options for reliable parsing of invalid input.</p>
<p>Last time I checked, jsonlib and simplejson&#39;s C extensions are neck-and-neck performance-wise. In some quick, unscientific tests, jsonlib reads faster and simplejson writes faster. However, simplejson&#39;s extensions are only used for certain subsets of input &#8212; if you want to use an uncommon feature, performance will degrade. jsonlib has an implementation in pure C, which avoids this problem at the cost of complexity.</p>
<p>Apologies for the brain-dump, but even if you skip right over it, please remember: don&#39;t use cjson.</p>
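<p>The round-trip problem described above (parsing text a library just generated and getting back a different value) can be checked with a small harness. The following is a minimal sketch, not a test of cjson itself: it uses the standard library <code>json</code> module as a stand-in, and the helper name <code>round_trips</code> and the sample values are assumptions made for illustration.</p>

```python
import json

def round_trips(value):
    """Return True if decoding the encoded text gives back an equal value."""
    text = json.dumps(value)
    return json.loads(text) == value

# Hypothetical sample values chosen to exercise slashes, escaping,
# and non-ASCII text.
SAMPLES = [
    {"path": "a/b/c"},
    ["caf\u00e9", "\u65e5\u672c\u8a9e"],
    {"quote": 'say "hi"', "backslash": "C:\\temp"},
    [1, 2.5, True, None],
]

failures = [value for value in SAMPLES if not round_trips(value)]
print("round-trip failures:", failures)
```

<p>To try another library, replace the two <code>json</code> calls inside <code>round_trips</code> with that library's encode and decode functions. Any value left in <code>failures</code> is one that did not survive the round trip.</p>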
]]></content:encoded>
	</item>
	<item>
		<title>By: Joseph Turian</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-8</link>
		<dc:creator>Joseph Turian</dc:creator>
		<pubDate>Mon, 23 Mar 2009 16:50:00 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-8</guid>
		<description>I am excited for a faster protobuf. In particular, haberman&#039;s &lt;a href=&quot;http://github.com/haberman/pbstream/tree/master&quot; rel=&quot;nofollow&quot;&gt;C extensions&lt;/a&gt; look promising.&lt;br&gt;&lt;br&gt;Compactness is very important for transferring data over a network.&lt;br&gt;However, during the development cycle, human readability is important and often overlooked. If all you need to do to read your data is type &#039;zcat&#039;, you are much more likely to be looking at your data, and hence more likely to catch bugs.</description>
		<content:encoded><![CDATA[<p>I am excited for a faster protobuf. In particular, haberman&#39;s <a href="http://github.com/haberman/pbstream/tree/master" rel="nofollow">C extensions</a> look promising.</p>
<p>Compactness is very important for transferring data over a network.<br />However, during the development cycle, human readability is important and often overlooked. If all you need to do to read your data is type &#39;zcat&#39;, you are much more likely to be looking at your data, and hence more likely to catch bugs.</p>
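<p>One way to get both some compactness and zcat-level readability is gzip-compressed JSON with one record per line. This is a minimal sketch under that assumption, not the format used in the post; the file name, the helper names <code>write_records</code> and <code>read_records</code>, and the sample records are hypothetical.</p>

```python
import gzip
import json
import os
import tempfile

def write_records(path, records):
    """Write one JSON document per line into a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, sort_keys=True) + "\n")

def read_records(path):
    """Read the records back, one JSON document per line."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle]

# Hypothetical records, only for illustration.
records = [{"id": 1, "label": "spam"}, {"id": 2, "label": "ham"}]
path = os.path.join(tempfile.mkdtemp(), "records.json.gz")
write_records(path, records)
restored = read_records(path)
```

<p>Running <code>zcat records.json.gz</code> on the resulting file prints one record per line, so the data stays easy to inspect during development.</p>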
]]></content:encoded>
	</item>
	<item>
		<title>By: Justin</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-7</link>
		<dc:creator>Justin</dc:creator>
		<pubDate>Mon, 23 Mar 2009 12:11:32 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-7</guid>
		<description>Nice writeup :-) Good to see that you get the same results on a more complicated data structure.&lt;br&gt;&lt;br&gt;I still have high hopes for protobuf: it can get faster, but JSON can&#039;t get any smaller. At some point protobuf will be both the fastest and most compact method.</description>
		<content:encoded><![CDATA[<p>Nice writeup :-) Good to see that you get the same results on a more complicated data structure.</p>
<p>I still have high hopes for protobuf: it can get faster, but JSON can&#39;t get any smaller. At some point protobuf will be both the fastest and most compact method.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jasper Spaans</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-6</link>
		<dc:creator>Jasper Spaans</dc:creator>
		<pubDate>Mon, 23 Mar 2009 09:06:02 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-6</guid>
		<description>Could the slower simplejson install be doing something with locales? I&#039;ve seen grep become really slow when handling UTF-8 input, and the slowdown disappeared after setting LANG=C / LC_ALL=C.</description>
		<content:encoded><![CDATA[<p>Could the slower simplejson install be doing something with locales? I&#39;ve seen grep become really slow when handling UTF-8 input, and the slowdown disappeared after setting LANG=C / LC_ALL=C.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joseph Turian</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-5</link>
		<dc:creator>Joseph Turian</dc:creator>
		<pubDate>Mon, 23 Mar 2009 04:58:47 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-5</guid>
		<description>According to &lt;a href=&quot;http://kbyanc.blogspot.com/2007/07/python-serializer-benchmarks.html&quot; rel=&quot;nofollow&quot;&gt;Extra Cheese&lt;/a&gt;, cjson has an incompatibility with simplejson in processing slashes. A fix is available from &lt;a href=&quot;http://www.vazor.com/cjson.html&quot; rel=&quot;nofollow&quot;&gt;vazor&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>According to <a href="http://kbyanc.blogspot.com/2007/07/python-serializer-benchmarks.html" rel="nofollow">Extra Cheese</a>, cjson has an incompatibility with simplejson in processing slashes. A fix is available from <a href="http://www.vazor.com/cjson.html" rel="nofollow">vazor</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joseph Turian</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-4</link>
		<dc:creator>Joseph Turian</dc:creator>
		<pubDate>Mon, 23 Mar 2009 04:10:02 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-4</guid>
		<description>An &lt;a href=&quot;http://kbyanc.blogspot.com/2007/07/python-serializer-benchmarks.html&quot; rel=&quot;nofollow&quot;&gt;older benchmark&lt;/a&gt;, showing that marshal might be the fastest.</description>
		<content:encoded><![CDATA[<p>An <a href="http://kbyanc.blogspot.com/2007/07/python-serializer-benchmarks.html" rel="nofollow">older benchmark</a>, showing that marshal might be the fastest.</p>
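<p>A claim like "marshal might be the fastest" is easy to re-check on your own data. The sketch below times deserialization only, using the standard library <code>marshal</code>, <code>pickle</code>, and <code>json</code> modules with <code>timeit</code>. The payload is a hypothetical one, and the ranking may differ for other data shapes, Python versions, or machines.</p>

```python
import json
import marshal
import pickle
import timeit

# Hypothetical payload; real data may rank the formats differently.
data = {"tokens": ["fast", "deserialization"] * 500, "scores": list(range(1000))}

# Serialize once up front so only deserialization is timed.
encoded = {
    "marshal": (marshal.loads, marshal.dumps(data)),
    "pickle": (pickle.loads, pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)),
    "json": (json.loads, json.dumps(data)),
}

timings = {}
for name, (loads, blob) in encoded.items():
    assert loads(blob) == data  # each format must reproduce the payload
    timings[name] = timeit.timeit(lambda: loads(blob), number=200)

# Print the formats from fastest to slowest for this payload.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:8s} {seconds:.4f} s for 200 loads")
```

<p>Speed is not the only consideration: the marshal format is specific to the Python version that wrote it, and neither marshal nor pickle should be used to load untrusted data.</p>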
]]></content:encoded>
	</item>
	<item>
		<title>By: Joseph Turian</title>
		<link>http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/comment-page-1/#comment-3</link>
		<dc:creator>Joseph Turian</dc:creator>
		<pubDate>Mon, 23 Mar 2009 03:06:25 +0000</pubDate>
		<guid isPermaLink="false">http://blog.metaoptimize.com/?p=5#comment-3</guid>
		<description>This &lt;a href=&quot;http://www.reddit.com/r/programming/comments/811gl/comparing_thrift_protocol_buffers_and_compressed/&quot; rel=&quot;nofollow&quot;&gt;reddit thread&lt;/a&gt; has some good discussion of an earlier study.&lt;br&gt;&lt;br&gt;&lt;a href=&quot;http://www.reddit.com/r/programming/comments/811gl/comparing_thrift_protocol_buffers_and_compressed/c07ygzg&quot; rel=&quot;nofollow&quot;&gt;This author&lt;/a&gt; points out that thrift as a network protocol is much faster than JSON over HTTP.&lt;br&gt;&lt;br&gt;&lt;a href=&quot;http://www.reddit.com/r/programming/comments/811gl/comparing_thrift_protocol_buffers_and_compressed/c07yjeb&quot; rel=&quot;nofollow&quot;&gt;haberman&lt;/a&gt; points out that he is writing &lt;a href=&quot;http://github.com/haberman/pbstream/tree/master&quot; rel=&quot;nofollow&quot;&gt;C bindings&lt;/a&gt; for Python protobuf.</description>
		<content:encoded><![CDATA[<p>This <a href="http://www.reddit.com/r/programming/comments/811gl/comparing_thrift_protocol_buffers_and_compressed/" rel="nofollow">reddit thread</a> has some good discussion of an earlier study.</p>
<p><a href="http://www.reddit.com/r/programming/comments/811gl/comparing_thrift_protocol_buffers_and_compressed/c07ygzg" rel="nofollow">This author</a> points out that thrift as a network protocol is much faster than JSON over HTTP.</p>
<p><a href="http://www.reddit.com/r/programming/comments/811gl/comparing_thrift_protocol_buffers_and_compressed/c07yjeb" rel="nofollow">haberman</a> points out that he is writing <a href="http://github.com/haberman/pbstream/tree/master" rel="nofollow">C bindings</a> for Python protobuf.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
