<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>eric the fruitbat &#187; Comp*</title>
	<atom:link href="http://www.cogitolingua.net/blog/category/comp/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.cogitolingua.net/blog</link>
	<description>Sounding out the Noosphere.</description>
	<lastBuildDate>Fri, 03 Feb 2012 23:40:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>The Good IR: Other Control Flow Structures</title>
		<link>http://www.cogitolingua.net/blog/2012/02/01/the-good-ir-other-control-flow-structures/</link>
		<comments>http://www.cogitolingua.net/blog/2012/02/01/the-good-ir-other-control-flow-structures/#comments</comments>
		<pubDate>Thu, 02 Feb 2012 02:39:58 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[design]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=1360</guid>
		<description><![CDATA[<p>In my last post on The Good IR, I had arrived at a representation with two invariants:</p> Only BasicBlock’s carry information about the ControlFlowGraph; No other edges are allowed between BasicBlocks. All BasicBlocks end with a control transfer Instruction. <p>I focused that post solely on the if-then-else control flow structure. I would now like to [...]]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://www.cogitolingua.net/blog//2012/01/27/the-good-ir/">last post</a> on The Good IR, I had arrived at a representation with two invariants:</p>
<ol>
<li>Only BasicBlocks carry information about the ControlFlowGraph; no other edges are allowed between BasicBlocks.</li>
<li>All BasicBlocks end with a control transfer Instruction.</li>
</ol>
<p>I focused that post solely on the if-then-else control flow structure. I would now like to demonstrate how those invariants translate to the while-loop and if-without-else. Although other languages have do-while, for-loop, and switch-case, I won&#8217;t be addressing those now. Instead I&#8217;m limiting my focus to a much simpler language (that used by CS241 here at UCI). I want to get a clear picture of how the if-else and while-loop work, before complicating the picture with Return. So, for this post, I&#8217;m assuming that we have &#8216;the usual case&#8217; where the &#8216;stuff&#8217; inside the body of a loop or branch of an if-then proceeds &#8216;normally&#8217; and doesn&#8217;t exit the function.</p>
<table>
<tr>
<th colspan=3>While Loop</th>
</tr>
<tr>
<td>
<pre>
while (cond) {
    // stuff
}
</pre>
</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/while-loop.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/while-loop-291x300.png" alt="" title="while-loop" width="291" height="300" class="alignnone size-medium wp-image-1401" /></a>
</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/02/while-loop-degenerate.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/02/while-loop-degenerate-171x300.png" alt="" title="while-loop-degenerate" width="171" height="300" class="alignright size-medium wp-image-1414" /></a>
</td>
</tr>
</table>
<p>We should note two important aspects of the illustration:</p>
<ol>
<li>The body of the loop is fully general. That is, I wanted to make clear that the &#8216;stuff&#8217; in the loop body might have control structures which cause a separation of the first block of the body from the last block. This separation is indicated by a series of small orange blocks.</li>
<li>Not in the illustration is a restriction on the instructions in the loop header. During a parse, when the loop is encountered we must create a new block to store two items: the instructions for evaluating the loop condition and the ConditionalBranch. We do <em>not</em> include instructions for any statement prior to the loop condition, even if such statements appear in the source program. We do this because we wish to evaluate <em>only</em> the conditional and <em>no other</em> instructions when control returns along the loop&#8217;s backward control flow edge.</li>
</ol>
<p>Given these observations, the presence of UnconditionalJump and ConditionalBranch is straightforward, and needs no modification from the last post.</p>
<p>Consider the degenerate case, when the loop body is empty. We could choose to draw the control flow graph omitting the body node. We decide <em>not</em> to do so, because it creates a special case. Instead we prefer that the loop body contains <em>at least one</em> block, even if that block holds only an UnconditionalJump back to the loop header block.</p>
<p>Forcing all loops to have at minimum 3 nodes (the loop-header, the body, and the exit block) gives us a stable footing for control flow optimizations which are implemented later in the compiler. It also gives us another invariant: <b>No block points back to itself</b>; all blocks always point to other blocks.</p>
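<p>A minimal sketch of that three-block shape, written in Python for illustration. The names <code>BasicBlock</code>, <code>build_while</code>, and <code>has_self_loop</code> are hypothetical, instructions are stood in for by plain strings, and the body is simplified to a single block even though a real body may span several.</p>

```python
class BasicBlock:
    """A CFG node: a list of instructions plus outgoing control flow edges."""

    def __init__(self, name):
        self.name = name
        self.instructions = []
        self.successors = []  # only BasicBlocks hold CFG edges


def build_while(cond_instrs, body_instrs):
    """Build header, body, and exit blocks for `while (cond) { body }`."""
    header = BasicBlock("loop-header")
    body = BasicBlock("loop-body")
    exit_block = BasicBlock("loop-exit")

    # The header holds only the condition evaluation and the branch.
    header.instructions = list(cond_instrs) + ["ConditionalBranch"]
    header.successors = [body, exit_block]  # [taken, not taken]

    # The body block always exists; when empty it holds just the jump back.
    body.instructions = list(body_instrs) + ["UnconditionalJump"]
    body.successors = [header]

    # The exit block is filled in later by whatever follows the loop.
    return header, body, exit_block


def has_self_loop(blocks):
    """True if any block lists itself as a successor."""
    return any(b in b.successors for b in blocks)


# Degenerate case: an empty loop body still gets its own block.
header, body, exit_block = build_while(["cmp"], [])
```

<p>Because the empty body is still a separate block, the backward edge runs from the body to the header, and <code>has_self_loop</code> stays false even for the degenerate loop.</p>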
<table>
<tr>
<th colspan=3>If-NoElse</th>
</tr>
<tr>
<td>
<pre>
if (cond) {
     // stuff
}
</pre>
</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/02/if-no-else.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/02/if-no-else-221x300.png" alt="" title="if-no-else" width="221" height="300" class="alignnone size-medium wp-image-1418" /></a>
</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/02/if-no-else-degenerate.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/02/if-no-else-degenerate-175x300.png" alt="" title="if-no-else-degenerate" width="175" height="300" class="alignnone size-medium wp-image-1420" /></a>
</td>
</tr>
</table>
<p>The if-noelse is handled similarly. We do not have to create any instructions other than ConditionalBranch and UnconditionalJump. Again, we prefer to keep that &#8216;empty&#8217; else branch, regularizing the if-then-else to a minimum of 4 blocks: the if-header, then-block, else-block, and join-block. Always creating these blocks helps the control flow graph keep an organized and uniform appearance, especially when you consider the nesting of several different control flow structures. An <a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/Nested-If.png">example from the last post</a> would be quite unmanageable without the presence of these &#8216;empty&#8217; blocks. In the if-then-else control structure, keeping these blocks aids in the placement of SSA &phi;-instructions, as well as control flow graph traversals.</p>
<p><b>Design Rule of Thumb:</b> Don&#8217;t have the parser optimize away empty blocks; they&#8217;re actually useful to keep around (both for SSA and CFG traversals).</p>
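<p>A rough Python sketch of the four-block if, under the same assumptions as the rest of this post. <code>build_if</code> and <code>predecessors</code> are hypothetical helper names, and instructions are plain strings. It shows one reason the empty else-block is handy: the join-block always has exactly two predecessors, which is the shape a &phi;-instruction with one operand per predecessor expects.</p>

```python
class BasicBlock:
    """A CFG node: a list of instructions plus outgoing control flow edges."""

    def __init__(self, name):
        self.name = name
        self.instructions = []
        self.successors = []


def build_if(cond_instrs, then_instrs, else_instrs=()):
    """Always build four blocks, even when the source has no else."""
    header = BasicBlock("if-header")
    then_block = BasicBlock("then")
    else_block = BasicBlock("else")
    join = BasicBlock("join")

    header.instructions = list(cond_instrs) + ["ConditionalBranch"]
    header.successors = [then_block, else_block]  # [taken, not taken]

    # Both arms end in a jump to the join block; an 'empty' arm is only the jump.
    for block, instrs in ((then_block, then_instrs), (else_block, else_instrs)):
        block.instructions = list(instrs) + ["UnconditionalJump"]
        block.successors = [join]

    return header, then_block, else_block, join


def predecessors(block, blocks):
    """All blocks that have an edge into `block`, in the order given."""
    return [b for b in blocks if block in b.successors]


# if (cond) { x = 1 } with no else in the source
blocks = build_if(["cmp"], ["x = 1"])
header, then_block, else_block, join = blocks
join_preds = [b.name for b in predecessors(join, blocks)]
```

<p>Here <code>join_preds</code> names the then-block and the else-block, regardless of whether the source program wrote an else.</p>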
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2012/02/01/the-good-ir-other-control-flow-structures/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transforming Heuristics</title>
		<link>http://www.cogitolingua.net/blog/2012/02/01/transforming-heuristics/</link>
		<comments>http://www.cogitolingua.net/blog/2012/02/01/transforming-heuristics/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 21:51:53 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[Ideas]]></category>
		<category><![CDATA[Math]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=1406</guid>
		<description><![CDATA[<p>Many of the real problems in the world are NP-hard. Things like Scheduling, Register Allocation, Routing packages, etc. In solving these really hard problems, we invent heuristics. Typically such heuristics are specific to the problem domain. For example, UPS might exploit certain characteristics of the geographical layout of the country; they face a certain subset [...]]]></description>
			<content:encoded><![CDATA[<p>Many of the real problems in the world are NP-hard. Things like Scheduling, Register Allocation, Routing packages, etc. In solving these really hard problems, we invent heuristics. Typically such heuristics are specific to the problem domain. For example, UPS might exploit certain characteristics of the geographical layout of the country; they face a certain subset of all possible graphs, and can exploit those features.</p>
<p>But we know that the NP-complete problems are all reducible to each other. So, don&#8217;t the heuristics transform as well?</p>
<p>That is, given a heuristic that works well for the Knapsack problem, how well does that same heuristic (transformed) work on the Travelling Salesman Problem?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2012/02/01/transforming-heuristics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Good IR</title>
		<link>http://www.cogitolingua.net/blog/2012/01/27/the-good-ir/</link>
		<comments>http://www.cogitolingua.net/blog/2012/01/27/the-good-ir/#comments</comments>
		<pubDate>Fri, 27 Jan 2012 09:13:19 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[design]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=1278</guid>
		<description><![CDATA[<p>After parsing, each function can be represented as a ControlFlowGraph of BasicBlocks. Each BasicBlock holds a list of Instructions. For illustrative purposes Instructions are colored Blue and BasicBlocks are colored Orange.</p> <p>I&#8217;d now like to address the question: What should the ConditionalBranch instruction at the end of the If-Header point to?</p> Option Illustration Pro Con [...]]]></description>
			<content:encoded><![CDATA[<p>After parsing, each function can be represented as a ControlFlowGraph of BasicBlocks. Each BasicBlock holds a list of Instructions.  For illustrative purposes Instructions are colored Blue and BasicBlocks are colored Orange.</p>
<p>I&#8217;d now like to address the question: What should the ConditionalBranch instruction at the end of the If-Header point to?</p>
<table>
<tr>
<th>Option</th>
<th>Illustration</th>
<th>Pro</th>
<th>Con</th>
</tr>
<tr>
<td>Point at the following BasicBlock</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/CFG_option3.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/CFG_option3.png" alt="" title="CFG_option3" width="181" height="200" class="alignnone size-full wp-image-1297" /></a>
</td>
<td>
<ol>
<li>Convenient: the target BasicBlock is available during parse.</li>
</ol>
</td>
<td>
<ol>
<li>Redundant: the If-Header block has a ControlFlowGraph edge with the same destination.</li>
<li>Non-Uniform: it&#8217;s unsettling to have some instructions (such as Add/Sub/Mul/Div) point at other Instructions, but then deal with ConditionalBranch as a special case.</li>
</ol>
</td>
</tr>
<tr>
<td>Point at the first Instruction of the following BasicBlock.</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/CFG_option2.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/CFG_option2.png" alt="" title="CFG_option2" width="181" height="200" class="alignnone size-full wp-image-1296" /></a>
</td>
<td>
<ol>
<li>Uniform: Instructions point only to other Instructions.</li>
</ol>
</td>
<td>
<ol>
<li>Redundant: The If-Header block has a ControlFlowGraph edge with the same destination.</li>
<li>Problematic: Can you guarantee that all target blocks have at least one instruction to point at?</li>
</ol>
</td>
</tr>
</table>
<p>The common drawback of both of these proposals is Redundancy. At some point, during a future optimization pass, we might like to transform the ControlFlowGraph. If such a transformation requires that we update both the BasicBlock&rarr;BasicBlock edges <em>and</em> the targets of ConditionalBranch Instructions, then we are more likely to have a bug. We must put forth extra effort in keeping both kinds of link synchronized.</p>
<p>We can avoid this common drawback by adding a layer of indirection/abstraction. Instead of forcing the ConditionalBranch instruction to maintain a pointer to its target, we can access the target through a function. That function can query the instruction&#8217;s own BasicBlock about the outgoing ControlFlowGraph edges, and return either the target BasicBlock or the target BasicBlock&#8217;s first Instruction. By making this abstraction, not only do we avoid the extra effort of keeping redundant links synchronized, but we also promote uniformity in the design through the introduction of a new invariant: <b>Only BasicBlocks carry information about the ControlFlowGraph; no other edges are allowed between BasicBlocks</b>.</p>
<table>
<tr>
<th>Option</th>
<th>Illustration</th>
<th>Pro</th>
<th>Con</th>
</tr>
<tr>
<td>ConditionalBranch queries its BasicBlock.</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/CFG_option1.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/CFG_option1.png" alt="" title="CFG_option1" width="181" height="200" class="alignnone size-full wp-image-1295" /></a>
</td>
<td>
<ol>
<li>Uniform: The only links across/between BasicBlocks are control flow edges held by BasicBlocks.</li>
<li>Convenient: Do not have to synchronize redundant information.</li>
</ol>
</td>
<td>
<ol>
<li>Performance: A function call is required when asking a ConditionalBranch for its targets.</li>
</ol>
</td>
</tr>
</table>
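<p>To make the indirection concrete, here is a small Python sketch. It is an illustration under assumed names (<code>append</code>, <code>targets</code>, and the block names are all hypothetical), not the actual compiler code. The branch stores no target pointers; it asks its BasicBlock each time.</p>

```python
class BasicBlock:
    """A CFG node; its successor list is the only record of CFG edges."""

    def __init__(self, name):
        self.name = name
        self.instructions = []
        self.successors = []

    def append(self, instr):
        # Each instruction remembers which block owns it.
        instr.block = self
        self.instructions.append(instr)


class ConditionalBranch:
    """Stores no target pointers; it asks its BasicBlock instead."""

    def __init__(self):
        self.block = None

    def targets(self):
        taken, not_taken = self.block.successors
        return taken, not_taken


header = BasicBlock("if-header")
then_block = BasicBlock("then")
else_block = BasicBlock("else")
branch = ConditionalBranch()
header.append(branch)
header.successors = [then_block, else_block]

before = branch.targets()

# A later pass retargets one edge; only the block's edge list is edited.
replacement = BasicBlock("then-optimized")
header.successors[0] = replacement
after = branch.targets()
```

<p>After the edge is rewritten, <code>branch.targets()</code> reports the new block without the branch itself ever being touched, which is the synchronization effort the design avoids.</p>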
<p>Now that we have decided that ConditionalBranch will return information about the targets via a function call, we visit the question: Is the branch target the successor BasicBlock or is it the successor BasicBlock&#8217;s first Instruction?</p>
<p>To answer this question, I&#8217;m going to assume that the Intermediate Representation can also be interpreted. Such a design has the advantage that we can verify the structure of the IR by interpreting it after construction and between transformation/optimization passes.</p>
<p>Let&#8217;s assume a simple interpreter of the form:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">Instruction i <span style="color: #000080;">=</span> function.<span style="color: #007788;">firstInstruction</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #0000ff;">while</span> <span style="color: #008000;">&#40;</span><span style="color: #000040;">!</span>i.<span style="color: #007788;">isEnd</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span>
    i <span style="color: #000080;">=</span> i.<span style="color: #007788;">evaluate</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>This interpreter exploits polymorphic dispatch on the Instruction hierarchy. We can express the loop concisely because we follow the invariant: <code>evaluate</code> always returns the next instruction to be executed. This is analogous to setting the program counter (<code>pc</code>), but avoids passing the <code>pc</code> as either a function parameter (which most <code>evaluate</code> implementations will ignore) or as a global variable (yuck!). Importantly, this loop has no concept of BasicBlocks; it evaluates only Instructions.</p>
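<p>A runnable Python sketch of that loop, restricted to straight-line code. The <code>Const</code> and <code>Add</code> instructions, the <code>following</code> link, the <code>chain</code> helper, and the <code>env</code> dictionary holding variable values are all assumptions made for this illustration; only the shape of the loop comes from the snippet above.</p>

```python
class Instruction:
    """Base class: evaluate() returns the next Instruction to execute."""

    def __init__(self):
        self.following = None  # hypothetical link to the next instruction

    def is_end(self):
        return False

    def evaluate(self, env):
        raise NotImplementedError


class Const(Instruction):
    """Store a constant into a named variable."""

    def __init__(self, dest, value):
        super().__init__()
        self.dest, self.value = dest, value

    def evaluate(self, env):
        env[self.dest] = self.value
        return self.following


class Add(Instruction):
    """Add two named variables and store the sum."""

    def __init__(self, dest, lhs, rhs):
        super().__init__()
        self.dest, self.lhs, self.rhs = dest, lhs, rhs

    def evaluate(self, env):
        env[self.dest] = env[self.lhs] + env[self.rhs]
        return self.following


class End(Instruction):
    def is_end(self):
        return True


def chain(*instrs):
    """Link instructions in order and return the first one."""
    for current, nxt in zip(instrs, instrs[1:]):
        current.following = nxt
    return instrs[0]


def run(first):
    """The interpreter loop: dispatch on the instruction, follow what it returns."""
    env = {}
    i = first
    while not i.is_end():
        i = i.evaluate(env)
    return env


result = run(chain(Const("a", 2), Const("b", 3), Add("c", "a", "b"), End()))
```

<p>The loop body never inspects the instruction kind; each subclass decides what comes next, which is what lets control transfer instructions slot in later without changing <code>run</code>.</p>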
<p>So, in the interests of keeping the interpreter loop immaculately clean, we have only one choice: ConditionalBranch must return the first Instruction of the target BasicBlock. But, this answer contains a potential pitfall: Can we guarantee that all target BasicBlocks have at least one Instruction?</p>
<p>Consider the following example code, and associated ControlFlowGraph.</p>
<table>
<tr>
<th>Code</th>
<th>ControlFlowGraph</th>
</tr>
<tr>
<td>
<pre>
if (cond1) {
    // do something 1
} else {
    if (cond2) {
        // do something 2
    }
}
</pre>
</td>
<td>
<a href="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/Nested-If.png"><img src="http://www.cogitolingua.net/blog/wp-content/uploads/2012/01/Nested-If-300x233.png" alt="" title="Nested-If" width="300" height="233" class="alignnone size-medium wp-image-1328" /></a>
</td>
</tr>
</table>
<p>Notice that the blocks which do not have instructions in them have been left blank. Suppose that <code>cond1</code> is <code>false</code> and <code>cond2</code> is <code>false</code> so that the interpreter ends up following the chain of empty BasicBlocks. We notice that there is immediately some difficulty in having the interpreter transfer from <code>ConditionalBranch(cond2)</code> to the first instruction of the (empty) target BasicBlock. One possible solution comes to mind: Have <code>ConditionalBranch(cond2)</code> attempt to iterate the following blocks until it finds one with a first Instruction. Although this will work, it feels rather kludgy, and we should continue our search for a clean design.</p>
<p>Let&#8217;s analyze the situation in more detail. Specifically, let&#8217;s look at the last instruction of the ThenBlock. That block contains code for <code>do something 1</code>, which has been left unspecified to emphasize the fact that it can be completely arbitrary. However, (because we are gifted with knowledge of assembly) we have some advance foresight: when the instructions are finally emitted from the CodeGenerator in a linear stream, we must insert an UnconditionalJump which bypasses the code for the ElseBranch (assuming one exists), and lands the program counter at the first Instruction of the JoinBlock.</p>
<p>In our IR interpreter, the last Instruction of the code inside the ThenBlock can be arbitrary, so the interpreter will have some difficulty detecting that control should be transferred to the following JoinBlock. We can alleviate this difficulty by analogy to our foresight, and introduce an UnconditionalJump in the last BasicBlock of the ThenBranch. As with the ConditionalBranch, the UnconditionalJump will rely on the ControlFlowGraph edge (we are guaranteed only 1) of its BasicBlock to determine the following instruction during interpretation.</p>
<p>We now have two situations of BasicBlocks which end in a kind of control transfer Instruction:</p>
<ol>
<li>If-Headers and Loop-Headers, which end in a ConditionalBranch Instruction.</li>
<li>The Last BasicBlock of a ThenBranch, which ends in an UnconditionalJump.</li>
</ol>
<p>The explicit control transfer in these situations allows the interpreter to easily determine the next instruction, even though it lies in a different BasicBlock. That is, in these two cases, the interpreter does not need to know about the existence of BasicBlocks. Due to this advantage, we should try to arrange for one of these two situations to apply to our current concern: the empty BasicBlocks in the ElseBranch.</p>
<p>Only one of the two previous instructions applies: the UnconditionalJump. It certainly doesn&#8217;t hurt the semantics of the ControlFlowGraph to insert an UnconditionalJump at every edge (even fallthrough edges). So we can safely coin the invariant: <b>All BasicBlocks end with a control transfer Instruction</b> (ConditionalBranch or UnconditionalJump). This invariant unifies our design, and allows the interpreter to iterate only over Instructions.</p>
<p>Additionally, we now also have at least one Instruction (the UnconditionalJump) in every BasicBlock! So, we can positively guarantee that a ConditionalBranch is able to return the first Instruction of its target BasicBlock. Indeed, <em>all</em> control transfer Instructions are able to do likewise.</p>
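<p>Putting the pieces together, here is a Python sketch of the nested-if example interpreted under both invariants. The block layout, the <code>Assign</code> instruction, the <code>env</code> dictionary, and the trace returned by <code>run</code> are assumptions made for this illustration. With <code>cond1</code> and <code>cond2</code> both false, control passes through the &#8216;empty&#8217; blocks purely by following their UnconditionalJumps.</p>

```python
class BasicBlock:
    def __init__(self, name, *instrs):
        self.name = name
        self.instructions = []
        self.successors = []  # CFG edges live only here
        for instr in instrs:
            instr.block = self
            self.instructions.append(instr)


class Instruction:
    block = None

    def is_end(self):
        return False


class Assign(Instruction):
    def __init__(self, name, value):
        self.name, self.value = name, value

    def evaluate(self, env):
        env[self.name] = self.value
        # Straight-line code: continue with the next instruction in this block.
        position = self.block.instructions.index(self)
        return self.block.instructions[position + 1]


class ConditionalBranch(Instruction):
    def __init__(self, cond_name):
        self.cond_name = cond_name

    def evaluate(self, env):
        taken, not_taken = self.block.successors
        target = taken if env[self.cond_name] else not_taken
        return target.instructions[0]  # safe: every block has an instruction


class UnconditionalJump(Instruction):
    def evaluate(self, env):
        (target,) = self.block.successors  # exactly one outgoing edge
        return target.instructions[0]


class End(Instruction):
    def is_end(self):
        return True


def run(entry, env):
    """Interpret from the entry block; return the kinds of instructions run."""
    trace = []
    i = entry.instructions[0]
    while not i.is_end():
        trace.append(type(i).__name__)
        i = i.evaluate(env)
    return trace


outer_header = BasicBlock("if-header-1", ConditionalBranch("cond1"))
outer_then = BasicBlock("then-1", Assign("x", 1), UnconditionalJump())
inner_header = BasicBlock("if-header-2", ConditionalBranch("cond2"))
inner_then = BasicBlock("then-2", Assign("x", 2), UnconditionalJump())
inner_else = BasicBlock("else-2", UnconditionalJump())  # 'empty' block
inner_join = BasicBlock("join-2", UnconditionalJump())  # 'empty' block
outer_join = BasicBlock("join-1", End())

outer_header.successors = [outer_then, inner_header]
outer_then.successors = [outer_join]
inner_header.successors = [inner_then, inner_else]
inner_then.successors = [inner_join]
inner_else.successors = [inner_join]
inner_join.successors = [outer_join]

trace = run(outer_header, {"cond1": False, "cond2": False})
```

<p>No instruction ever has to skip over a block in search of a first Instruction; the two &#8216;empty&#8217; blocks each contribute one UnconditionalJump to the trace.</p>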
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2012/01/27/the-good-ir/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Project Course in Web Services</title>
		<link>http://www.cogitolingua.net/blog/2012/01/23/project-course-in-web-services/</link>
		<comments>http://www.cogitolingua.net/blog/2012/01/23/project-course-in-web-services/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 04:10:29 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[Education]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=1260</guid>
		<description><![CDATA[<p>I&#8217;ve just finished reading Philip Greenspun&#8217;s experience report, Teaching Software Engineering, which details a project course in building Web Services. Even though I personally hate the Web&#8217;s architecture (but that&#8217;s a rant for some other time), it still remains THE most influential and convenient place to showcase one&#8217;s work. It&#8217;s also convenient for shopping, [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve just finished reading Philip Greenspun&#8217;s experience report, <a href="http://philip.greenspun.com/teaching/teaching-software-engineering">Teaching Software Engineering</a>, which details a project course in building Web Services. Even though I personally hate the Web&#8217;s architecture (but that&#8217;s a rant for some other time), it still remains THE most influential and convenient place to showcase one&#8217;s work.  It&#8217;s also convenient for shopping, learning, participating in niche communities, etc.  The Web has real business value, and is therefore un-ignorable.</p>
<p>Based on his experience with the course, Greenspun had some nice quotes, which pertain to thoughts I&#8217;ve been building recently as I&#8217;ve been digging into the question: &#8216;What makes education valuable?&#8217;.</p>
<h3>Building Real-World Skill</h3>
<blockquote><p>
We&#8217;d like our students to be able to take vague and ambitious specifications and turn them into a system design that can be built and launched within a few months, with the most important-to-users and easy-to-develop features built first and the difficult bells and whistles deferred to a second version. We&#8217;d like our students to know how to test prototypes with end-users and refine their application design once or twice within even a three-month project.</p>
<p>For every project in 6.916 Classic, we insisted on having a client. This is a person who can describe desired capabilities for an information system but offers no hint as to how to build it. The best clients are people who are in fact passionate about some sort of Internet service and completely clueless about all matters technical. Good sources of clients are dotcom CEOs, MBA students, non-profit organization directors, and university administrators.</p>
<p>we invited alumni who were working as professional software engineers to return to campus on Tuesdays and Thursday evenings to coach students during the 6 hours of supervised laboratory time per week. There are perhaps 10 alumni out there for every current MIT undergraduate.</p>
<p>The best projects were ones with clients who had the wherewithal to extend and maintain the service after the course is over, possibly by hiring the students who built it.
</p></blockquote>
<p>A plethora of useful circumstances are brought into alignment: Students are challenged in the same underspecified way that they&#8217;d face in a real job. That challenge is met by performing fast, iterative development, à la XP or Agile. They get to interact with an actual customer: deriving project worth from satisfying the client, and building experience with client rejection and other realistic bumps.  Finally, they get to forge connections in a business network, which can help them when entering the job market.</p>
<h3>Student Learning</h3>
<blockquote><p>
Software engineering is a craft and can only be learned by practice.</p>
<p>Our experience [with producing five complete internet service projects] contrasts with typical software engineering courses in which a student builds only one application (or a piece of one application) during the entire semester. Research on simple word association tasks has demonstrated that people who learned to perform quickly but not accurately would have remarkably good recall even months later and, with a bit of practice, could always be made to perform accurately. Whereas people who were slow but accurate forgot all of their skills within a month or two.
</p></blockquote>
<p>Just another example where <a href="http://www.codinghorror.com/blog/2008/08/quantity-always-trumps-quality.html">Quantity trumps Quality</a>.  People learn by actually practicing and experimenting. They do not learn by listening to lectures. Learning can be reinforced by discussion and analysis, by questioning and tweaking.</p>
<h3>Building a Portfolio</h3>
<blockquote><p>
At the end of the semester, a student in 6.916 could look back upon four or five completed Internet services. The first ones that he or she built had been done for the problem sets. They won&#8217;t have been complex. They may not have been built to a very high standard of polish. But their existence enabled nearly all students to become fluent in the arts of designing a data model, specifying a page flow, and implementing the designed system in SQL and a procedural language.</p>
<p>At the end of the semester we drill into the students&#8217; heads the cold hard facts of the world: nobody owes them attention. We have each student group prepare an overview page that is a single HTML document, with a few screen shots, that demonstrates the major functions of the Internet service that they&#8217;ve built. Visit <a href="http://philip.greenspun.com/seia/gallery/">http://philip.greenspun.com/seia/gallery/</a> to see these pages.</p>
<p>Finally, it has been fun to watch our students graduate and go onto the job market. During job interviews they are able to point their interviewer to the URL of the running Web service that they developed during 6.916. Oftentimes, the student-built service is more sophisticated and is running on a more reliable infrastructure than most of the Internet applications launched on the public Internet by the interviewer&#8217;s company!
</p></blockquote>
<p>This is where I&#8217;ll have to distinguish my school: The lessons are online, but the workshop lets you build your <a href="http://www.codinghorror.com/blog/2004/10/a-programmers-portfolio.html">Programmer&#8217;s Portfolio</a>.</p>
<h3>Reaching for the Sky</h3>
<blockquote><p>
Universities have long taught theoretical methods for dealing with concurrency and transactions. The Internet raises new challenges in these areas. A dozen users may simultaneously ask for the same airline seat. Twenty responses to a discussion forum question may come in simultaneously. The radio or hardwired connection to a user may be interrupted halfway through an attempt to register at a site.</p>
<p>In the second problem set (&#8220;reservation system&#8221;), students built a collaborative conference room scheduling system. This raises the problem of concurrency in a natural manner. Every student can understand that you don&#8217;t want to book two people into a room at overlapping times.</p>
<p>Third, because all of the projects have a predictable shape we&#8217;ll be able to introduce distributed computing challenges merely by having students offer services to each other.
</p></blockquote>
<blockquote><p>
Students said that the &#8220;metadata&#8221; problem set was very valuable for speeding work on their projects. Students were asked to build a knowledge management system by writing a computer program to write all of the computer programs. I.e., we gave them a machine-readable language for representing the system capabilities and user experience and asked them to write a program to generate the SQL data model and then the scripts to support the user experience.
</p></blockquote>
<blockquote><p>
In the final exercise of the problem set, we ask the students to mark certain rooms as requiring fees. Users who wish to book those rooms must supply a credit card number. At MIT we hook up the servers to a live merchant account at CyberCash. Thus our better students will be able to open their credit card statements in the middle of the semester and discover a few dollars in charges made by their own Web server.
</p></blockquote>
<p>UPDATE: Greenspun also has a <a href="http://blogs.law.harvard.edu/philg/2007/08/23/improving-undergraduate-computer-science-education/">quick bullet list</a> of all the lessons learned, and an outline of the course&#8217;s structure.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2012/01/23/project-course-in-web-services/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Separating the Wheat from the Chaff</title>
		<link>http://www.cogitolingua.net/blog/2012/01/11/separating-the-wheat-from-the-chaff/</link>
		<comments>http://www.cogitolingua.net/blog/2012/01/11/separating-the-wheat-from-the-chaff/#comments</comments>
		<pubDate>Thu, 12 Jan 2012 00:54:23 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[Education]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=1186</guid>
		<description><![CDATA[<p>From Coding Horror: Separating Programming Sheep from Non-Programming Goats I learned of a paper, The camel has two humps, which describes a test that allows teachers to differentiate students likely to do well studying computer science from those who will likely never &#8216;get it&#8217;.</p> <p>This paper sounds awfully similar to the physics conceptual test mentioned [...]]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://www.codinghorror.com/blog/2006/07/separating-programming-sheep-from-non-programming-goats.html">Coding Horror: Separating Programming Sheep from Non-Programming Goats</a> I learned of a paper, <a href="http://www.cs.mdx.ac.uk/research/PhDArea/saeed/">The camel has two humps</a>, which describes a test that allows teachers to differentiate students likely to do well studying computer science from those who will likely never &#8216;get it&#8217;.</p>
<p>This paper sounds awfully similar to the physics conceptual test mentioned on the American RadioWorks program, <a href="http://americanradioworks.publicradio.org/features/tomorrows-college/lectures/">Don&#8217;t Lecture Me</a>, that uses questions with little to no arithmetic to probe students for their level of understanding. The physicists don&#8217;t use their test to cull the herd, but rather to let the professor assess which concepts the students haven&#8217;t yet assimilated. Astoundingly, the computer science paper proclaims:</p>
<blockquote><p>
We point out that programming teaching is useless for those who are bound to fail and pointless for those who are certain to succeed.
</p></blockquote>
<p>Evidence for this statement comes from low retention rates in computer science departments (30&#8211;60% fail the first programming course). But that observation doesn&#8217;t necessarily mean the students are &#8216;bound to fail&#8217;. Rather, I see it as evidence that the current methodology, lecturing, has not only failed to transmit information to the students, but also leads to such criminally dismal expectations.</p>
<p>One of the more promising experiences related in the paper demonstrates that we have to match education techniques to our learning experience. We should only use those tools which leverage how the brain learns.</p>
<blockquote><p>
Programming teachers, being programmers and therefore formalists, are particularly prone to the ‘deductive fallacy’, the notion that there is a rational way in which knowledge can be laid out, through which students should be led step-by-step. One of us even wrote a book [8] which attempted to teach programming via formal reasoning. Expert programmers can justify their programs, he argued, so let’s teach novices to do the same! The novices protested that they didn’t know what counted as a justification, and Bornat was pushed further and further into formal reasoning. After seventeen years or so of futile effort, he was set free by a casual remark of Thomas Green’s, who observed “people don’t learn like that”, introducing him to the notion of inductive, exploratory learning.
</p></blockquote>
<p>Programmers, both professional and beginner, spend much time debugging. Much language research is therefore devoted to finding and building systems that either (a) help to prevent the writing of bugs (e.g. static type systems) or (b) help to discover them once written. In that light, this observation is interesting:</p>
<blockquote><p>
Thomas Green put forward the notion of <em>cognitive dimensions</em> to characterise programming languages and programming problems [12]. &#8230; He is able to measure the difficulty levels of different languages (some are much worse than others) and even of particular constructs in particular languages. If-then-else is good, for example, if you want to answer the question “what happened next?” but bad if the question is “why did that happen?”, whereas Dijkstra’s guarded commands are precisely vice-versa.
</p></blockquote>
<p>That is, one form of language construct, if-then-else, is easy for a beginner to write in, while the other form, guarded commands, aids debugging.</p>
<p>The difficulty in learning a skill such as programming lies principally in forming a <a href="http://www.amazon.com/Mental-Models-Cognitive-Science-Johnson-Laird/dp/0674568826"><em>mental model</em></a> of how the computer executes the program. Forming an accurate model implies both the ability to describe your problem as a program and the ability of expert programmers to justify their programs. This understanding of how comprehension works drives the construction of the conceptual test, both in physics and computer science.</p>
<p>The paper has a very important finding:</p>
<blockquote><p>
&#8230; in the first administration [students with no prior experience programming] they divided into three distinct groups with no overlap at all:</p>
<ul>
<li>44% used the same model for all, or almost all, of the questions. We call this the <em>consistent</em> group.</li>
<li>39% used different models for different questions. We call this the <em>inconsistent</em> group.</li>
<li>The remaining 8% refused to answer all or almost all of the questions. We call this the <em>blank</em> group.</li>
</ul>
<p>&#8230;<br />
Remarkably, it is the consistent group, and almost exclusively the consistent group, that is successful.<br />
&#8230;<br />
It has taken us some time to dare to believe in our own results. It now seems to us, although we are aware that at this point we do not have sufficient data, and so it must remain a speculation, that what distinguishes the three groups in the first test is their different attitudes to meaninglessness. Formal logical proofs, and therefore programs – formal logical proofs that particular computations are possible, expressed in a formal system called a programming language – are utterly meaningless. To write a computer program you have to come to terms with this, to accept that whatever you might want the program to mean, the machine will blindly follow its meaningless rules and come to some meaningless conclusion. In the test the consistent group showed a pre-acceptance of this fact: they are capable of seeing mathematical calculation problems in terms of rules, and can follow those rules wheresoever they may lead. The inconsistent group, on the other hand, looks for meaning where it is not. The blank group knows that it is looking at meaninglessness, and refuses to deal with it.
</p></blockquote>
<p>The test, which teases out these speculations, involves the notion of assignment. It varies the ordering of the statements, and the variable names are allowed to carry misleading implications about the underlying values. Many possible answers are given for each question, so as to identify the mental model that a student might have used. Each question also comes with a space for free response or side-work. For example,</p>
<blockquote>
<pre>
1. Read the following statements and tick the correct answer in the front column.

int a = 10;
int b = 20;

a = b;

The new values of a and b are:

[ ] a = 30  b = 0
[ ] a = 30  b = 20
[ ] a = 20  b = 0
[ ] a = 20  b = 20
[ ] a = 10  b = 10
[ ] a = 10  b = 20
[ ] a = 20  b = 10
[ ] a = 0   b = 20
if none, give the correct values:
    a =        b =
</pre>
</blockquote>
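<p>To see how a ticked answer can point back to a mental model, here is a rough sketch in Python. Each plausible reading of <code>a = b</code> is written as a small function from the old values to the predicted ones. The model names are my own illustrative labels, and the list is an assumption; the paper&#8217;s own catalogue of models may well differ.</p>

```python
# Hypothetical mental models of the statement "a = b".
# Each model maps the old (a, b) to the (a, b) a student holding it would predict.
MODELS = {
    "copy right-to-left (actual semantics)": lambda a, b: (b, b),
    "move right-to-left": lambda a, b: (b, 0),
    "copy left-to-right": lambda a, b: (a, a),
    "add right-to-left, keep b": lambda a, b: (a + b, b),
    "add right-to-left, clear b": lambda a, b: (a + b, 0),
    "swap": lambda a, b: (b, a),
    "no change": lambda a, b: (a, b),
}


def predictions(a, b):
    """Return the answer each model predicts for 'a = b'."""
    return {name: model(a, b) for name, model in MODELS.items()}


def models_matching(answer, a=10, b=20):
    """List the models consistent with a ticked answer."""
    return [name for name, model in MODELS.items() if model(a, b) == answer]


for name, result in predictions(10, 20).items():
    print(f"{name}: a = {result[0]}  b = {result[1]}")
```

<p>A student who ticks answers matching the same function across many questions would count as consistent, even if that function is not the one the language actually implements.</p>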
<p>So the open question is: Is it possible for a teacher to help the <em>inconsistent</em> students learn a <em>consistent</em> mental model? If so, then we shouldn&#8217;t give up; we should bring the learning techniques which make it possible into the classroom. I conjecture that those techniques will also help the students with consistent models correct the bugs in their mental model faster. So dispensing with lecture, clearly shown not to work, and replacing it with newer, more effective techniques derived from cognitive science will benefit us all.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2012/01/11/separating-the-wheat-from-the-chaff/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Business as an Investment</title>
		<link>http://www.cogitolingua.net/blog/2011/12/29/business-as-an-investment/</link>
		<comments>http://www.cogitolingua.net/blog/2011/12/29/business-as-an-investment/#comments</comments>
		<pubDate>Thu, 29 Dec 2011 07:27:58 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Comp*]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Ideas]]></category>
		<category><![CDATA[Investment]]></category>
		<category><![CDATA[Tech*]]></category>
		<category><![CDATA[business]]></category>
		<category><![CDATA[start-up]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=1104</guid>
		<description><![CDATA[<p>I finished my reading of Mike Maloney&#8217;s Guide to Investing in Gold and Silver, partially to get an idea of how he got started in the business of bullion. He&#8217;s actually had several businesses throughout his life, including one where he designed &#8220;stereo amplification electronics were selected as one of five permanent exhibits at the [...]]]></description>
			<content:encoded><![CDATA[<p>I finished my reading of Mike Maloney&#8217;s Guide to Investing in Gold and Silver, partially to get an idea of how he got started in the business of bullion. He&#8217;s actually had several businesses throughout his life, including one where he designed &#8220;stereo amplification electronics [that] were selected as one of five permanent exhibits at the royal Victoria &#038; Albert Museum in London&#8221;[<a href="http://wealthcycles.com/about/michael-maloney">WealthCycles</a>]. The last chapter contained what I was looking for. Starting with a goal to accumulate high-cash-flow apartments, he decided (based on research) to invest in the gold and silver cycle as it was building momentum (~2001). He also realized that further leverage could be obtained in the gold and silver mining company stocks, and in starting a business that would do well during the cycle. Promoting the book, and joining Robert Kiyosaki&#8217;s team are icing on the cake.</p>
<p>Clearly, he didn&#8217;t position himself without some self-education. He had good reasons (the stock market was languishing) to uncover information about the next cycle. As a practiced entrepreneur he knew both how to form and promote the new business: it was really only a question of figuring out which business would be the most profitable. He&#8217;s now quite passionate about the data he&#8217;s collected, and about helping others to profit from the information.</p>
<p>But you don&#8217;t achieve that kind of success without some up-front costs and research, together with the tenacity to carry through on the plan.</p>
<p>Recently, I&#8217;ve been trying to figure out the best way to use my existing capital (education about programming, dedication to reading/learning more, and passion for clearly explaining it to others) to build myself a stable future. On the one hand I could get a regular job either as a programmer at a large tech company (producing more for them than I receive in salary) or as an instructor at a college/university (collecting considerably less in salary). But neither of these options gives me the autonomy I desire. Besides which, I think that Khan Academy has shown us that a revolution in education is afoot.</p>
<p>So, my current plan is to find a way of effectively educating people about programming: to provide them with the skills that allow them to join the class of highly compensated professional programmers. If I can uncover a mechanism that scales, so that revenues are less a function of the time I spend talking and more a function of the skills instilled in others, then I think I can build a stable, reliable income. The mechanism that scales well seems to be short self-contained videos about language features and design patterns accompanied by an XP workshop to build the interpersonal skills and practice.</p>
<p>What I learned from Mike: Building a business on the boom cycle leverages your gains.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2011/12/29/business-as-an-investment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Learning the Abstractions</title>
		<link>http://www.cogitolingua.net/blog/2011/11/15/learning-the-abstractions/</link>
		<comments>http://www.cogitolingua.net/blog/2011/11/15/learning-the-abstractions/#comments</comments>
		<pubDate>Tue, 15 Nov 2011 07:47:00 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Ideas]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=878</guid>
		<description><![CDATA[<p>Since much of programming is about creating and manipulating abstractions, it figures that a large amount of education is going to be about learning those abstractions. Things like classic data structures, the useful sloppiness of O-notation for algorithm analysis, and Design Patterns. But how should we introduce these things to our students?</p> <p>Should you teach [...]]]></description>
			<content:encoded><![CDATA[<p>Since much of programming is about creating and manipulating abstractions, it figures that a large amount of education is going to be about learning those abstractions. Things like classic data structures, the useful sloppiness of O-notation for algorithm analysis, and Design Patterns. But how should we introduce these things to our students?</p>
<p>Should you teach the abstraction first, because that&#8217;s what you wind up working with as a programmer? This risks disengagement and confusion. Without a clear concept of <em>what</em> is being abstracted, it&#8217;s hard to understand and pay attention. It&#8217;s difficult to see the benefit.</p>
<p>Should you teach the old methodology first? This risks building bad habits into the students. They typically feel resentful when you force them through muck and then reveal the pristine diamond-studded gold-brick road they could have used instead.</p>
<p>It&#8217;s important to know not only <em>what</em> is being abstracted, but also <em>how</em> and why.</p>
<p>Joel Spolsky, in his post <a href="http://www.joelonsoftware.com/articles/LeakyAbstractions.html">The Law of Leaky Abstractions</a>, notes (emphasis added):</p>
<blockquote><p>
The law of leaky abstractions means that whenever somebody comes up with a wizzy new &#8230; tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying &#8220;learn how to do it manually first, then use the wizzy tool to save time.&#8221; [...] tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. <b>So the abstractions save us time working, but they don&#8217;t save us time learning.</b><br />
And all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder.
</p></blockquote>
<p>Clearly, this indicates that the old methodology should be worked through first. Then students should, in a discussion section of some kind, discuss that methodology as a &#8216;code smell&#8217;, working out (with some guidance) a better solution for themselves. By following this path you motivate the understanding through a previously worked example (which can be referred to during the discussion) and can clearly identify the leaks as you work toward the abstraction. Any bad coding habits picked up during the initial slog can be eradicated by revisiting the assignment and applying the new abstraction. There shouldn&#8217;t be resentment, because it&#8217;s framed as a way of applying some cool new pattern just learned.</p>
<p>I wonder&#8230; I haven&#8217;t seen anybody teach virtual dispatch and dynamic polymorphism from first principles in this manner. Usually that abstraction is granted as a built-in of the language being learned. Would it be useful to hand-hold the students through designing a virtual method table? Could you find a motivating enough example? First doing it some crufty way, then identifying the VMT as a pattern, and finishing by refactoring the initial assignment.</p>
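<p>As a rough idea of where such an exercise might end up, here is a hand-rolled method table sketched in Python. Objects are plain dictionaries, and every name (<code>send</code>, <code>CIRCLE_VTABLE</code>, and so on) is hypothetical. The crufty first version would presumably be a chain of type-tag checks inside every function; the table is what falls out when that chain is recognized as a pattern.</p>

```python
# A hand-rolled virtual method table, assuming objects are plain dicts.
# All names here are hypothetical and exist only for illustration.
import math


def circle_area(self):
    return math.pi * self["radius"] ** 2


def square_area(self):
    return self["side"] ** 2


def shape_describe(self):
    # Shared by both "classes": it dispatches through the table again.
    return f"{self['vtable']['name']} with area {send(self, 'area'):.2f}"


# One table per "class": a class name plus method name -> function entries.
CIRCLE_VTABLE = {"name": "Circle", "area": circle_area, "describe": shape_describe}
SQUARE_VTABLE = {"name": "Square", "area": square_area, "describe": shape_describe}


def new_circle(radius):
    return {"vtable": CIRCLE_VTABLE, "radius": radius}


def new_square(side):
    return {"vtable": SQUARE_VTABLE, "side": side}


def send(obj, method, *args):
    """Dynamic dispatch: look the method up in the object's table, then call it."""
    return obj["vtable"][method](obj, *args)


for shape in (new_circle(1.0), new_square(2.0)):
    print(send(shape, "describe"))
```

<p>The caller never asks what kind of shape it holds; adding a new kind means adding a new table, not editing every function.</p>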
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2011/11/15/learning-the-abstractions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Object Oriented Compiler Architecture</title>
		<link>http://www.cogitolingua.net/blog/2011/11/12/object-oriented-compiler-architecture/</link>
		<comments>http://www.cogitolingua.net/blog/2011/11/12/object-oriented-compiler-architecture/#comments</comments>
		<pubDate>Sun, 13 Nov 2011 00:48:52 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[compiler design]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=842</guid>
		<description><![CDATA[<p>Out of some curiosity, I quickly read this paper today:</p> <p>The Object-Oriented Architecture of High-Performance Compilers, Lutz Hamel, Diane Meirowitz and Spiro Michaylov, Technical Report, Thinking Machines Corporation, 1996.</p> <p>Even though they assume that the compiler&#8217;s internal representation is some kind of AST, the underlying lessons are worth communicating, and are applicable even if your [...]]]></description>
			<content:encoded><![CDATA[<p>Out of some curiosity, I quickly read this paper today:</p>
<p><a href="http://homepage.cs.uri.edu/faculty/hamel/pubs/oo-compiler-arch.pdf">The Object-Oriented Architecture of High-Performance Compilers</a>, Lutz Hamel, Diane Meirowitz and Spiro Michaylov, Technical Report, Thinking Machines Corporation, 1996.</p>
<p>Even though they assume that the compiler&#8217;s internal representation is some kind of AST, the underlying lessons are worth communicating, and are applicable even if your compiler has a block or sea-of-nodes representation.</p>
<p>First, we should separate the data structure of the IR from the algorithms (optimization passes and code transforms) which manipulate the IR. The Visitor pattern allows us to do this in the most generic way possible. This allows the algorithms to differ in their order of traversal. The only constraint is that they must adhere to the IR&#8217;s API, which should be kept very small. Since learning the Visitor pattern, I&#8217;ve been thinking that the phases of a compiler make it a Meta-Visitor or a Visitor of Visitors. It&#8217;s nice to see that viewpoint is shared by others.</p>
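<p>For readers who haven&#8217;t met the pattern, a minimal sketch in Python follows. The toy expression IR and the two visitors are assumptions made for illustration, not the design from the paper. The point is only that the node classes expose a tiny API (<code>accept</code>) while each algorithm lives in its own class.</p>

```python
# A minimal Visitor sketch over a toy expression IR (all class names hypothetical).
class Const:
    def __init__(self, value):
        self.value = value

    def accept(self, visitor):
        return visitor.visit_const(self)


class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def accept(self, visitor):
        return visitor.visit_add(self)


class Evaluator:
    """One algorithm: compute the value of the expression."""

    def visit_const(self, node):
        return node.value

    def visit_add(self, node):
        return node.left.accept(self) + node.right.accept(self)


class Printer:
    """Another algorithm over the same IR; it controls its own traversal."""

    def visit_const(self, node):
        return str(node.value)

    def visit_add(self, node):
        return f"({node.left.accept(self)} + {node.right.accept(self)})"


tree = Add(Const(1), Add(Const(2), Const(3)))
print(tree.accept(Printer()), "=", tree.accept(Evaluator()))
```

<p>Because each visitor decides when to call <code>accept</code> on the children, one pass can walk the tree top-down and another bottom-up without the node classes changing.</p>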
<p>Second, attributes (such as use-def chains) can be added to the IR separately, so that they are not visible outside of the transformations which require them. They propose attaching attributes to IR nodes by a templatized AttributeManager class, backed by an associative array. I would prefer if the Node class itself was extended (via subclass or decorator) so that it had a (possibly templatized) <code>setAttr(...)</code> method. However, we should still keep it possible to add and remove attributes &#8220;on-the-fly&#8221; so that they do not stick around past their usefulness, and so that they may be created on demand. I&#8217;m not sure how best to handle updating the attributes when the IR is transformed (e.g. lifting code from a loop might change the nesting count attribute). Perhaps a call-back mechanism to update or just invalidate that attribute&#8217;s manager.</p>
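<p>One way the call-back idea might look, sketched in Python: a side table keyed by node computes an attribute on demand, and nodes notify registered listeners when they are transformed. The class and method names are hypothetical and are not the paper&#8217;s API.</p>

```python
# Side-table attributes with invalidation (a sketch; names are hypothetical).
class AttributeManager:
    """Keeps one attribute for IR nodes outside the nodes themselves."""

    def __init__(self, compute):
        self._compute = compute   # function: node -> attribute value
        self._values = {}         # id(node) -> cached value (nodes must stay alive)

    def get(self, node):
        key = id(node)
        if key not in self._values:       # created on demand
            self._values[key] = self._compute(node)
        return self._values[key]

    def invalidate(self, node=None):
        """Drop one cached value, or all of them when no node is given."""
        if node is None:
            self._values.clear()
        else:
            self._values.pop(id(node), None)


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.listeners = []   # callbacks run when the node is transformed

    def move_to(self, new_parent):
        self.parent = new_parent
        for callback in self.listeners:
            callback(self)


def nesting_depth(node):
    depth = 0
    while node.parent is not None:
        node = node.parent
        depth += 1
    return depth


depth_of = AttributeManager(nesting_depth)
outer = Node("outer_loop")
inner = Node("inner_loop", parent=outer)
stmt = Node("x = y + 1", parent=inner)
stmt.listeners.append(depth_of.invalidate)

print(depth_of.get(stmt))   # computed on first request
stmt.move_to(outer)         # hoist out of the inner loop; the callback invalidates
print(depth_of.get(stmt))   # recomputed after the transform
```

<p>Invalidating is cheaper to get right than updating in place, at the cost of recomputing the attribute the next time a pass asks for it.</p>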
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2011/11/12/object-oriented-compiler-architecture/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Strong Typing for Security</title>
		<link>http://www.cogitolingua.net/blog/2011/11/11/strong-typing-for-security/</link>
		<comments>http://www.cogitolingua.net/blog/2011/11/11/strong-typing-for-security/#comments</comments>
		<pubDate>Fri, 11 Nov 2011 08:13:11 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[Ideas]]></category>
		<category><![CDATA[Information Flow]]></category>
		<category><![CDATA[Language]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=831</guid>
		<description><![CDATA[<p>I got into a mild argument about static vs. dynamic typing. I recognize that static typing can be verbose to the point of being repetitious. Take Java generics for example:</p> List&#60;String&#62; astr = new ArrayList&#60;String&#62;&#40;&#41;; <p>There really isn&#8217;t a great reason why the compiler can&#8217;t infer the type of the variable on the right hand [...]]]></description>
			<content:encoded><![CDATA[<p>I got into a mild argument about static vs. dynamic typing. I recognize that static typing can be verbose to the point of being repetitious. Take Java generics for example:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">List<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span> astr <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ArrayList<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>There really isn&#8217;t a great reason why the compiler can&#8217;t infer the type of the variable from the right-hand side of the assignment. C# already implements type inference for this case, and C++ is <a href="http://en.wikipedia.org/wiki/C%2B%2B11#Type_inference">adding it</a>. ML and Haskell are strongly typed and have practiced type inference since their inception. So we should actually dismiss the verbosity objection to static typing right now, because it&#8217;s an artifact of implementation that the more popular languages, C++ and Java, represent really poor examples of what could otherwise be a really good thing.</p>
<p>In my opinion, a static typing system is actually a proof system over your code. We shouldn&#8217;t complain about having compiler errors, rather we should rejoice that the compiler is able to automatically detect cases where we were ambiguous or tried to do something ill-defined. We should try to write our code so that the compiler can tell us when we make a mistake. Really, we want to express as many constraints as possible, so that the machine can do more checking and we end up with less buggy code. Statically typing all our variables and expressing systematic constraints is an effort that pays off in spades for large code bases.</p>
<p>But couldn&#8217;t we all just use the more flexible dynamic typing languages, and catch the bugs with testing? In my opinion, no. Testing should be done anyway, but it isn&#8217;t enough to prove the absence of a bug. Only a proof checker, such as a static typing system, can come close to doing that. I think I can really drive this point home by examining web applications.</p>
<h4>The Problem</h4>
<p>Web applications are really glorified string processors. HTTP requests come in as strings, and web pages are emitted as strings. JavaScript processes more strings in the page layout, potentially requesting even more information from the server in response to user-generated events. Forums, Social Networking, and other participatory applications allow for user generated content. This widespread and popular practice actually leaves our glorified string processor (web app) at risk: for, if we are not careful, a malicious user can supply a string which, if it appears in the &#8216;wrong&#8217; context, might be interpreted as legitimate JavaScript code by the application. That is, malicious users can execute arbitrary code, with the same rights and privileges as the application itself. This vulnerability is known as <a href="">Cross-Site Scripting (XSS)</a>.</p>
<p>So, we find ourselves writing a string processor which must deal with strings of various encodings, special characters, and escape conventions: namely HTML, JavaScript, XML, CSS, and URL. If one of these strings (even from our own database) manages to arrive in a context without first going through a filter to sanitize it, then our application has a security vulnerability. Do you think that it&#8217;s possible to write test cases (or even auto-generate them) given all the code paths, all the different sources (user, cookie, url, database, etc) and all the contexts in which a string might appear? In my opinion, the exponential complexity makes testing an infeasible approach. What we really need, then, is a proof system to verify that no strings end up in the wrong context.</p>
<h4>Static Typing to the Rescue</h4>
<p>If we are willing to go back to our application and examine it in detail, we find that we should really be treating each of the above strings as different types. HtmlString should be a different type from JSString, which are again both different from UrlString. Simply expressing each context as a different type enables our static typing system to verify that we never use the wrong kind of string in the wrong context. We can also provide explicit conversion functions, which provide the proper escaping and sanitization when moving from one context to another.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">void</span> addToDocument<span style="color: #008000;">&#40;</span>HtmlString hStr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
HtmlString fromURL<span style="color: #008000;">&#40;</span>UrlString uStr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
UnsafeString HttpRequest<span style="color: #008000;">&#40;</span>UrlString uStr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></pre></div></div>
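<p>To make the shape of the idea concrete, here is a small runnable sketch in Python. The class and function names are hypothetical, loosely mirroring the declarations above. Python only enforces the check at run time, so the <code>isinstance</code> test stands in for what a compiler (or a static checker reading the annotations) would reject before the program ever ran.</p>

```python
# Distinct string types with explicit conversions (a sketch; all names are
# hypothetical and only loosely mirror the declarations above).
import html
from urllib.parse import quote


class UnsafeString:
    """Raw text from a user, cookie, or database; not yet safe anywhere."""

    def __init__(self, text):
        self.text = text


class HtmlString:
    """Text that has been escaped for placement in an HTML document."""

    def __init__(self, markup):
        self.markup = markup


class UrlString:
    """Text that has been percent-encoded for use as a URL component."""

    def __init__(self, encoded):
        self.encoded = encoded


def html_from_unsafe(s: UnsafeString) -> HtmlString:
    return HtmlString(html.escape(s.text))


def url_from_unsafe(s: UnsafeString) -> UrlString:
    return UrlString(quote(s.text, safe=""))


def add_to_document(document: list, fragment: HtmlString) -> None:
    # Run-time stand-in for the compile-time check a static type system gives.
    if not isinstance(fragment, HtmlString):
        raise TypeError("add_to_document requires an HtmlString")
    document.append(fragment.markup)


page = []
# \x3c and \x3e are the angle brackets of a script tag supplied by a user.
comment = UnsafeString("\x3cscript\x3ealert(1)\x3c/script\x3e")
add_to_document(page, html_from_unsafe(comment))
print(page[0])
```

<p>Passing <code>comment</code> straight to <code>add_to_document</code> raises a <code>TypeError</code>; the only route into the page is through a conversion function that escapes the text.</p>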

<h4>Language Support</h4>
<p>What&#8217;s most unfortunate about this approach is that neither C++ nor Java provides us with an easy way to distinguish two strings. We certainly don&#8217;t want to use C&#8217;s <code>typedef</code>, because that enables automatic coercion between the different kinds of string, which defeats the point. So, we&#8217;re forced into creating a separate class for each of these strings, including implementing all the operators that make for convenient string manipulation. I&#8217;d really love a language that would allow me to extend my existing string type without fully re-implementing everything, yet still be able to treat the extension as a completely different type.</p>
<h4>Conclusion</h4>
<p>Essentially we&#8217;re using the static typing as a proof system to constrain our programming practices. The static type verification provides a proof that we never use a string in the wrong context. In my opinion, this coding technique is of enormous benefit, and represents a use-case that dynamic typing + unit testing simply cannot approach.</p>
<p>The real trick is recognizing that two strings aren&#8217;t necessarily the same type.</p>
<p>Just for reference, I did not come up with this example myself.<br />
Joel Spolsky <a href="http://www.joelonsoftware.com/articles/Wrong.html">advocates</a> using Hungarian notation, which I think is too weak for solving security vulnerabilities.<br />
Tom Moertel provides an <a href="http://blog.moertel.com/articles/2006/10/18/a-type-based-solution-to-the-strings-problem">implementation</a> of this approach in Haskell.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2011/11/11/strong-typing-for-security/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Unit Testing in Education</title>
		<link>http://www.cogitolingua.net/blog/2011/09/29/unit-testing-in-education/</link>
		<comments>http://www.cogitolingua.net/blog/2011/09/29/unit-testing-in-education/#comments</comments>
		<pubDate>Thu, 29 Sep 2011 08:11:57 +0000</pubDate>
		<dc:creator>erich</dc:creator>
				<category><![CDATA[Comp*]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Ideas]]></category>

		<guid isPermaLink="false">http://www.cogitolingua.net/blog/?p=784</guid>
		<description><![CDATA[<p>In my readings of Extreme Programming, one thing I&#8217;m struggling to pick up on is Unit Testing. Mostly, it&#8217;s the pain of writing tests for each thing I can imagine going wrong. Partly it&#8217;s an imaginary horror: I think about the full extent of testing all at once. I think about exhaustively testing my program&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>In my readings of Extreme Programming, one thing I&#8217;m struggling to pick up on is Unit Testing. Mostly, it&#8217;s the pain of writing tests for each thing I can imagine going wrong. Partly it&#8217;s an imaginary horror: I think about the full extent of testing all at once. I think about exhaustively testing my program&#8217;s behavior. But in reality, writing the tests is not that bad. You don&#8217;t sit down and write a hundred unit tests for every little thing you can think of. Rather, you test (and program) incrementally. So, my overactive paranoia about what I have to test doesn&#8217;t represent the reality of testing. Instead of imagining everything at once, I&#8217;ll have to train myself to focus on the small, achievable, easily testable things.</p>
<p>Once I got over that and actually started to write my tests, I noticed some other effects. Writing the tests before the code actually gives me a good idea of how I&#8217;ll want to interact with the code. The very act of hypothetically using it gives me a more concrete picture of what I want to accomplish. The burden of writing the test helps to keep the complexity down. You end up with a better mental model, cleaner architecture, and fewer dependencies.</p>
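<p>A tiny illustration of that effect, sketched in Python with a hypothetical <code>Stack</code> class: the three test functions were written first, and they pin down how the class is meant to be used. The class itself is then just the least code that makes them pass.</p>

```python
# Test-first sketch: the tests describe the interface of a hypothetical Stack;
# the class is the smallest implementation that satisfies them.
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


def test_new_stack_is_empty():
    assert Stack().is_empty()


def test_pop_returns_last_pushed():
    stack = Stack()
    stack.push(1)
    stack.push(2)
    assert stack.pop() == 2


def test_pop_on_empty_raises():
    try:
        Stack().pop()
    except IndexError:
        return
    raise AssertionError("expected IndexError")


for test in (test_new_stack_is_empty, test_pop_returns_last_pushed,
             test_pop_on_empty_raises):
    test()
print("all tests passed")
```

<p>Each test is small enough to write in a minute, which is exactly the incremental rhythm that makes the imagined mountain of tests disappear.</p>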
<p>Ok. Great. So unit testing my code has nice benefits. But can it be applied elsewhere? Say, in education?</p>
<p>Well certainly you could <em>teach</em> unit testing philosophy. You could drill into your students the virtues of unit testing their code. But that&#8217;s not what I mean by <em>applying</em> unit testing to education. I mean, that we should look at the education process itself, and see where unit testing fits.</p>
<p>Most classes consist of some mix of the following:</p>
<ul>
<li>Lecture and Discussion.</li>
<li>Exams. Quizzes, Midterms, and Final.</li>
<li>Assignments. These can be Projects (individual and group), Research Papers, Homework drills, etc. Work that the students are responsible for doing, and which they hand in for a grade.</li>
</ul>
<p>Looking at these components, we should ask ourselves: How effective is each piece in contributing to the total outcome? How does each add to a student&#8217;s memory and experience? How well does each help the student learn? When we follow these questions an interesting picture begins to emerge.</p>
<p>Lecture is typically used to demonstrate some of the concepts through pretty pictures, compelling narrative, and highlighting of important things to remember. Discussion sections are typically used to get the students somewhat involved with the material. That is, we expect active participation in discussion vs passive participation in lecture. Assignments are typically used to give the students practice with the material. They are contrived to highlight certain aspects of the field, or certain skill sets that have been found useful in actual research. They are a way to artificially build experience.</p>
<p>Exams and quizzes stand apart in that they don&#8217;t necessarily contribute to student learning. Rather, they are used as a metric for assessing how much knowledge the student has absorbed, or how well they are able to assemble that knowledge for solving a problem. These devices are pedagogical unit tests. But, unlike unit tests I write for my code, they are exercised last, not first. They aren&#8217;t exercised often, and they don&#8217;t really follow iterative development.</p>
<p>So, now that we&#8217;ve identified a component of our educational system that&#8217;s analogous to the extreme programming methodology, how do we improve our classroom techniques to actually <em>use</em> the same methodology we want to teach?</p>
<p>Maybe the tests should come first. Instead of having final exams as an exit criterion, we should have entrance exams as a placement/enrollment criterion. I&#8217;m not quite sure how well this would work, but it should improve the relationship between teacher and student. No longer are the teacher and student in adversarial positions about the grading outcome.</p>
<p>Quizzes are a great way to measure incremental progress, and many instructors already use them as an appropriate feedback mechanism about the material. In fact, a first-day quiz about &#8216;what do you expect to get from this course&#8217; is often a great way for an instructor to customize the material to student learning objectives. Just as programmers use unit tests to pinpoint software errors, we can use quizzes to pinpoint conceptual errors where they occur.</p>
<p>The XP methodology seems to suggest that a comprehensive test is not the way to go. Rather, it&#8217;s better to have a greater number of smaller quizzes which test specific items. It also suggests running tests repeatedly. For humans, too much repetition can become monotonous, but just enough definitely improves retention. We also want to make sure that the most recent material learned doesn&#8217;t supplant material covered at the beginning of the course.</p>
<p>Finally, and most radically, XP suggests writing your own tests. I think it would be a great world if students became responsible for their own learning outcome. And what better way to motivate them, than by feeding back their own &#8216;what if&#8217;s on the tests? Coming up with good questions is actually very difficult, and provides an altogether new way of viewing the material. It forces a re-conceptualization of what was learned, providing yet another avenue of engagement. I think it will also viscerally demonstrate that the students aren&#8217;t helpless sheep waiting to be watered with a font of knowledge from the professor. If we can find a way to incorporate this radical step, I think it would help to form a cycle of learning (ask questions, analyze answers, repeat!) and give our students practice with the self-reliance they will need in the real world.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cogitolingua.net/blog/2011/09/29/unit-testing-in-education/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

