


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Final//EN">
    
<HTML>

 <HEAD>
 <link REL="STYLESHEET" TYPE="text/css" HREF="index.css">
 <TITLE>C to English to C in Perl </TITLE>
 </HEAD>

 <BODY BGCOLOR = "#006600"
       TEXT    = "#FFFFFF" 
       VLINK   = "yellow" 
       LINK    = "#FFFFFF" >

<h1><a name="top">C to English to C in Perl</a></h1>
 

<table width="99%" cellpadding=2 cellspacing=0 align="top"
   border=0  >
 <tr>
  <td valign="top" bgcolor="#22bb22" >
     
<table height="99%" width="50%"     
    cellpadding=0 cellspacing=0 align="top"
    border="0" >


<TR>
<td valign="left" colspan="2" >
<a href="faq.html" >
FAQ </a>
</td>


    </tr>

<TR>
<td valign="left" colspan="2" >
<a href="perl.html" >
Perl stuff </a>
</td>

</tr>
<tr><td valign="top" >
<img src="graphics/tablearrow3.gif"
height="25" width="25">	
</td>
<td valign="top">
		
<table height="99%" width="99%"     
    cellpadding=0 cellspacing=0 align="top"
    border="0" >


<TR>
<td valign="left"  >
<a href="mason.html" >
Mason stuff </a>
</td>


    </tr>

<TR>
<td valign="left"  >
<a href="tk.html" >
Tk stuff </a>
</td>


    </tr>

<TR>
<td valign="left"  >
<img align="right" src="graphics/stara.gif" alt="*">
<a href="RecDescent.html" >
RecDescent<BR>stuff </a>
</td>


    </tr>

</table>






		</td>

    </tr>

<TR>
<td valign="left" colspan="2" >
<a href="cstuff.html" >
C stuff </a>
</td>


    </tr>

<TR>
<td valign="left" colspan="2" >
<a href="randomness.html" >
Randomness </a>
</td>


    </tr>

<TR>
<td valign="left" colspan="2" >
<a href="punditry.html" >
Punditry </a>
</td>


    </tr>

<TR>
<td valign="left" colspan="2" >
<a href="links.html" >
Links </a>
</td>


    </tr>

</table>





 
  </td>
  <td bgcolor="#006600"  
valign="top">


<p>
C2Eng2C, a.k.a. DeCSS and ReCSS, a.k.a. 
the soon-to-come First Amendment File System,
is available from these parts.
A few reviews:
</p>
<blockquote>
It ain't Hemingway, but at least it's pronounceable.
</blockquote>
<p>--Tim O'Reilly.</p>
<blockquote>
Wow.  It's like "What if ANSI C was designed by the COBOL committee?".
</blockquote>
<p>--Nathan Torkington.</p>
<p>
So, you've heard of what I'm doing.
You'd like to give it a go.
Excellent. 
Obtaining and using Decss/Recss:
</p>
<p>

<a href="http://www.mit.edu/~ocschwar/decss2.pl">
http://www.mit.edu/~ocschwar/decss2.pl
</a><br>
also available as 
<a href="http://www.mit.edu/~ocschwar/c2eng">
http://www.mit.edu/~ocschwar/c2eng
</a><br>
<a href="http://www.mit.edu/~ocschwar/recss.pl">
http://www.mit.edu/~ocschwar/recss.pl
</a><br>
also available as 
<a href="http://www.mit.edu/~ocschwar/eng2c">
http://www.mit.edu/~ocschwar/eng2c
</a><br>
<a href="http://www.mit.edu/~ocschwar/RCS/">
Here is the RCS archive for the scripts. Serious
users will want a look at it.</a>
</p>

<p>
Also, you will need to go to CPAN for
<a href="http://www.cpan.org/modules/by-authors/Damian_Conway/Parse-RecDescent-1.79.tar.gz">
Parse::RecDescent</a> and 
<a href="http://search.cpan.org/search?dist=Lingua-EN-Numbers-Ordinate">
Lingua::EN::Numbers::Ordinate</a>.
You will need Perl 5.005 or later.
Or, if you just want to see what it does, take a look
at <a href="http://www.mit.edu/~ocschwar/demunck.c">
demunck.c</a> and 
<a href="http://www.mit.edu/~ocschwar/demunck.eng.fmt">
demunck.eng.</a>

</p>
<h2> What is this all about? </h2>

<p>
<a href="http://www.cs.cmu.edu/~dst/DeCSS/">
Oh, boy, where would I begin?</a> David Touretzky
explains it better. So does 
<a href="http://www.2600.com">
Emmanuel Goldstein.</a> Basically, a piece of open
source code is about
to be censored off the net, and I had to do something
about it. <a href="http://web.mit.edu/ocschwar/Toolpit/expression.html">
I also wrote an essay on the topic.</a> 
<a href="http://www.eff.org/pub/Privacy/ITAR_export/Bernstein_case/Legal/960726_filing/HTML/abelson_decl.html">
Hal Abelson also gave the issue some attention.</a>
I wrote an article for <a href="http://www.tpj.com">The Perl Journal</a>
about this, and a longer account of how I did it all
can be <a href="http://www.mit.edu/~ocschwar/c2eng.html">
found here.</a>
 </p>

<p>
c2eng is not the first program to do this.
For the Bernstein case, regarding encryption source
code and export restrictions thereon, 
<a href="http://personal.sip.fi/~lm/c2txt2c/">
a program called c2txt2c</a> was written  
by Leevi Marttila using Bison and flex.
I chose to write mine from scratch because
1. when I started out, c2txt2c came with a disclaimer
that it was only working for the Blowfish source code,
2. c2txt2c produces Dadaist sentences and is thus
in my view too facetious to persuade mundanes
with gavels, and 3. I don't know Bison.
</p>
<p>Jonathan Baccash, of Princeton University, wrote another
C to English demonstration, using SML/NJ. His translation
style is better, but his program doesn't attempt CPP directives.
In the future I aim to write a new version of c2eng 
that incorporates some of his style.
</p>

<h2>

Notes regarding the use of Eng2c and C2eng:
</h2>
<p>
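-1. Observe the irregular hashbang (#!) lines at the top of
both scripts. They're what I have to use here, so you will
probably need to change them to point at your own Perl.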
</p>

<p>
0. Both scripts just dump their output to STDOUT.
The way to use them is to do 
<i>c2eng foo.c > foo.eng</i>
and 
<i>eng2c foo.eng > foo.eng.c </i>
</p>

<p>
1. Both spew huge amounts of debugging output to
STDERR. Direct STDERR somewhere other than
the output you want. I needed it
for debugging, but you may find it an interesting
marker of the script's progress.
</p>
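<p>
A rough sketch of notes 0 and 1 together. The function
<i>fake_c2eng</i> is a made-up stand-in, not the real script, and
the text it prints is placeholder text rather than real C2eng output.
It is there only so the redirections can be tried without
installing anything. The file names are assumptions too.
</p>
<pre>
```shell
# Hypothetical stand-in for c2eng: result on STDOUT, chatter on STDERR.
fake_c2eng() {
    echo "parsing $1 ..." >&2
    echo "placeholder English text for $1"
}

# Scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# STDOUT goes to the .eng file, STDERR goes to a separate log.
fake_c2eng foo.c > "$workdir/foo.eng" 2> "$workdir/foo.log"

# Only the translation ends up in foo.eng.
cat "$workdir/foo.eng"
```
</pre>
<p>
With the real scripts the same two redirections apply. Sending
STDERR to <i>/dev/null</i> instead of a log file discards the
debugging output entirely.
</p>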
<p>

2. C2eng's output is not formatted much at all.
Luckily, most Unix machines have the fmt command,
which will give the output some line wrapping.
Eng2c is written not to discriminate between
newlines and other whitespace, so reformat to 
your heart's content!
</p>
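<p>
A small sketch of why rewrapping is harmless, assuming a
<i>fmt</i> that accepts <i>-w</i> (GNU and BSD versions both do).
The sample text is a placeholder, not real C2eng output, and
<i>squeeze</i> is a helper defined here for the comparison.
It checks that fmt changes only the whitespace, never the words.
</p>
<pre>
```shell
# Placeholder text standing in for unformatted C2eng output.
text='This   is a long    line of placeholder English text that fmt will wrap at a narrow width for the demonstration.'

# Wrap it at 40 columns.
wrapped=$(printf '%s\n' "$text" | fmt -w 40)

# Helper: collapse every run of whitespace (newlines included) to one space.
squeeze() { tr -s '[:space:]' ' ' | sed 's/ $//'; }

a=$(printf '%s\n' "$text" | squeeze)
b=$(printf '%s\n' "$wrapped" | squeeze)

# If the word sequences match, only whitespace was changed.
if [ "$a" = "$b" ]; then
    echo "same words, different whitespace"
fi
```
</pre>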
<p>

3. Eng2c's output is not indented at all. 
Luckily, most Unix machines have the indent command,
which will give the output some indenting.
Apropos: If you take foo.c and go through this sequence:
<pre>
0. indent -bacc -bad -bap -bbb -bc -bs -sob foo.c
1. c2eng foo.c > foo.eng
2. eng2c foo.eng > foo.eng.c
3. indent -bacc -bad -bap -bbb -bc -bs -sob foo.eng.c
4. diff -bwc foo.c foo.eng.c > big.diff
</pre>
With the indentation sufficiently exacting,
the diff file should only show differences
that indicate bugs in my script. If you run into any of those,
<a href="mailto:robert-recorde@mit.edu">email me, please.</a>
</p>
<p>

4. C2eng will take multiple input arguments
and concatenate them all into one big file,
assuming all of the files are parsed
correctly (this is not yet guaranteed).
So, you can do <i>
c2eng foo.c bar.c > foobar.eng</i>
and it will do the right thing.
</p>

<p>
5. Eng2c will soon have a reciprocal
capability. The result, coupled with
gzip and gunzip, will be a new kind of 
tarfile, defined by the First Amendment 
File System. You'll be able to do 
<i>zcat distribution.eng.gz | eng2c </i>
and unpack a whole source tree.
</p>
<p>

6. C2eng so far has shown that it can deal with 
comments and preprocessor directives between
major elements (function definitions, global variables, et cetera)
and between statements. When one of these interrupts
C code at a finer spot, C2eng will barf. I'm working on a fix
to that, but it will not be easy.

</p>
<h1> <a name="wishlist">A Call for Help</a></h1>
<p>
I have a wish list: 1. If some kind soul would
patch Eng2c to fill in item 5 in the list above,
I would be much obliged. Otherwise, I'll do it
Real Soon Now (TM). 2. For a harder project, I would
like some kind soul (after contacting me first) to
help me make C2eng and Eng2c more customizable for
other styles of translation. I think I've figured 
out the fastest way to do it. 
<a href="mailto:robert-recorde@mit.edu">
Email me, please.</a>




 </td>
</tr>
<tr>
  <td valign="bottom" bgcolor="#22bb22"    >
     <a href="http://www.mit.edu/~ocschwar/">
       <img border=0 align="left" src="graphics/housedk.gif" alt="Home">
     </a>
</td>
<td>
<a href="#top">
<img align="right" src="graphics/A_up.gif" alt="To top"
     height="20" width="20" border="0">
</a>
</td>


</tr></table>
<hr>


Omri Schwarz, 
May  14, 2001
<br>


</body>
</html>


