Python Buzz Forum - GardenSnake language

Andrew Dalke

Posts: 291
Nickname: dalke
Registered: Sep, 2003

Andrew Dalke is a consultant and software developer in computational chemistry and biology.

GardenSnake language

Posted: Aug 29, 2006 10:13 PM

This post originated from an RSS feed registered with Python Buzz by Andrew Dalke.
Original Post: GardenSnake language
Feed Title: Andrew Dalke's writings
Feed URL: http://www.dalkescientific.com/writings/diary/diary-rss.xml
Feed Description: Writings from the software side of bioinformatics and chemical informatics, with a heaping of Python thrown in for good measure.

I experimented with PyParsing but I couldn't figure out how to use it to parse an indentation-based language like Python. I gave up and tried PLY, which has an API very much like the SPARK library. After using it for a while now I prefer PLY over SPARK. Its error messages are better and its documentation was exactly right for me.

I looked around but could find no examples of how to use a lexer/parser pair to parse an indentation-based language. Python uses its own specialized tokenizer and parser designed for Python, and it didn't look easy to port to another parsing system. With some work I figured it out.

I ended up writing a filter (or rather three filters) between the PLY tokenizer and its parser. The tokenizer sees all newlines and whitespace but knows to ignore them when inside of (parens) and to return only leading whitespace. My filters watch the tokenizer output stream and tweak a flag so that the tokenizer can filter out non-leading whitespace.
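A minimal sketch of that whitespace filtering, written without PLY so it runs on its own. The Token record, the regular expression, and the function names are all hypothetical and are not taken from the GardenSnake source; the sketch also folds the flag handling into the filter instead of feeding it back to the tokenizer.

```python
import re
from collections import namedtuple

# Hypothetical token record, standing in for whatever the real lexer returns.
Token = namedtuple("Token", "type value")

TOKEN_RE = re.compile(r"""
    (?P<NEWLINE>\n)
  | (?P<WS>[ ]+)
  | (?P<NUMBER>\d+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>[()=,+\-*/:<>])
""", re.VERBOSE)


def raw_tokens(text):
    """Yield every token, including all whitespace and newlines."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character %r" % text[pos])
        yield Token(match.lastgroup, match.group())
        pos = match.end()


def drop_insignificant_whitespace(tokens):
    """Keep only leading whitespace, and nothing layout-related inside parens."""
    paren_depth = 0
    at_line_start = True
    for tok in tokens:
        if tok.type == "OP" and tok.value == "(":
            paren_depth += 1
        elif tok.type == "OP" and tok.value == ")":
            paren_depth -= 1

        if tok.type == "NEWLINE":
            if paren_depth == 0:      # newlines inside parens are ignored
                at_line_start = True
                yield tok
        elif tok.type == "WS":
            if at_line_start and paren_depth == 0:
                yield tok             # leading whitespace carries meaning
            at_line_start = False
        else:
            at_line_start = False
            yield tok


source = "ints = (1, 2,\n   3)\n  x\n"
kept = list(drop_insignificant_whitespace(raw_tokens(source)))
print([tok.type for tok in kept])
```

The continuation line inside the parentheses contributes no NEWLINE or WS tokens, while the two spaces in front of `x` survive as a WS token for a later indentation pass to inspect.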

What took the longest time was figuring out that there are three possible indentation states in Python: INDENT not allowed, INDENT may occur, INDENT required. I only had the first and last and without the middle one I couldn't come up with a set of conditions to make it work.
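The three states can be sketched as a small generator that turns leading whitespace into INDENT and DEDENT tokens. This is an illustration under assumptions, not the GardenSnake filter itself: tokens are plain (type, value) pairs, only leading whitespace appears as WS, the constant and function names are made up, and blank lines and comments get no special treatment.

```python
NO_INDENT, MAY_INDENT, MUST_INDENT = range(3)


def add_indentation_tokens(tokens):
    """Yield the input tokens with INDENT/DEDENT tokens inserted.

    `tokens` is an iterable of (type, value) pairs in which leading
    whitespace is ("WS", "    ") and each line ends with ("NEWLINE", "\n").
    """
    levels = [0]            # stack of indentation widths currently open
    state = NO_INDENT
    at_line_start = True
    depth = 0               # width of the current line's leading whitespace
    for type_, value in tokens:
        if type_ == "WS":
            depth = len(value)
            continue
        if type_ == "NEWLINE":
            if state == MAY_INDENT:
                # The line ended right after ':', so a block must follow.
                state = MUST_INDENT
            at_line_start = True
            depth = 0
            yield (type_, value)
            continue

        if at_line_start:
            if state == MUST_INDENT:
                if depth <= levels[-1]:
                    raise IndentationError("expected an indented block")
                levels.append(depth)
                yield ("INDENT", "")
            elif depth > levels[-1]:
                raise IndentationError("unexpected indent")
            else:
                while depth < levels[-1]:
                    levels.pop()
                    yield ("DEDENT", "")
                if depth != levels[-1]:
                    raise IndentationError("inconsistent dedent")
            at_line_start = False

        # After ':' an indent may follow; after anything else it may not.
        state = MAY_INDENT if (type_, value) == ("OP", ":") else NO_INDENT
        yield (type_, value)

    for _ in levels[1:]:    # close any blocks still open at end of input
        yield ("DEDENT", "")


# Tokens for:  if a:\n    b\n    if c: d\ne\n
demo = [
    ("NAME", "if"), ("NAME", "a"), ("OP", ":"), ("NEWLINE", "\n"),
    ("WS", "    "), ("NAME", "b"), ("NEWLINE", "\n"),
    ("WS", "    "), ("NAME", "if"), ("NAME", "c"), ("OP", ":"),
    ("NAME", "d"), ("NEWLINE", "\n"),
    ("NAME", "e"), ("NEWLINE", "\n"),
]
result = [type_ for type_, _ in add_indentation_tokens(demo)]
print(result)
```

The one-line `if c: d` is the case that needs the middle state: after the colon an indent is possible, but the statement on the same line cancels it, so no INDENT is demanded on the next line.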

I got the tokenizer mostly working using a trivial language. Python's grammar is a bit more complicated, so I decided to implement a subset of Python which captures most of the indentation cases. What I eventually did was use the parser rules to create a Python AST for this new language. Let Python be my back-end. Doing that found several flaws in my logic, which I hope are all now fixed.

I used Python's woefully underdocumented "compiler" module for this. I know it just well enough to use it but not well enough to help improve the documentation. There are parts of it which I do only because that's what other code does. (E.g., do I have to do syntax.check(tree)?)
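The "compiler" module existed only in Python 2 and was removed in Python 3, so the sketch below swaps in the standard `ast` module to show the same idea: build the tree by hand, then let Python compile and run it. It assumes Python 3.8 or later, and the function name and the "<gardensnake>" filename are made up for illustration.

```python
import ast


def compile_add(left, right):
    """Hand-build the AST for `result = left + right` and compile it."""
    assign = ast.Assign(
        targets=[ast.Name(id="result", ctx=ast.Store())],
        value=ast.BinOp(
            left=ast.Constant(value=left),
            op=ast.Add(),
            right=ast.Constant(value=right),
        ),
    )
    module = ast.Module(body=[assign], type_ignores=[])
    # Hand-built nodes carry no line numbers; compile() needs them.
    ast.fix_missing_locations(module)
    return compile(module, "<gardensnake>", "exec")


namespace = {}
exec(compile_add(2, 3), namespace)
print(namespace["result"])  # prints 5
```

A parser rule for a toy language would return nodes like these in place of the literal values used here, which is what makes Python itself the back-end.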

I decided to call the new language GardenSnake. It's a small snake you can play with. Here's the GardenSnake code with tokenizer, filters, parser, code generator and demo all in a single 695 line file.

Here are some bullet points about GardenSnake, from the comments at the top of the file:

only 'def', 'return' and 'if' statements
'if' only has 'then' clause (no elif nor else)
single-quoted strings only, content in raw format and encoded as "swapcase"
numbers are decimal.Decimal instances (not integers or floats)
no print statement; use the built-in 'print' function
only < > == + - / * implemented (and unary + -)
assignment and tuple assignment work
no generators of any sort
no ... well, no quite a lot

It wouldn't be hard to implement most of these. It's mostly a matter of time and figuring out how the compiler.ast is supposed to work. (Hint: use compiler.parse to create a correct parse tree for comparison.) But I'm satisfied with what I've done and don't plan to touch this code again.
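The comparison trick in the hint carries over to the modern `ast` module, where `ast.parse` plays the role of `compiler.parse`. The sketch below is an assumption-laden illustration (Python 3.9 or later, variable names invented here): it parses a reference expression and checks a hand-built tree against it.

```python
import ast

# Reference tree produced by Python's own parser.
reference = ast.parse("a+2*3", mode="eval")

# The tree a hand-written parser rule might build for the same expression.
hand_built = ast.Expression(
    body=ast.BinOp(
        left=ast.Name(id="a", ctx=ast.Load()),
        op=ast.Add(),
        right=ast.BinOp(
            left=ast.Constant(value=2),
            op=ast.Mult(),
            right=ast.Constant(value=3),
        ),
    )
)

# ast.dump() leaves out line and column attributes by default, so two
# trees with the same shape produce the same text.
print(ast.dump(reference))
same_shape = ast.dump(reference) == ast.dump(hand_built)
print("trees match:", same_shape)
```

When the two dumps differ, reading them side by side usually shows which node type or field the hand-built tree got wrong.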

Here's the demo program at the end of the file:


print('LET\'S TRY THIS \\OUT')
  
#Comment here
def x(a):
    print('called with',a)
    if a == 1:
        return 2
    if a*2 > 10: return 999 / 4
        # Another comment here

    return a+2*3

ints = (1, 2,
   3, 4,
5)
print('mutiline-expression', ints)

t = 4+1/3*2+6*(9-5+1)
print('predence test; should be 34+2/3:', t, t==(34+2/3))

print('numbers', 1,2,3,4,5)
if 1:
 8
 a=9
 print(x(a))

print(x(1))
print(x(2))
print(x(8),'3')
print('this is decimal', 1/5)
print('BIG DECIMAL', 1.234567891234567e12345)

With the runtime support for 'print', the output is:

--> let's try this \out
--> MUTILINE-EXPRESSION (Decimal("1"), Decimal("2"), Decimal("3"), Decimal("4"), Decimal("5"))
--> PREDENCE TEST; SHOULD BE 34+2/3: 34.66666666666666666666666667 True
--> NUMBERS 1 2 3 4 5
--> CALLED WITH 9
--> 249.75
--> CALLED WITH 1
--> 2
--> CALLED WITH 2
--> 8
--> CALLED WITH 8
--> 249.75 3
--> THIS IS DECIMAL 0.2
--> big decimal 1.234567891234567E+12345
Done
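Two of the rules listed earlier explain most of this output: string contents go through "swapcase", and every number is a decimal.Decimal. A rough sketch of that behaviour in plain Python follows. The helper names gs_string and gs_print are hypothetical, and the "--> " prefix is copied from the output shown, not from the GardenSnake runtime.

```python
from decimal import Decimal as D


def gs_string(body):
    """Swap the case of a string literal's contents."""
    return body.swapcase()


def gs_print(*args):
    """Print space-separated arguments behind a '--> ' marker."""
    line = "--> " + " ".join(str(arg) for arg in args)
    print(line)
    return line


# 4+1/3*2+6*(9-5+1), with every number a Decimal
t = D(4) + D(1) / D(3) * D(2) + D(6) * (D(9) - D(5) + D(1))
line1 = gs_print(gs_string("predence test; should be 34+2/3:"),
                 t, t == D(34) + D(2) / D(3))
line2 = gs_print(gs_string("this is decimal"), D(1) / D(5))
line3 = gs_print(gs_string("BIG DECIMAL"), D("1.234567891234567e12345"))
```

With binary floats the last literal would overflow to infinity, while Decimal keeps the exponent, which is why the "big decimal" line prints the number intact.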
