The Artima Developer Community

Python Buzz Forum
GardenSnake language

Andrew Dalke

Andrew Dalke is a consultant and software developer in computational chemistry and biology.
GardenSnake language Posted: Aug 29, 2006 10:13 PM

This post originated from an RSS feed registered with Python Buzz by Andrew Dalke.
Original Post: GardenSnake language
Feed Title: Andrew Dalke's writings
Feed URL: http://www.dalkescientific.com/writings/diary/diary-rss.xml
Feed Description: Writings from the software side of bioinformatics and chemical informatics, with a heaping of Python thrown in for good measure.

I experimented with PyParsing but I couldn't figure out how to use it to parse an indentation-based language like Python. I gave up and tried PLY, which has an API very much like the SPARK library. After using it for a while now I prefer PLY over SPARK. Its error messages are better and its documentation was exactly right for me.

I looked around but could find no examples of how to use a lex/parser pair to parse an indentation-based language. Python uses its own specialized tokenizer and parser designed for Python and it didn't look easy to port to another parsing system. With some work I figured it out.

I ended up writing a filter (or rather three filters) between the PLY tokenizer and its parser. The tokenizer sees all newlines and whitespace but knows to ignore them when inside of (parens) and to return only leading whitespace. My filters watch the tokenizer output stream and tweak a flag so that the tokenizer can filter out non-leading whitespace.
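A minimal sketch of the paren-tracking idea, not the GardenSnake code itself. It assumes tokens are plain (type, value) tuples rather than PLY token objects, and it does the dropping inside a filter instead of inside the tokenizer, which is a simplification. The names `drop_inside_parens`, `NEWLINE`, `WS`, `LPAR` and `RPAR` are made up for this illustration.

```python
def drop_inside_parens(tokens):
    """Yield tokens, skipping NEWLINE and WS tokens while inside parentheses.

    `tokens` is an iterable of hypothetical (type, value) tuples.
    """
    depth = 0  # number of "(" currently open
    for tok_type, value in tokens:
        if tok_type == "LPAR":
            depth += 1
        elif tok_type == "RPAR":
            depth = max(depth - 1, 0)
        elif depth > 0 and tok_type in ("NEWLINE", "WS"):
            continue  # layout inside parens carries no meaning
        yield tok_type, value


# Tokens for a two-line expression:
#   ints = (1,
#      2)
stream = [
    ("NAME", "ints"), ("ASSIGN", "="), ("LPAR", "("), ("NUMBER", "1"),
    ("COMMA", ","), ("NEWLINE", "\n"), ("WS", "   "), ("NUMBER", "2"),
    ("RPAR", ")"), ("NEWLINE", "\n"),
]
filtered = list(drop_inside_parens(stream))
print([tok_type for tok_type, _ in filtered])
```

Only the final NEWLINE, the one after the closing paren, survives, so the parser sees the whole tuple as a single logical line.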

What took the longest time was figuring out that there are three possible indentation states in Python: INDENT not allowed, INDENT may occur, INDENT required. I only had the first and last and without the middle one I couldn't come up with a set of conditions to make it work.
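One way to read the three states is as a small state machine driven by the token stream. The sketch below is an illustration under assumptions: tokens are hypothetical (type, value) tuples, the function name `mark_must_indent` is invented here, and the rule "a colon followed by a newline requires an indent" is an interpretation of the paragraph above rather than a copy of the GardenSnake source.

```python
NO_INDENT, MAY_INDENT, MUST_INDENT = range(3)


def mark_must_indent(tokens):
    """Yield (type, value, must_indent) for each (type, value) token.

    must_indent is True on the first real token after COLON NEWLINE,
    which is where an INDENT is required.
    """
    state = NO_INDENT
    for tok_type, value in tokens:
        must_indent = False
        if tok_type == "COLON":
            state = MAY_INDENT           # a block may start here
        elif tok_type == "NEWLINE":
            if state == MAY_INDENT:
                state = MUST_INDENT      # colon then newline: indent required
        elif tok_type == "WS":
            pass                         # leading whitespace keeps the state
        else:
            must_indent = state == MUST_INDENT
            state = NO_INDENT            # any real token resets the state
        yield tok_type, value, must_indent


# "if a: return 1" keeps the body on the same line, so no indent is needed.
inline = [("IF", "if"), ("NAME", "a"), ("COLON", ":"),
          ("RETURN", "return"), ("NUMBER", "1"), ("NEWLINE", "\n")]
inline_flags = [m for _, _, m in mark_must_indent(inline)]

# "if a:" followed by a newline requires the next real token to be indented.
block = [("IF", "if"), ("NAME", "a"), ("COLON", ":"), ("NEWLINE", "\n"),
         ("WS", "    "), ("RETURN", "return"), ("NUMBER", "1"), ("NEWLINE", "\n")]
block_flags = [(t, m) for t, _, m in mark_must_indent(block)]
```

Without the middle state there is no way to tell these two cases apart at the colon: the decision has to wait until the next token shows whether the body starts on the same line.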

I got the tokenizer mostly working using a trivial language. Python's grammar is a bit more complicated so I decided to implement a subset of Python which captures most of the indentation cases. What I eventually did was use the parser rules to create a Python AST for this new language. Let Python be my back-end. Doing that found several flaws in my logic, which I hope are all now fixed.

I used Python's woefully underdocumented "compiler" module for this. I know it just well enough to use it but not enough to help improve the documentation. There are parts of it which I do just because that's what other code does. (E.g., do I have to do syntax.check(tree)?)
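The "compiler" module exists only in Python 2. As a rough analogue, and explicitly a swapped-in technique rather than what GardenSnake does, the standard `ast` module in Python 3.8 or later can play the same back-end role: build the tree by hand, then let Python compile and run it. The filename string and the variable names are placeholders chosen for this sketch.

```python
import ast

# Hand-build the tree for:  result = 2 + 3 * 4
tree = ast.Module(
    body=[
        ast.Assign(
            targets=[ast.Name(id="result", ctx=ast.Store())],
            value=ast.BinOp(
                left=ast.Constant(value=2),
                op=ast.Add(),
                right=ast.BinOp(
                    left=ast.Constant(value=3),
                    op=ast.Mult(),
                    right=ast.Constant(value=4),
                ),
            ),
        )
    ],
    type_ignores=[],
)

# Hand-built nodes have no line numbers; the compiler needs them.
ast.fix_missing_locations(tree)
code = compile(tree, "<gardensnake-sketch>", "exec")

namespace = {}
exec(code, namespace)
print(namespace["result"])
```

A parser for a small language would produce nodes like these from its grammar rules instead of writing them out by hand.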

I decided to call the new language GardenSnake. It's a small snake you can play with. Here's the GardenSnake code with tokenizer, filters, parser, code generator and demo all in a single 695-line file.

Here are some bullet points about GardenSnake, from the comments at the top of the file:

  • only 'def', 'return' and 'if' statements
  • 'if' only has a 'then' clause (no elif or else)
  • single-quoted strings only, content in raw format and encoded as "swapcase"
  • numbers are decimal.Decimal instances (not integers or floats)
  • no print statement; use the built-in 'print' function
  • only < > == + - / * implemented (and unary + -)
  • assignment and tuple assignment work
  • no generators of any sort
  • no ... well, no quite a lot

It wouldn't be hard to implement most of these. It's mostly a matter of time and figuring out how the compiler.ast is supposed to work. (Hint: use compiler.parse to create a correct parse tree for comparison.) But I'm satisfied with what I've done and don't plan to touch this code again.
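The hint relies on the Python 2 "compiler" module. A rough modern equivalent, assuming Python 3's standard `ast` module in place of compiler.parse, is to dump the tree Python itself builds and compare it with the tree your own parser produces. The helper name `tree_shape` is made up for this sketch.

```python
import ast


def tree_shape(source):
    """Return a text dump of the tree Python builds for `source`."""
    return ast.dump(ast.parse(source))


# Layout and redundant parentheses do not change the tree...
same = tree_shape("t = 4+1/3*2") == tree_shape("t = 4 + ((1 / 3) * 2)")

# ...but grouping that overrides precedence does.
different = tree_shape("t = 4+1/3*2") != tree_shape("t = (4+1)/3*2")

# Inspect the dump to see which node types the reference tree uses.
print(tree_shape("t = 4+1/3*2"))
```

The exact dump text varies between Python versions, so it is more useful for reading by eye or for comparing two trees in the same interpreter than for hard-coding into tests.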

Here's the demo program at the end of the file:


print('LET\'S TRY THIS \\OUT')
  
#Comment here
def x(a):
    print('called with',a)
    if a == 1:
        return 2
    if a*2 > 10: return 999 / 4
        # Another comment here

    return a+2*3

ints = (1, 2,
   3, 4,
5)
print('mutiline-expression', ints)

t = 4+1/3*2+6*(9-5+1)
print('predence test; should be 34+2/3:', t, t==(34+2/3))

print('numbers', 1,2,3,4,5)
if 1:
 8
 a=9
 print(x(a))

print(x(1))
print(x(2))
print(x(8),'3')
print('this is decimal', 1/5)
print('BIG DECIMAL', 1.234567891234567e12345)

With the runtime support for 'print', the output is:
--> let's try this \out
--> MUTILINE-EXPRESSION (Decimal("1"), Decimal("2"), Decimal("3"), Decimal("4"), Decimal("5"))
--> PREDENCE TEST; SHOULD BE 34+2/3: 34.66666666666666666666666667 True
--> NUMBERS 1 2 3 4 5
--> CALLED WITH 9
--> 249.75
--> CALLED WITH 1
--> 2
--> CALLED WITH 2
--> 8
--> CALLED WITH 8
--> 249.75 3
--> THIS IS DECIMAL 0.2
--> big decimal 1.234567891234567E+12345
Done
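The numeric lines in the output follow from every GardenSnake number being a decimal.Decimal. The sketch below redoes the same arithmetic in plain Python, assuming the default decimal context (28 significant digits); the alias `D` and the variable names are only for this illustration.

```python
from decimal import Decimal

D = Decimal  # stand-in for how each numeric literal is treated

# t = 4+1/3*2+6*(9-5+1)
t = D(4) + D(1) / D(3) * D(2) + D(6) * (D(9) - D(5) + D(1))
same = t == D(34) + D(2) / D(3)

quarter = D(999) / D(4)               # the value x(a) returns when a*2 > 10
fifth = D(1) / D(5)                   # exact in decimal, unlike binary floats
big = D("1.234567891234567e12345")    # far outside the range of a float

print(t, same)
print(quarter, fifth, big)
```

Because Decimal works in base ten, 1/5 prints as exactly 0.2, and the very large literal keeps its exponent instead of overflowing as a float would.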


Copyright © 1996-2019 Artima, Inc. All Rights Reserved.