The Artima Developer Community

Python Buzz Forum
On interrupting application code

Dmitry Dvoinikov

Dmitry Dvoinikov is a software developer who believes that common sense is the best design guide
On interrupting application code Posted: Sep 10, 2012 11:38 PM

This post originated from an RSS feed registered with Python Buzz by Dmitry Dvoinikov.
Original Post: On interrupting application code
Feed Title: Things That Require Further Thinking
Feed URL: http://feeds.feedburner.com/ThingsThatRequireFurtherThinking
Feed Description: Once your species has evolved language, and you have learned language, [...] and you have something to say, [...] it doesn't take much time, energy and effort to say it. The hard part of course is having something interesting to say. -- Geoffrey Miller

So, I've recently released a new version of the Pythomnic3k framework. As usual, the changes that get incorporated are essentially answers to questions from real-life usage. There are a few in the current release too.

Now I'm thinking about what to do next. Among the things that still make me uncomfortable is the real-time guarantee, or, specifically in Pythomnic3k, a guarantee that every request will return by its deadline. It can return anything, a failure perhaps, as long as it doesn't hang.

The problem is, sometimes a developer writes something that may not look like an infinite loop but behaves like one. A situation where execution hits a

while True:
    pass

is a disaster in any architecture. It never returns, yet it has to be executed, so there goes a CPU. Even a perfect scheduler cannot decide that this code effectively does nothing and simply stop scheduling it.

In Python this is worse, because the global interpreter lock makes CPython effectively single-threaded for CPU-bound code. And even if that were not so, the next request hits the same spot, and before you know it your load average is over 26. A single mistake like this could therefore bring the entire service down.

There has to be a way of interrupting the application code.

And in Python there is. One thread can inject an exception into another thread's execution path, and as soon as the victim gets scheduled next, that exception is thrown. This is very easy:

PyThreadState_SetAsyncExc(other_thread, Exception)

As long as the application has a watchdog (and Pythomnic3k has one), the watchdog could interrupt worker threads using such an injection. I tried it and it worked.
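
From Python code, the C-level call above can be reached through ctypes. What follows is a minimal sketch, assuming CPython; the Interrupted class and the inject helper are hypothetical names made up for this illustration and are not part of Pythomnic3k.

```python
import ctypes
import threading
import time

class Interrupted(Exception):
    """Hypothetical exception type used for the injection."""

def inject(thread, exc_type):
    # PyThreadState_SetAsyncExc takes a thread id and an exception class.
    # It returns the number of thread states it modified (1 on success).
    modified = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type))
    return modified == 1

result = []

def worker():
    try:
        while True:      # the runaway loop from the example above
            pass
    except Interrupted:
        result.append("interrupted")

t = threading.Thread(target=worker, daemon=True)
t.start()
time.sleep(0.1)          # give the worker time to enter the loop

ok = inject(t, Interrupted)
t.join(timeout=5)
print(ok, t.is_alive(), result)
```

The exception is raised in the worker the next time it executes Python bytecode, which is why a pure-Python busy loop can be broken this way.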

Now other problems appear.

The first is a problem I could easily turn a blind eye to. Code that is executing an OS call cannot be interrupted this way. This is easy to see: while the call is in progress, the thread is outside the Python scheduler's reach. Therefore

time.sleep(86400)

will still return tomorrow.
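
This limitation can be observed with a much shorter sleep. The sketch below assumes CPython and uses a hypothetical Interrupted exception class; it injects the exception 0.1 seconds into a 0.5 second sleep and measures when the thread actually finishes.

```python
import ctypes
import threading
import time

class Interrupted(Exception):
    """Hypothetical exception type used for the injection."""

def sleeper():
    try:
        time.sleep(0.5)  # a short stand-in for time.sleep(86400)
    except Interrupted:
        pass             # normally reached only after the sleep returns

started = time.monotonic()
t = threading.Thread(target=sleeper, daemon=True)
t.start()
time.sleep(0.1)          # let the thread enter the sleep

# Ask CPython to raise Interrupted in the sleeping thread.
ctypes.pythonapi.PyThreadState_SetAsyncExc(
    ctypes.c_ulong(t.ident), ctypes.py_object(Interrupted))

t.join()
elapsed = time.monotonic() - started
print(f"thread finished after {elapsed:.2f}s")
```

The thread still takes the full half second to finish: the injection only marks the exception as pending, and nothing wakes the thread up from the OS-level wait.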

The second is the bigger problem of unpredictability. You don't know when or where the deadline, and hence the exception, hits you. This effectively means that no code can be considered exception-safe anymore.

As a framework author, I could protect sensitive fragments of code by essentially "disabling interrupts" for the moments when interruption would not be convenient. So I write something like this:

current_thread.can_interrupt = True
application_code()
current_thread.can_interrupt = False
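
One way such a flag might be packaged is as a context manager, so that it is reset even when the application code fails on its own. This is only a sketch: the interruptible and may_interrupt helpers are hypothetical, and the can_interrupt attribute is simply set on the thread object as in the fragment above.

```python
import threading
from contextlib import contextmanager

@contextmanager
def interruptible(thread):
    """Allow interruption only while the wrapped block runs."""
    thread.can_interrupt = True
    try:
        yield
    finally:
        # Runs on normal exit and when the wrapped block raises.
        thread.can_interrupt = False

def may_interrupt(thread):
    # What a hypothetical watchdog would consult before injecting.
    return getattr(thread, "can_interrupt", False)

me = threading.current_thread()
states = [may_interrupt(me)]
try:
    with interruptible(me):
        states.append(may_interrupt(me))
        raise ValueError("application failure")
except ValueError:
    pass
states.append(may_interrupt(me))
print(states)
```

Note that the watchdog's check and its injection are two separate steps, so the flag can still change between them. A flag like this narrows the window rather than closing it.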

That protects the framework, but not the application code. The same developer who wrote the infinite loop in the first place could very reasonably write something like this:

lock.acquire()
try:
    foo
finally:
    lock.release()

Now consider what happens if the exception is injected after acquire but before try: the finally clause never runs and the lock stays held. Although an opening try statement actually does something non-trivial, it is never considered a possible source of exceptions. Paranoid as I am, I read try as a no-op. If try starts failing, all bets are off.
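
The effect of that window can be reproduced deterministically. In the sketch below an explicit raise stands in for the injected exception, since a real injection cannot be timed to land exactly between acquire and try; the critical_section function and its parameter are hypothetical.

```python
import threading

lock = threading.Lock()

def critical_section(interrupt_before_try=False):
    lock.acquire()
    if interrupt_before_try:
        # Stand-in for an exception injected between acquire() and try.
        raise RuntimeError("simulated injected exception")
    try:
        pass  # application work would go here
    finally:
        lock.release()

critical_section()
locked_after_normal_run = lock.locked()

try:
    critical_section(interrupt_before_try=True)
except RuntimeError:
    pass
locked_after_interrupt = lock.locked()

print(locked_after_normal_run, locked_after_interrupt)
```

After the simulated interruption the lock is left held, so the next request that needs it would block.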

Similarly, it is common to put cleanup code in finally and to make that code exception-safe. For example:

d["foo"] = "I exist therefore I am"
try:
    print(d["foo"])
finally:
    del d["foo"] # this could not throw
    something very important

Now any code, however safe it looks, could throw. One could write very defensive code, wrapping every line in a try-finally block, but even then it is still possible that

finally:
    try:
        del d["foo"]
    finally:
        something very important

an exception is thrown just after the second finally statement, and something very important is still not executed.

As it turns out, we have a "programming in the presence of asynchronous signals" situation here. As soon as some external mechanism can interrupt execution, you always have to account for that. This was typical when programming interrupt handlers in assembler, where much of your code was cli and sti instructions. All hell rained down if you ever forgot one.

Granted, such programming is possible and could even be considered stylish and feel elitist. But it is an entirely different style from application programming in a high-level dynamic language. Even if facilities to disable interrupts are provided by the framework, using them would require much experience and care on the developer's part, much more than could be expected. And it would be a lot of trouble to use correctly.

Therefore I don't think I will equip Pythomnic3k with such a deadline-enforcing mechanism. A developer who wishes to make his code deadline-friendly can always do it the same way it is done now, by explicitly checking for request expiration at well-defined points. Something like

while not pmnc.request.expired:
    read_data(timeout = pmnc.request.remain)
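
To make the cooperative pattern concrete outside the framework, here is a self-contained sketch. The Request class and the read_data function are hypothetical stand-ins written for this illustration; they only mimic the expired and remain attributes used above and do not reproduce the actual Pythomnic3k implementation.

```python
import time

class Request:
    """Hypothetical stand-in for a request object with a deadline."""

    def __init__(self, timeout):
        self._deadline = time.monotonic() + timeout

    @property
    def remain(self):
        # Seconds left before the deadline, never negative.
        return max(0.0, self._deadline - time.monotonic())

    @property
    def expired(self):
        return time.monotonic() >= self._deadline

def read_data(timeout):
    # Hypothetical blocking read that never waits longer than `timeout`.
    time.sleep(min(timeout, 0.05))
    return None  # pretend no data arrived

request = Request(timeout=0.2)
started = time.monotonic()
attempts = 0
while not request.expired:
    read_data(timeout=request.remain)
    attempts += 1
elapsed = time.monotonic() - started
print(f"gave up after {attempts} attempts")
```

The loop can only overrun the deadline by the duration of one blocking call, and only if that call honors the timeout it is given.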

And if someone makes a mistake and the service hangs... Well, you have to be ready for that too.
