Show HN: Change Python code while it's running using a reloading loop (github.com/julvo)
187 points by julvo on Oct 18, 2019 | 39 comments


i wrote something similar a while ago, and it gave rise to a pretty head-scratching bug... my code was doing something like

  if type(x) is Foo:
and mysteriously failed for some data around the reload. turns out that Foo objects from before the reload were using a different class object than Foo objects after the reload (which makes sense in hindsight, as Foo was redefined). fun times debugging that!
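
A minimal sketch of that failure, assuming a throwaway module (the name `hn_reload_demo` is made up for the demo) written to a temp directory. Reloading re-executes the `class` statement, so `Foo` is rebound to a brand-new class object while `x` still points at the old one:

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical throwaway module, created only for this demo.
sys.dont_write_bytecode = True
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "hn_reload_demo.py").write_text("class Foo:\n    pass\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import hn_reload_demo

x = hn_reload_demo.Foo()               # instance created before the reload
old_foo = hn_reload_demo.Foo           # keep a handle on the pre-reload class
importlib.reload(hn_reload_demo)       # re-runs the module, rebinding Foo

print(type(x) is old_foo)              # True
print(type(x) is hn_reload_demo.Foo)   # False
```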


That's a pain to debug. A similar thing happened to me with reload from Python's importlib. In the case of the reloading loop I think this will only happen when redefining the type within the loop, either through a class definition or an importlib reload.


yeah, importlib.reload caused mine as well! i'm wondering if there's a way to avoid that... maybe a @reloadable class decorator that saves a reference to each object you instantiate and switches out their .__class__ upon reload? obv not great for performance, but could be useful during development (which is where i'd mostly use hot reloading anyway)
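
A rough sketch of what such a decorator could look like. Everything here (`reloadable`, `_live_instances`, `define_foo`) is hypothetical, and it assumes instances are hashable, weak-referenceable, and layout-compatible between the old and new class. `define_foo` just stands in for a reload by re-running the class statement:

```python
import weakref

_live_instances = {}  # (module, qualname) -> WeakSet of live instances


def reloadable(cls):
    """Hypothetical decorator: re-point existing instances at a redefined class."""
    key = (cls.__module__, cls.__qualname__)
    live = _live_instances.setdefault(key, weakref.WeakSet())

    # On redefinition, switch instances of the earlier definition to the new class.
    for obj in list(live):
        obj.__class__ = cls

    original_init = cls.__init__

    def tracking_init(self, *args, **kwargs):
        if type(self) is cls:  # do not track subclass instances under this key
            live.add(self)
        original_init(self, *args, **kwargs)

    cls.__init__ = tracking_init
    return cls


def define_foo(greeting):
    # Stands in for a module reload: each call re-runs the class statement.
    @reloadable
    class Foo:
        def greet(self):
            return greeting
    return Foo


OldFoo = define_foo("old")
x = OldFoo()
NewFoo = define_foo("new")

print(type(x) is NewFoo)  # True
print(x.greet())          # new
```

The weak references mean tracked objects can still be garbage collected, but the extra bookkeeping on every instantiation is the performance cost mentioned above.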


I did a talk on this recently at EuroPython. Hot reloaders are really, really hard to implement in Python in the general case (i.e. not an IPython shell). You run into corner cases like this, and it basically comes down to reloading the whole app anyway.


link plz?


https://youtu.be/IghyoR6ld60

I didn't go into as much detail as I would have liked about hot reloaders, but the tl;dr is that if you have a _single_ reference to something, then hot reloading it is somewhat simple. The problem comes when you want to hot reload a single module inside a large program.
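
To illustrate the single-reference point, here is a small sketch assuming a throwaway module (the name `hn_stale_ref_demo` is made up). The reload updates the name inside the module, but a second reference taken with `from ... import` keeps pointing at the old function:

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical throwaway module, created only for this demo.
sys.dont_write_bytecode = True
tmp_dir = tempfile.mkdtemp()
module_path = pathlib.Path(tmp_dir, "hn_stale_ref_demo.py")
module_path.write_text("def step():\n    return 'v1'\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import hn_stale_ref_demo
from hn_stale_ref_demo import step   # a second, independent reference

# Simulate editing the file, then hot reload the module.
module_path.write_text("def step():\n    return 'v2 edited'\n")
importlib.reload(hn_stale_ref_demo)

print(hn_stale_ref_demo.step())  # v2 edited
print(step())                    # v1
```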

Hot reloading a loop like this is super cool, but it only works if you're just changing constants and not defining anything. The moment you want to change the definitions of an object in the loop and hot reload it then you're in a lot of pain.

Here's the source code for IPython's autoreload magic command: https://github.com/ipython/ipython/blob/cd54f15544eee69449cc.... Take note of the Caveats section.


Would this still fail if you used the preferred idiom of

    isinstance(x, Foo)


had to check to make sure isinstance wasn't doing some magic, but yes. the whole issue is that as far as python is concerned, the pre- and post-reload Foo classes aren't related in any way except by name and module, which aren't checked by either of these methods
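
A quick sketch of this, with `make_foo` standing in for a reload (each call re-runs the class statement) and `same_named_class` as a hypothetical helper that compares exactly the two things Python itself ignores:

```python
def make_foo():
    # Each call runs the class statement again, as a reload would.
    class Foo:
        pass
    return Foo


OldFoo = make_foo()
NewFoo = make_foo()
x = OldFoo()


def same_named_class(obj, cls):
    """Hypothetical helper: compare classes by module and qualified name only."""
    t = type(obj)
    return (t.__module__, t.__qualname__) == (cls.__module__, cls.__qualname__)


print(isinstance(x, NewFoo))        # False
print(type(x) is NewFoo)            # False
print(same_named_class(x, NewFoo))  # True
```

A name-based check like this is fragile (two unrelated classes can share a name), so it is more of a debugging aid than a fix.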

(aside: i know that `isinstance` is the official way, but i just don't like the way it reads – "x is an instance of y" is an infix construct and looks weird in a prefix position. i don't really use inheritance so `type(x) is y` is good enough, and it reads very naturally)


When I first read this I couldn’t think of a good use case. Then the example shown seemed like something I deal with all the time and wish I could do. But then I realized I would need to remember to add reloading to the range before I commit to the long loop I wish to change. Pretty damn cool though.

If it doesn’t add much overhead this would be amusing for something like game dev, swapping out how the game works while running it.


This is how Lisp developers have worked for decades. It's been used for games before, too, e.g., https://en.wikipedia.org/wiki/Game_Oriented_Assembly_Lisp


Like ESR said: "these languages [perl, python, ruby] subsequently become more like lisp" (paraphrasing)


Are you paraphrasing ESR paraphrasing Greenspun's Tenth Rule of Programming?

"Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified bug-ridden slow implementation of half of Common Lisp."

Ref: http://philip.greenspun.com/research/

P.S. This is not a comment on TFA. Congratulations to the author on making Python even more interactive.


I am not so sure anymore where I read it. This is interesting:

https://mail.python.org/pipermail/python-dev/2003-January/03...

It refers to this, I guess. http://www.paulgraham.com/lispfaq1.html?viewfullsite=1

"If you look at these languages in order, Java, Perl, Python, you notice an interesting pattern. At least, you notice this pattern if you are a Lisp hacker. Each one is progressively more like Lisp."

This is the basic thought. Turns out PG wrote it more explicitly than ESR.


Who is ESR?

Edit: Ah, it is https://en.wikipedia.org/wiki/Eric_S._Raymond , found after some googling


This is great work, I could not believe the fact that it keeps state (like the current epoch number, and of course all the current model state). Makes Python even more dynamic.

Nice that you could condense it to less than 100 lines of Python!
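
For intuition about why state survives, here is a simplified sketch. It is not the library's implementation; it assumes the loop body lives in its own file (`loop_body.py`, a made-up name) and is re-read and executed against one persistent namespace on every iteration:

```python
import pathlib
import tempfile

# Hypothetical file holding only the loop body; edit it while the loop runs.
body_path = pathlib.Path(tempfile.mkdtemp(), "loop_body.py")
body_path.write_text("total += i\n")

state = {"total": 0}  # survives across iterations and across edits

for i in range(5):
    state["i"] = i
    source = body_path.read_text()                        # re-read every iteration
    exec(compile(source, str(body_path), "exec"), state)  # run the current version
    if i == 2:
        # Simulate the user editing the file mid-run.
        body_path.write_text("total += 10 * i\n")

print(state["total"])  # 73
```

The first three iterations add 0 + 1 + 2, the last two add 30 + 40 with the edited body, and `total` is never reset in between.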


This is a great debugging tool for deep learning. Often your code might have been running for 20 hrs on 16 GPUs and you just don't want to restart from scratch again.


You don't have to without this tool, you can save and load models from/to disk already from provided APIs for frameworks/libraries (e.g. [1]). Handy for when you need to hard reset, move, and/or share data.

[1] https://pytorch.org/tutorials/beginner/saving_loading_models...
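
The checkpoint-and-resume pattern, sketched with the standard library's `pickle` as a stand-in for a framework's own save/load API. The `train` function, the `state` layout and the placeholder weight update are all assumptions for illustration:

```python
import pathlib
import pickle
import tempfile

checkpoint_path = pathlib.Path(tempfile.mkdtemp(), "checkpoint.pkl")


def train(num_epochs):
    # Resume from the checkpoint if one exists, otherwise start fresh.
    if checkpoint_path.exists():
        state = pickle.loads(checkpoint_path.read_bytes())
    else:
        state = {"epoch": 0, "weights": [0.0]}

    while state["epoch"] < num_epochs:
        state["weights"] = [w + 1.0 for w in state["weights"]]  # placeholder update
        state["epoch"] += 1
        checkpoint_path.write_bytes(pickle.dumps(state))        # save after each epoch
    return state


first = train(3)    # the run is stopped after 3 epochs
resumed = train(5)  # a fresh call picks up from epoch 3
print(resumed)      # {'epoch': 5, 'weights': [5.0]}
```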


Yeah. But wouldn't it be nice / cool if the training process is interactive, reproducible, recordable and replayable?


Not sure I totally agree that this achieves those goals, but I can't argue they would be ideal goals.

>interactive

AFAIK, as you type python code it will execute with the changes incorporated in the next loop execution. That's a heavily constrained "interactive" (forward propagating changes against an active execution). I mean, wouldn't a python shell with shell history be more interactive and hit all your goals? Sure the advantage of this is you can change code based on output feedback, but if you are too slow, you have to backtrack, and if you are too fast, you just wait anyways in both cases. And if you want to add code to earlier loop executions, you are kaput for earlier loop executions since those are set in stone against a different version of the code?

> reproducible

How can you get this, i.e. reproduce any loop changes outside of the IDE/editor code undo/redo stack? It's hiding all the magic of what code outputted what with the "reloading" function.

> recordable, replayable

This moves the "history" to the IDE/editor code undo/redo stack. I think this is worse, cause, what happens when you close the IDE/editor without recording which specific console output was for which version of the code?

My main gripe is this sentence in the README:

> This lets you e.g. add logging, print statistics or save the model without restarting the training and, therefore, without losing the training progress.

If the italicised is the prime concern, and the purpose of this is to solve that, I'm just saying PyTorch already lets you save/persist training data, maybe the author didn't know about it + python shell combo?


Really nice tool! I often develop in a similar pattern (though not quite as functional) using IPython and its %autoreload magic. This looks a lot cleaner.


Could be very nice for live coding music/visuals etc.


Ah, something Common Lisp has had for over 30 years.


It'd be cool to be able to use this as a method decorator too!


Brilliant idea. That would make it useful for Keras, too, using decorated callbacks during training


Anyone else getting this error? I tried on Debian 9 and MacOS 10.14.6 after installing via pip.

  import reloading
  Traceback (most recent call last):
    File "<input>", line 1, in <module>
      import reloading
    File "/home/lucas.mccoy/.local/lib/python2.7/site-packages/reloading/__init__.py", line 1, in <module>
      from .reloading import reloading
    File "/home/lucas.mccoy/.local/lib/python2.7/site-packages/reloading/reloading.py", line 87
      exc = exc.replace('File "<string>"', f'File "{fpath}"')
                                                           ^
  SyntaxError: invalid syntax


It only works with Python 3. Thanks for reminding me that I should add a note to the Readme.


It does work in Python 2 with the following patch:

    diff --git a/reloading/reloading.py b/reloading/reloading.py
    index 1d28e2f..1a5f981 100644
    --- a/reloading/reloading.py
    +++ b/reloading/reloading.py
    @@ -84,9 +84,10 @@ def reloading(seq):
                 exec(body)
             except Exception:
                 exc = traceback.format_exc()
    -            exc = exc.replace('File "<string>"', f'File "{fpath}"')
    +            exc = exc.replace('File "<string>"', 'File "{}"'.format(fpath))
                 sys.stderr.write(exc + '\n')
    -            input('Edit the file and press return to continue with the next iteration')
    +            print('Edit the file and press return to continue with the next iteration')
    +            sys.stdin.readline()
     
         # copy locals back into the caller's locals
         for k, v in locals().items():


Brilliant, thank you. I'll patch that.


That's an f-string, so it only works with Python 3.7 and later


f-strings were introduced in Python 3.6, fyi:

https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep49...


FYI there's also reload(module), which is a built-in function in Python 2 (in Python 3 it lives at importlib.reload). One great use-case I've come across is in developing a network proxy for game packet interception, see:

https://youtu.be/iApNzWZG-10?t=641


Django's development mode aka "runserver" behaves a lot like this (as do a ton of different Node/JavaScript systems, many of which are much more precise about which sub-components to actually reload). It's a big win for development pacing.

Being able to loop a specific segment of code looks like a great tool for my toolbox when writing data processing code, particularly for spatial data. I will have an input dataset and be iterating on the algorithm(s) as I watch the output (often rendered in QGIS) update dynamically but requiring me to re-run the test script.


Agreed, but to clarify-- django's runserver is a simple restart process upon file change. Webservers don't have heavy startup costs, so that's fine.

Being able to modify code during a run, where it has a heavy startup cost, is a different beast. I also wonder if Django's runserver could incorporate something like `reloading`, for even speedier reloads?


This is awesome. A nice improvement would be to have the loop continue with the old code when there are errors (rather than pausing iteration).


That seems like it would be confusing. Imagine trying to track down the source of behaviour that no longer exists in your code...


How would that be possible without having to essentially keep track of two files? Nice idea though.


Basically, you just need a double buffer system:

- Always run from the back buffer

- Always scan the updated file into the front buffer

- If the front buffer is fresh AND the front buffer contains no bugs, bufferswap
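
A minimal sketch of the buffer swap, under a few assumptions: `refresh` and `loop_body.py` are made-up names, "contains no bugs" is reduced to "compiles without a syntax error", and the freshness check is skipped by recompiling every time. Runtime errors in the new code would still get through:

```python
import pathlib
import tempfile


def refresh(path, back_buffer):
    """Return freshly compiled code from `path`, or keep `back_buffer` if it fails."""
    try:
        front_buffer = compile(path.read_text(), str(path), "exec")
    except (OSError, SyntaxError):
        return back_buffer   # bad edit: keep running the last good version
    return front_buffer      # good edit: swap


body_path = pathlib.Path(tempfile.mkdtemp(), "loop_body.py")
state = {"log": []}
back = None

# The second "edit" is deliberately broken (missing closing parenthesis).
versions = ["log.append('v1')\n", "log.append('v2'\n", "log.append('v3')\n"]

for source in versions:
    body_path.write_text(source)      # simulate the user saving the file
    back = refresh(body_path, back)
    exec(back, state)                 # always run from the back buffer

print(state["log"])  # ['v1', 'v1', 'v3']
```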


Change Bourne shell code while it's running:

  $ vi ~/.functions   # fix
  $ . ~/.functions    # load latest


Python is almost as good as Lisp now.



